Microservices Communication Patterns: gRPC vs REST, Message Queues, Sagas, and Circuit Breakers
Introduction
The moment you split a monolith into microservices, communication becomes your biggest challenge. In a monolith, a function call takes nanoseconds, is type-safe, and runs inside a transaction. In microservices, every interaction is a network call that can fail, time out, arrive out of order, or be silently duplicated. The communication patterns you choose determine your system's reliability, latency, and operational complexity.
This guide covers the practical decisions you face when designing microservice communication: synchronous vs asynchronous, gRPC vs REST, choosing the right message broker, implementing the saga pattern for distributed transactions, and building circuit breakers to prevent cascade failures.
Synchronous vs Asynchronous Communication
The first architectural decision is whether services communicate synchronously (request-response) or asynchronously (event-driven). Most systems need both.
Synchronous (request-response): Service A sends a request to Service B and waits for a response. Use for operations where the caller needs an immediate answer.
User → API Gateway → Order Service → Inventory Service (check stock)
                                   ← stock confirmed
                     ← order created
     ← 201 Created
Asynchronous (event-driven): Service A publishes an event and moves on. Other services consume the event whenever they are ready. Use for operations where the caller does not need an immediate answer, or when you want to decouple services.
Order Service → publishes "OrderCreated" event → Message Queue
                                                 ├→ Payment Service (processes payment)
                                                 ├→ Notification Service (sends email)
                                                 └→ Analytics Service (records metric)
When to use synchronous:
- User-facing requests that need an immediate response
- Data reads where you need the latest state
- Simple request-response queries between two services
When to use asynchronous:
- Operations that can happen in the background (email, analytics, audit logs)
- When you need to fan out to multiple consumers
- When services have different availability requirements
- When you need to handle traffic spikes (the queue acts as a buffer)
The biggest mistake teams make is going 100% synchronous. If your order service synchronously calls payment, inventory, notification, and analytics services in sequence, a slowdown in any one of them degrades the entire order flow. Move non-critical steps to async processing.
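As a sketch of that split (the route handler, service clients, and publishEvent helper here are all hypothetical), the critical path stays synchronous while side effects go through the queue:

// Hypothetical order handler: only the critical path is synchronous
app.post('/api/orders', async (req, res) => {
  // Critical path: the caller needs to know these succeeded
  await inventoryClient.reserveStock(req.body.items);
  const order = await orderService.create(req.body);

  // Non-critical: email, analytics, and audit logs happen asynchronously
  await publishEvent('OrderCreated', { orderId: order.id, userId: order.userId });

  res.status(201).json(order);
});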
gRPC vs REST
For synchronous service-to-service communication, you are choosing between REST (HTTP/JSON) and gRPC (HTTP/2, Protocol Buffers).
REST is the default because it is simple, well-understood, and debuggable with curl:
// Express.js REST endpoint (db is an existing pg Pool)
app.get('/api/inventory/:productId', async (req, res) => {
  const stock = await db.query(
    'SELECT quantity FROM inventory WHERE product_id = $1',
    [req.params.productId]
  );
  res.json({ productId: req.params.productId, quantity: stock.rows[0]?.quantity || 0 });
});
// Calling it from another service
const response = await fetch('http://inventory-service:3000/api/inventory/SKU-123');
if (!response.ok) throw new Error(`Inventory service returned ${response.status}`);
const data = await response.json();
gRPC uses Protocol Buffers for serialization and HTTP/2 for transport, giving you strong typing, smaller payloads, and bidirectional streaming:
// inventory.proto
syntax = "proto3";

package inventory;

service InventoryService {
  rpc CheckStock (StockRequest) returns (StockResponse);
  rpc WatchStock (StockRequest) returns (stream StockUpdate); // Server streaming
}

message StockRequest {
  string product_id = 1;
}

message StockResponse {
  string product_id = 1;
  int32 quantity = 2;
  bool in_stock = 3;
}

message StockUpdate {
  string product_id = 1;
  int32 quantity = 2;
  string timestamp = 3;
}
// gRPC server (Node.js) - db is an existing pg Pool
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');

// keepCase preserves the snake_case field names from the .proto file
const packageDef = protoLoader.loadSync('inventory.proto', { keepCase: true });
const proto = grpc.loadPackageDefinition(packageDef);

const server = new grpc.Server();
server.addService(proto.inventory.InventoryService.service, {
  checkStock: async (call, callback) => {
    try {
      const stock = await db.query(
        'SELECT quantity FROM inventory WHERE product_id = $1',
        [call.request.product_id]
      );
      callback(null, {
        product_id: call.request.product_id,
        quantity: stock.rows[0]?.quantity || 0,
        in_stock: (stock.rows[0]?.quantity || 0) > 0,
      });
    } catch (error) {
      // Surface failures as a proper gRPC status instead of hanging the call
      callback({ code: grpc.status.INTERNAL, message: error.message });
    }
  },
});

server.bindAsync('0.0.0.0:50051', grpc.ServerCredentials.createInsecure(), () => {
  console.log('gRPC server running on port 50051');
});
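On the calling side, a minimal client sketch against the same proto (the address inventory-service:50051 is an assumption):

// gRPC client (Node.js)
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');

const packageDef = protoLoader.loadSync('inventory.proto', { keepCase: true });
const proto = grpc.loadPackageDefinition(packageDef);

const client = new proto.inventory.InventoryService(
  'inventory-service:50051',
  grpc.credentials.createInsecure()
);

// Unary call
client.checkStock({ product_id: 'SKU-123' }, (err, response) => {
  if (err) return console.error('CheckStock failed:', err.message);
  console.log(`SKU-123: ${response.quantity} in stock`);
});

// Server streaming: receive updates as they happen
const stream = client.watchStock({ product_id: 'SKU-123' });
stream.on('data', (update) => console.log('Stock update:', update));
stream.on('error', (err) => console.error('Stream error:', err.message));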
Choose REST when:
- External-facing APIs (browsers, mobile apps, third parties)
- Simple CRUD operations
- You want maximum debuggability
- Team is not familiar with protobuf
Choose gRPC when:
- Internal service-to-service communication with high call volume
- You need streaming (real-time feeds, file uploads)
- Latency-sensitive paths where protobuf's smaller payloads and faster serialization matter
- You want strict API contracts enforced at compile time
In practice, many teams use REST for their public API and gRPC for internal service mesh communication.
Choosing a Message Broker
For asynchronous communication, you need a message broker. The three most common choices are SQS, RabbitMQ, and Kafka, and they serve different use cases.
Amazon SQS - Managed queue, simplest to operate, no infrastructure to manage:
const { SQSClient, SendMessageCommand, ReceiveMessageCommand, DeleteMessageCommand } = require('@aws-sdk/client-sqs');

const sqs = new SQSClient({ region: 'us-east-1' });
// FIFO queue: the .fifo suffix is required to use MessageGroupId / MessageDeduplicationId
const QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/123456789/order-events.fifo';

// Producer: publish event
await sqs.send(new SendMessageCommand({
  QueueUrl: QUEUE_URL,
  MessageBody: JSON.stringify({
    eventType: 'OrderCreated',
    orderId: 'ord_abc123',
    userId: 'usr_456',
    total: 99.99,
    timestamp: new Date().toISOString(),
  }),
  MessageGroupId: 'ord_abc123', // FIFO queue: ensures ordering per order
  MessageDeduplicationId: 'order-created-ord_abc123', // Prevents duplicates
}));

// Consumer: poll for events
async function pollMessages() {
  while (true) {
    const response = await sqs.send(new ReceiveMessageCommand({
      QueueUrl: QUEUE_URL,
      MaxNumberOfMessages: 10,
      WaitTimeSeconds: 20, // Long polling
      VisibilityTimeout: 60,
    }));
    for (const message of response.Messages || []) {
      const event = JSON.parse(message.Body);
      try {
        await processEvent(event);
        // Delete only after successful processing
        await sqs.send(new DeleteMessageCommand({
          QueueUrl: QUEUE_URL,
          ReceiptHandle: message.ReceiptHandle,
        }));
      } catch (error) {
        // Not deleted: the message reappears after the visibility timeout and is retried
        console.error(`Failed to process ${event.eventType}: ${error.message}`);
      }
    }
  }
}
RabbitMQ - Feature-rich broker with flexible routing (exchanges, bindings, dead letter queues):
Best when you need complex routing patterns (topic-based, headers-based), priority queues, or delayed messages. More operational overhead than SQS since you manage the broker yourself.
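A minimal sketch of topic-based routing with amqplib (the exchange, queue, and routing-key names are illustrative, and processEvent is a hypothetical handler):

// RabbitMQ topic routing with amqplib
const amqp = require('amqplib');

async function setup() {
  const conn = await amqp.connect('amqp://rabbitmq:5672');
  const channel = await conn.createChannel();
  await channel.assertExchange('orders', 'topic', { durable: true });

  // Producer: the routing key encodes event type and region
  channel.publish(
    'orders',
    'order.created.eu',
    Buffer.from(JSON.stringify({ orderId: 'ord_abc123' })),
    { persistent: true }
  );

  // Consumer: bind a queue to only the events it cares about
  const { queue } = await channel.assertQueue('eu-order-emails', { durable: true });
  await channel.bindQueue(queue, 'orders', 'order.created.*'); // wildcard binding
  channel.consume(queue, async (msg) => {
    if (msg === null) return;
    await processEvent(JSON.parse(msg.content.toString()));
    channel.ack(msg); // ack only after successful processing
  });
}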
Apache Kafka - Distributed log for high-throughput event streaming:
Best when you need event replay (consumers can re-read history), very high throughput (millions of events/second), or multiple consumer groups reading the same stream independently. Most complex to operate.
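For comparison, a sketch with kafkajs (broker address and topic name are illustrative) showing the two properties that set Kafka apart: keyed partitioning for per-order ordering, and consumer groups that track their own offsets and can replay history:

// Kafka with kafkajs
const { Kafka } = require('kafkajs');

async function run() {
  const kafka = new Kafka({ clientId: 'order-service', brokers: ['kafka:9092'] });

  // Producer: keying by orderId keeps all events for one order on one partition (ordered)
  const producer = kafka.producer();
  await producer.connect();
  await producer.send({
    topic: 'order-events',
    messages: [{ key: 'ord_abc123', value: JSON.stringify({ eventType: 'OrderCreated' }) }],
  });

  // Consumer: each groupId tracks its own offsets, so analytics and payments
  // can read the same stream independently
  const consumer = kafka.consumer({ groupId: 'analytics-service' });
  await consumer.connect();
  await consumer.subscribe({ topic: 'order-events', fromBeginning: true }); // replay history
  await consumer.run({
    eachMessage: async ({ message }) => {
      await processEvent(JSON.parse(message.value.toString()));
    },
  });
}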
Decision matrix:
| Requirement | SQS | RabbitMQ | Kafka |
|---|---|---|---|
| Zero ops overhead | Yes | No | No |
| Message ordering | FIFO queues | Per queue | Per partition |
| Event replay | No | No | Yes |
| Complex routing | No | Yes | No (use streams) |
| Throughput | High | Medium | Very high |
| Latency | ~20ms | ~1ms | ~5ms |
| Cost at low volume | Cheapest | Moderate | Expensive |
For most startups: start with SQS. You can always migrate to Kafka later when you actually need event replay or million-message-per-second throughput.
The Saga Pattern for Distributed Transactions
In a monolith, creating an order means a single database transaction: deduct inventory, charge payment, create order - all or nothing. In microservices, each step is a different service with its own database. You cannot use a distributed transaction (2PC) because it does not scale and couples all services together.
The saga pattern breaks a distributed transaction into a sequence of local transactions, each publishing an event that triggers the next step. If any step fails, compensating transactions undo the previous steps.
Choreography-based saga (each service reacts to events):
OrderService                  PaymentService                InventoryService
│                             │                             │
├─ Create order (PENDING)     │                             │
├─ Publish "OrderCreated" ───►│                             │
│                             ├─ Charge payment             │
│                             ├─ Publish "PaymentCharged" ─►│
│                             │                             ├─ Reserve stock
│                             │                             ├─ Publish "StockReserved"
│◄────────────────────────────┼─────────────────────────────┤
├─ Update order → CONFIRMED   │                             │
│                             │                             │
│  --- If payment fails ---   │                             │
│                             ├─ Publish "PaymentFailed" ──►│
│◄────────────────────────────┤                             ├─ Release stock
├─ Update order → CANCELLED   │                             │
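In code, choreography is just each service consuming one event and publishing the next. A sketch of the Payment Service side (consumeEvents, publishEvent, and chargeCard are hypothetical wrappers around your broker and payment provider):

// Payment Service: reacts to OrderCreated, emits the next event in the saga
consumeEvents('OrderCreated', async (event) => {
  try {
    const payment = await chargeCard(event.userId, event.total);
    await publishEvent('PaymentCharged', { orderId: event.orderId, paymentId: payment.id });
  } catch (error) {
    // Failure triggers the compensation path in the other services
    await publishEvent('PaymentFailed', { orderId: event.orderId, reason: error.message });
  }
});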
Orchestration-based saga (a central coordinator manages the flow):
// saga-orchestrator.js
class CreateOrderSaga {
  constructor(orderService, paymentService, inventoryService) {
    this.steps = [
      {
        execute: (data) => inventoryService.reserveStock(data.items),
        compensate: (data) => inventoryService.releaseStock(data.items),
      },
      {
        execute: (data) => paymentService.charge(data.userId, data.total),
        compensate: (data) => paymentService.refund(data.paymentId),
      },
      {
        execute: (data) => orderService.confirmOrder(data.orderId),
        compensate: (data) => orderService.cancelOrder(data.orderId),
      },
    ];
  }

  async execute(orderData) {
    const completedSteps = [];
    for (const step of this.steps) {
      try {
        const result = await step.execute(orderData);
        orderData = { ...orderData, ...result };
        completedSteps.push(step);
      } catch (error) {
        console.error(`Saga step failed: ${error.message}. Compensating...`);
        // Compensate in reverse order
        for (const completed of completedSteps.reverse()) {
          try {
            await completed.compensate(orderData);
          } catch (compError) {
            console.error(`Compensation failed: ${compError.message}`);
            // Alert on-call - manual intervention needed
          }
        }
        throw new Error(`Order saga failed: ${error.message}`);
      }
    }
    return orderData;
  }
}
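Wiring it up (the three service clients are assumed to exist, and charge is expected to return the paymentId that the refund compensation needs):

// Usage: run the saga for an incoming order
const saga = new CreateOrderSaga(orderService, paymentService, inventoryService);
try {
  const result = await saga.execute({
    orderId: 'ord_abc123',
    userId: 'usr_456',
    items: [{ productId: 'SKU-123', quantity: 2 }],
    total: 99.99,
  });
  console.log('Order confirmed:', result.orderId);
} catch (error) {
  console.error('Order failed and was rolled back:', error.message);
}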
Choreography vs orchestration: Use choreography when you have 2-3 services in the saga and the flow is simple. Use orchestration when you have 4+ services or complex branching logic. Orchestration is easier to understand, debug, and monitor.
Circuit Breakers
When a downstream service is failing, you do not want every request to wait for a timeout. A circuit breaker detects failures and short-circuits requests to the failing service, returning an error immediately or falling back to a cached response.
// circuit-breaker.js
class CircuitBreaker {
  constructor(options = {}) {
    this.failureThreshold = options.failureThreshold || 5;
    this.resetTimeout = options.resetTimeout || 30000; // 30 seconds
    this.failureCount = 0;
    this.lastFailureTime = null;
    this.state = 'CLOSED'; // CLOSED = normal, OPEN = failing, HALF_OPEN = testing
  }

  async execute(fn, fallback) {
    if (this.state === 'OPEN') {
      if (Date.now() - this.lastFailureTime > this.resetTimeout) {
        this.state = 'HALF_OPEN';
      } else {
        if (fallback) return fallback();
        throw new Error('Circuit breaker is OPEN');
      }
    }
    try {
      const result = await fn();
      if (this.state === 'HALF_OPEN') {
        this.state = 'CLOSED';
        this.failureCount = 0;
      }
      return result;
    } catch (error) {
      this.failureCount++;
      this.lastFailureTime = Date.now();
      // A failure in HALF_OPEN re-trips immediately: failureCount is still
      // at the threshold from the previous outage
      if (this.failureCount >= this.failureThreshold) {
        this.state = 'OPEN';
        console.warn('Circuit breaker tripped to OPEN');
      }
      if (fallback) return fallback();
      throw error;
    }
  }
}
// Usage
const inventoryBreaker = new CircuitBreaker({
  failureThreshold: 3,
  resetTimeout: 15000,
});

async function checkStock(productId) {
  return inventoryBreaker.execute(
    // Primary: call inventory service
    () => fetch(`http://inventory-service:3000/api/stock/${productId}`).then(r => {
      if (!r.ok) throw new Error(`Inventory service returned ${r.status}`);
      return r.json();
    }),
    // Fallback: return cached/default value
    () => ({ productId, quantity: null, in_stock: true, source: 'fallback' })
  );
}
In production, use a library like opossum for Node.js or rely on your service mesh (Istio, Linkerd) to handle circuit breaking at the infrastructure level:
# Istio DestinationRule with circuit breaking
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: inventory-service
spec:
  host: inventory-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        h2UpgradePolicy: DEFAULT
        http1MaxPendingRequests: 50
        http2MaxRequests: 100
        maxRequestsPerConnection: 10
    outlierDetection:
      consecutive5xxErrors: 3
      interval: 10s
      baseEjectionTime: 30s
      maxEjectionPercent: 50
Observability for Distributed Communication
Without traces, correlated logs, and metrics, debugging microservice communication is guesswork. Implement these three pillars:
Distributed tracing - propagate trace IDs across service boundaries:
// Using OpenTelemetry
const { trace, context, propagation } = require('@opentelemetry/api');

// When making an outbound request, propagate context
async function callService(url, data) {
  const headers = {};
  propagation.inject(context.active(), headers);
  return fetch(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      ...headers, // Includes traceparent header
    },
    body: JSON.stringify(data),
  });
}
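On the receiving side, extract the propagated context before starting a span so both services land in one trace. A sketch with the same @opentelemetry/api package (auto-instrumentation libraries typically do this for you; the route and span names are illustrative):

// Receiving service: continue the trace carried in the traceparent header
const tracer = trace.getTracer('inventory-service');

app.post('/api/stock/check', async (req, res) => {
  const parentContext = propagation.extract(context.active(), req.headers);
  const span = tracer.startSpan('check_stock', undefined, parentContext);
  try {
    // ... handle the request ...
    res.json({ ok: true });
  } finally {
    span.end();
  }
});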
Structured logging - include correlation IDs in every log line:
// Every log entry includes the trace/request ID
logger.info('Processing order', {
  traceId: span.spanContext().traceId,
  orderId: order.id,
  service: 'order-service',
  action: 'create_order',
  duration_ms: Date.now() - startTime,
});
Metrics - track request rate, error rate, and latency (the RED method):
// Prometheus metrics (prom-client)
const promClient = require('prom-client');

const httpRequestDuration = new promClient.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests',
  labelNames: ['method', 'route', 'status', 'target_service'],
  buckets: [0.01, 0.05, 0.1, 0.5, 1, 5],
});
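To record observations, time each outbound call and expose a /metrics endpoint for Prometheus to scrape (timedFetch is a hypothetical helper; the label values are illustrative):

// Time an outbound call and record it against the histogram
async function timedFetch(targetService, route, url) {
  const end = httpRequestDuration.startTimer({ method: 'GET', route, target_service: targetService });
  const response = await fetch(url);
  end({ status: response.status });
  return response;
}

// Expose metrics for Prometheus to scrape
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', promClient.register.contentType);
  res.end(await promClient.register.metrics());
});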
Need Help with Your DevOps?
Designing and implementing microservices communication patterns - from choosing the right broker to implementing sagas and circuit breakers - requires deep infrastructure expertise. At InstaDevOps, we help startups and SMBs build reliable, scalable microservice architectures - starting at $2,999/mo.
Book a free 15-minute consultation to discuss your microservices architecture and communication challenges.