Microservices Architecture β How Systems Communicate
π What Problem Are We Solving?
- A monolith is one giant codebase β one process, one database, deployed as a single unit
- Works fine when small, but as the app grows:
- A bug in the payments module can crash the entire app
- Deploying a change to one feature requires redeploying everything
- Multiple teams can't work independently without constantly blocking each other
Microservices is the solution:
Split the system into small, independent services β each owning one responsibility, deployed separately, communicating over a network.
πΊοΈ The Big Picture
A typical microservices system looks like this:

- Each box is a separate deployable service with its own database, codebase, and team
- Services talk to each other either directly (sync) or via a message broker (async)
- The API Gateway is the single entry point β it routes requests to the right service
β‘ The Two Fundamental Communication Styles
Every service-to-service call is either synchronous or asynchronous. Choosing between them is one of the most important architectural decisions you'll make.
1. Synchronous Communication
- Service A sends a request and blocks β it waits and does nothing until Service B responds
- Like a phone call β both parties must be available at the same time
// Order service calls User service
const user = await fetch('http://user-service/users/42');
const data = await user.json();
// Only continues AFTER the response arrives
β Good for:
- Queries needing an immediate answer
- Example: "Is this user verified?" β you must know before continuing
β Risk:
- If User service is down or slow, Order service is also affected
- Creates tight coupling and potential cascading failures
2. Asynchronous Communication
- Service A publishes an event to a message broker and immediately moves on
- Other services subscribe and react in their own time
- Like a text message β you send it and continue with your day
// Order service publishes an event β no waiting
await kafka.send({
topic: 'order.placed',
messages: [{ value: JSON.stringify({ orderId: '99', userId: '42' }) }]
});
// Immediately continues β done!
β Good for:
- Work that doesn't need an immediate reply
- Example: "Order placed β send email, update inventory, charge card" β all can happen independently
β Key benefit:
- If Notification service is down, events queue up and process when it recovers β no cascading failures
β οΈ Tradeoff:
- Harder to debug
- Order of events not guaranteed
- Eventually consistent, not immediately consistent
π One Publish β Many Consumers (Fan-out)
A major async superpower β publish one event, multiple services react independently:
Order service β publishes "order.placed" β broker
βββ Notification svc (sends confirmation email)
βββ Inventory svc (reserves stock)
βββ Payment svc (initiates charge)
- Order service knows nothing about any of these consumers
- Add a new consumer (e.g. Analytics) without touching Order service at all
π§© The 4 Communication Patterns You'll Actually Encounter
Pattern 1 β REST over HTTP [Sync]
- The most common pattern in microservices
- Services call each other just like they call any external API β HTTP verbs, JSON body, status codes
GET /users/42
POST /orders { userId, items }
DELETE /sessions/abc
| β Simple | Everyone knows HTTP |
| β Easy to test | Works with curl / Postman |
| β Slower | Not ideal for high-throughput internal calls |
| β No schema enforcement | JSON can drift without you noticing |
Pattern 2 β gRPC [Sync]
- Google's protocol β you define a contract in a
.protofile, and code is auto-generated - Uses binary encoding over HTTP/2 β roughly 10x faster than REST for internal calls
// user.proto
service UserService {
rpc GetUser(UserRequest) returns (UserResponse);
}
| β Very fast | Binary + HTTP/2 |
| β Schema = contract | Can't drift accidentally |
| β More setup | Proto files to manage across services |
| β Harder to debug | Binary format β can't just read it like JSON |
Best for: High-throughput internal service-to-service calls where performance matters.
Pattern 3 β Message Queue [Async]
- One producer, one consumer
- Like a task queue β jobs pile up and a worker processes them one at a time
- Tools: RabbitMQ, Amazon SQS
Producer β [queue] β 1 Consumer
// Producer side
channel.sendToQueue('resize-image',
Buffer.from(JSON.stringify({ url }))
);
| β Decoupled | Producer doesn't wait for consumer |
| β Durable | Messages survive restarts, retries built in |
| β 1-to-1 only | Each message is processed by only one consumer |
Best for: Background jobs β image resizing, report generation, sending emails one at a time.
Pattern 4 β Pub / Sub (Event-Driven) [Async]
- One publisher, many subscribers
- The publisher doesn't know who is listening β complete loose coupling
- Tools: Apache Kafka, Amazon SNS
Publisher β "order.placed" β broker
βββ Notification svc β
subscribed
βββ Inventory svc β
subscribed
βββ Analytics svc β
subscribed
| β Zero coupling | Publisher is completely oblivious to consumers |
| β Infinitely extensible | Add new consumers with zero code changes to the publisher |
| β Eventual consistency | Data won't be in sync immediately everywhere |
| β Harder to trace | A single action fans out across many services |
Best for: Anything where one event should trigger multiple independent reactions.
π§ Quick Reference β When to Use What
| Pattern | Style | Use When |
|---|---|---|
| REST | Sync | You need an immediate answer. Simple internal calls. |
| gRPC | Sync | High-frequency internal calls where speed matters. |
| Message Queue | Async | One service does one background job per message. |
| Pub / Sub | Async | One event should trigger multiple independent services. |
β οΈ Key Gotchas to Remember
- Cascading failures β in sync communication, if one service is slow, the caller is slow too. Use timeouts and circuit breakers.
- Idempotency β async messages can be delivered more than once. Your consumer must handle duplicates safely.
- Event ordering β async brokers don't always guarantee order. Never assume a
payment.chargedarrives afterorder.placed. - Database per service β each service owns its own DB. Never share a database between two services β this recreates the tight coupling you were trying to escape.
- Observability β with many services, you need distributed tracing (e.g. Jaeger, OpenTelemetry) or debugging becomes a nightmare.