Message Queues vs Event Streams
Message queues and event streams both decouple producers from consumers, but they have fundamentally different semantics. A message queue distributes work — each message goes to exactly one consumer and is deleted after acknowledgement. An event stream is a persistent log — messages are retained, and multiple independent consumers can each read the entire stream at their own pace.
Choosing the wrong one causes architectural pain: a queue where you need fan-out means duplicating messages to N queues; a stream where you need single-consumer task processing means building consumer-group coordination that a queue gives you for free.
Message Queue (RabbitMQ, SQS)
A message queue is designed for work distribution: a producer enqueues a task, and exactly one consumer dequeues and processes it.
```mermaid
sequenceDiagram
    participant P as Producer
    participant Q as Queue (RabbitMQ / SQS)
    participant C1 as Consumer 1
    participant C2 as Consumer 2
    participant C3 as Consumer 3
    P->>Q: Enqueue: "send email to user:42"
    P->>Q: Enqueue: "send email to user:43"
    P->>Q: Enqueue: "resize image img:99"
    Q->>C1: Deliver: "send email to user:42"
    Q->>C2: Deliver: "send email to user:43"
    Q->>C3: Deliver: "resize image img:99"
    C1->>Q: ACK ✓ (message deleted from queue)
    C2->>Q: ACK ✓ (message deleted from queue)
    C3->>Q: NACK ✗ (message re-queued for retry)
    Q->>C1: Redeliver: "resize image img:99"
    C1->>Q: ACK ✓
```

Key Properties
| Property | Behavior |
|---|---|
| Delivery | Each message is handed to one consumer at a time (competing consumers pattern); unacknowledged messages are redelivered |
| Retention | Message deleted after consumer ACK |
| Ordering | FIFO within a single queue (best-effort in SQS standard; strict in SQS FIFO) |
| Replay | Not possible — message is gone after ACK |
| Scaling | Add more consumers to process faster (horizontal work distribution) |
| Back-pressure | Queue depth grows when consumers are slow — natural buffer |
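The properties in this table can be sketched as a toy in-memory queue. This is a minimal illustration, not the RabbitMQ or SQS API: the `WorkQueue` class and its method names are hypothetical, and delivery tags are simple counters.

```python
from collections import deque


class WorkQueue:
    """Toy queue: one consumer per delivery, delete on ACK, re-queue on NACK."""

    def __init__(self):
        self._ready = deque()   # messages waiting for a consumer
        self._unacked = {}      # delivery tag -> message currently held by a consumer
        self._next_tag = 0

    def enqueue(self, body):
        self._ready.append(body)

    def deliver(self):
        """Hand the oldest ready message to one consumer, or return None if empty."""
        if not self._ready:
            return None
        self._next_tag += 1
        body = self._ready.popleft()
        self._unacked[self._next_tag] = body
        return self._next_tag, body

    def ack(self, tag):
        # An acknowledged message is gone for good; there is nothing left to replay.
        del self._unacked[tag]

    def nack(self, tag):
        # A rejected message goes back to the queue for another attempt.
        self._ready.append(self._unacked.pop(tag))

    def depth(self):
        return len(self._ready) + len(self._unacked)


queue = WorkQueue()
for task in ["send email to user:42", "send email to user:43", "resize image img:99"]:
    queue.enqueue(task)

tag1, task1 = queue.deliver()            # consumer 1
tag2, task2 = queue.deliver()            # consumer 2
tag3, task3 = queue.deliver()            # consumer 3
queue.ack(tag1)
queue.ack(tag2)
queue.nack(tag3)                         # consumer 3 failed; the task is re-queued
retry_tag, retry_task = queue.deliver()  # any free consumer picks it up again
queue.ack(retry_tag)
```

Once every task is acknowledged the queue is empty, which is why replay is not possible in this model.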
RabbitMQ Specifics
RabbitMQ adds exchanges between producers and queues, supporting multiple routing patterns:
```mermaid
flowchart LR
    P[Producer] --> E[Exchange]
    subgraph "Exchange Types"
        E -->|Direct| Q1[Queue: email]
        E -->|Direct| Q2[Queue: sms]
        E -->|Fanout| Q3[Queue: audit-1]
        E -->|Fanout| Q4[Queue: audit-2]
        E -->|Topic: order.*| Q5[Queue: order-processing]
    end
    Q1 --> C1[Email Worker]
    Q2 --> C2[SMS Worker]
    Q3 --> C3[Audit Logger 1]
    Q4 --> C4[Audit Logger 2]
    Q5 --> C5[Order Processor]
```

| Exchange Type | Routing | Use Case |
|---|---|---|
| Direct | Routes to queue whose binding key exactly matches the routing key | Task routing by type (email, sms, push) |
| Fanout | Broadcasts to all bound queues (ignores routing key) | Notifications to all services |
| Topic | Pattern matching on routing key (order.*, #.error) | Event filtering by category |
| Headers | Routes based on message header attributes | Complex routing logic |
RabbitMQ also supports publisher confirms (producer gets ACK when message is durably written) and consumer prefetch (limit how many unacked messages a consumer holds — prevents one slow consumer from hoarding all messages).
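Topic routing is the least obvious of the exchange types, so here is a small sketch of the matching rule: routing keys are dot-separated words, `*` stands for exactly one word, and `#` stands for zero or more words. The function names are hypothetical and this is not RabbitMQ's implementation, only an illustration of the rule.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if an AMQP-style topic pattern matches a routing key."""
    return _match(pattern.split("."), routing_key.split("."))


def _match(pattern_words, key_words):
    if not pattern_words:
        return not key_words                 # both exhausted means a full match
    head, rest = pattern_words[0], pattern_words[1:]
    if head == "#":
        # '#' may absorb zero or more words, so try every possible split point.
        return any(_match(rest, key_words[i:]) for i in range(len(key_words) + 1))
    if not key_words:
        return False
    if head == "*" or head == key_words[0]:
        return _match(rest, key_words[1:])   # '*' consumes exactly one word
    return False


created = topic_matches("order.*", "order.created")
too_deep = topic_matches("order.*", "order.created.eu")
nested_error = topic_matches("#.error", "payment.gateway.error")
bare_error = topic_matches("#.error", "error")
```

With the bindings from the table, `order.*` accepts `order.created` but not `order.created.eu`, while `#.error` accepts an error key at any depth.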
SQS Specifics
| Feature | SQS Standard | SQS FIFO |
|---|---|---|
| Throughput | Nearly unlimited | 300 API calls/s per action (up to 3,000 msg/s with batches of 10); higher with high-throughput mode |
| Ordering | Best-effort (may reorder) | Strict FIFO per message group |
| Deduplication | None (at-least-once; duplicates possible) | Content-based or explicit deduplication ID; repeated sends within the 5-minute deduplication interval are dropped |
| Visibility timeout | Message hidden from other consumers while being processed; returned to queue on timeout | Same |
| Dead Letter Queue | After N failed attempts, message moved to DLQ for investigation | Same |
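The visibility timeout and dead-letter rows can be modelled in a few lines. This is a toy sketch, not the SQS API: `VisibilityQueue`, its methods, and the explicit `now` argument (used instead of a real clock to keep the example deterministic) are all assumptions made for illustration.

```python
class VisibilityQueue:
    """Toy model of a visibility timeout plus dead-letter redrive."""

    def __init__(self, visibility_timeout, max_receive_count):
        self.visibility_timeout = visibility_timeout
        self.max_receive_count = max_receive_count
        self._messages = {}      # message id -> [body, receive_count, visible_at]
        self.dead_letters = []   # bodies that exhausted their receive attempts
        self._next_id = 0

    def send(self, body):
        self._next_id += 1
        self._messages[self._next_id] = [body, 0, 0.0]
        return self._next_id

    def receive(self, now):
        """Return (id, body) for one visible message, or None."""
        for msg_id, entry in list(self._messages.items()):
            body, count, visible_at = entry
            if visible_at > now:
                continue                 # still hidden: another consumer holds it
            if count >= self.max_receive_count:
                # Too many failed attempts: park it in the DLQ instead of redelivering.
                self.dead_letters.append(body)
                del self._messages[msg_id]
                continue
            entry[1] = count + 1
            entry[2] = now + self.visibility_timeout
            return msg_id, body
        return None

    def delete(self, msg_id):
        # Deleting is the acknowledgement; without it the message reappears.
        self._messages.pop(msg_id, None)


q = VisibilityQueue(visibility_timeout=30, max_receive_count=2)
q.send("resize image img:99")
first = q.receive(now=0)     # delivered and hidden until t=30
hidden = q.receive(now=10)   # nothing visible while the first consumer works
second = q.receive(now=30)   # timeout expired without a delete: redelivered
third = q.receive(now=60)    # receive count exhausted: moved to the DLQ
```

A consumer that finishes in time calls `delete` and the message never reappears; a consumer that crashes simply lets the timeout lapse.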
SQS FIFO’s message group ID allows parallel processing of independent groups while maintaining order within each group:
```
Message Group: "user:42" → messages for user 42 processed in order
Message Group: "user:43" → messages for user 43 processed in order (independently)
Both groups process in parallel, each group strictly ordered.
```

Event Stream (Kafka, Kinesis)
An event stream is a persistent, ordered, append-only log. Messages are not deleted after consumption — they’re retained for a configurable period (or forever). Multiple consumers can read the same data independently.
```mermaid
sequenceDiagram
    participant P as Producer
    participant K as Kafka Topic (3 partitions)
    participant CG1 as Consumer Group A (Analytics)
    participant CG2 as Consumer Group B (Search Index)
    participant CG3 as Consumer Group C (Notifications)
    P->>K: Publish: OrderCreated (key=user:42)
    Note over K: Appended to partition 1<br/>offset 1042
    par Independent consumers, each reads at own pace
        K->>CG1: OrderCreated (offset 1042)
        K->>CG2: OrderCreated (offset 1042)
        K->>CG3: OrderCreated (offset 1042)
    end
    CG1->>CG1: Commit offset 1042
    Note over CG2: CG2 is slow, still processing offset 1040
    CG3->>CG3: Commit offset 1042
    Note over K: Message retained for 7 days<br/>regardless of consumer progress
```

Key Properties
| Property | Behavior |
|---|---|
| Delivery | Each consumer group gets every message; within a group, each partition is consumed by one member |
| Retention | Configurable: time-based (Kafka default: 7 days) or size-based; can be unlimited. Compacted topics instead keep the latest value per key |
| Ordering | Strict within a partition; no ordering across partitions |
| Replay | Reset consumer offset to any point → re-process historical events |
| Scaling | Add partitions (parallelism = min(partitions, consumers in group)) |
| Back-pressure | Consumer lag grows (offset falls behind head); Kafka stores messages regardless |
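To contrast with the queue model, here is a toy single-partition log in which reading never removes anything and each consumer group only moves its own offset. The `EventLog` class and its method names are hypothetical; real Kafka clients expose a different API.

```python
class EventLog:
    """Toy single-partition log: records are retained, groups track their own offsets."""

    def __init__(self):
        self._records = []     # append-only list; the index is the offset
        self._committed = {}   # group name -> next offset to read

    def append(self, event):
        self._records.append(event)
        return len(self._records) - 1        # offset of the new record

    def poll(self, group, max_records=10):
        """Return (offset, event) pairs starting at the group's committed position."""
        start = self._committed.get(group, 0)
        batch = self._records[start:start + max_records]
        return list(enumerate(batch, start=start))

    def commit(self, group, offset):
        # Committing offset N means everything up to and including N is processed.
        self._committed[group] = offset + 1

    def seek(self, group, offset):
        # Replay: move the group's position back to re-read retained events.
        self._committed[group] = offset

    def lag(self, group):
        return len(self._records) - self._committed.get(group, 0)


log = EventLog()
for event in ["OrderCreated", "OrderPaid", "OrderShipped"]:
    log.append(event)

analytics_batch = log.poll("analytics")            # reads all three events
log.commit("analytics", 2)
search_batch = log.poll("search", max_records=1)   # a slower group reads one event
log.commit("search", 0)

log.seek("analytics", 0)                           # rewind and read everything again
replayed = log.poll("analytics")
```

The slow group's lag grows without affecting the fast group, and rewinding one group leaves the other untouched.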
Kafka Partitioning and Consumer Groups
```mermaid
flowchart TD
    subgraph "Topic: orders (3 partitions)"
        P0[Partition 0<br/>offset 0-1500]
        P1[Partition 1<br/>offset 0-1200]
        P2[Partition 2<br/>offset 0-1350]
    end
    subgraph "Consumer Group A (Analytics), 3 members"
        A1[Consumer A1] -.->|assigned| P0
        A2[Consumer A2] -.->|assigned| P1
        A3[Consumer A3] -.->|assigned| P2
    end
    subgraph "Consumer Group B (Search), 2 members"
        B1[Consumer B1] -.->|assigned| P0
        B1 -.->|assigned| P1
        B2[Consumer B2] -.->|assigned| P2
    end
```

Within a consumer group: Each partition is assigned to exactly one consumer. If there are more consumers than partitions, some consumers sit idle. If a consumer dies, its partitions are rebalanced to surviving consumers.
Across consumer groups: Each group maintains its own offset per partition. Group A’s progress is completely independent of Group B’s. This is how Kafka achieves fan-out without duplicating messages.
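The assignment rules above can be sketched with two small functions. This is an illustration under stated assumptions: `partition_for` uses CRC32 as a stand-in for the hash a real Kafka producer applies to keys, and `assign_partitions` hands out contiguous ranges, which is one possible strategy and happens to reproduce the diagram.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition so that the same key always lands in the same one."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(partitions, consumers):
    """Give each group member a contiguous range; every partition has one owner."""
    assignment = {}
    base, extra = divmod(len(partitions), len(consumers))
    start = 0
    for index, consumer in enumerate(consumers):
        count = base + (1 if index < extra else 0)   # first members absorb the remainder
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment


group_a = assign_partitions([0, 1, 2], ["A1", "A2", "A3"])
group_b = assign_partitions([0, 1, 2], ["B1", "B2"])
oversized = assign_partitions([0, 1, 2], ["C1", "C2", "C3", "C4"])
after_failure = assign_partitions([0, 1, 2], ["A1", "A3"])   # A2 died; rebalance
```

`oversized` leaves `C4` with an empty list, which is the idle-consumer case, and `after_failure` shows the survivors absorbing the dead member's partition.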
Kafka Retention and Replay
```
Topic: orders, retention = 7 days

Partition 0:
  offset 0 ─────── offset 500 ─────── offset 1500
     │                 │                  │
 (7 days ago)     (3 days ago)       (now, head)
     ↑
 expired, will be
 garbage collected

 Consumer Group A: offset 1490 (nearly caught up)
 Consumer Group B: offset 800  (about 2 days behind, replaying)
```

Replay use cases:
- Bug fix: replay events through a corrected consumer to rebuild a search index
- New service: a new consumer group starts from offset 0, processing all historical events
- Disaster recovery: rebuild a downstream database from the event stream
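A replay is just a consumer reading retained records again from an earlier offset. The sketch below uses hypothetical record dictionaries and event types to show both halves of the picture: retention drops old records, and a rebuild only sees what is still retained.

```python
DAY = 86_400  # seconds


def expire(records, now, retention_seconds):
    """Drop records older than the retention window; surviving offsets are unchanged."""
    return [r for r in records if now - r["timestamp"] <= retention_seconds]


def rebuild_index(records, from_offset=0):
    """Replay retained events from an offset to rebuild a derived view (here a dict)."""
    index = {}
    for record in records:
        if record["offset"] < from_offset:
            continue
        event = record["event"]
        if event["type"] == "OrderCreated":
            index[event["order_id"]] = "created"
        elif event["type"] == "OrderCancelled":
            index.pop(event["order_id"], None)
    return index


records = [
    {"offset": 0, "timestamp": 0 * DAY, "event": {"type": "OrderCreated", "order_id": "o-1"}},
    {"offset": 1, "timestamp": 4 * DAY, "event": {"type": "OrderCreated", "order_id": "o-2"}},
    {"offset": 2, "timestamp": 6 * DAY, "event": {"type": "OrderCreated", "order_id": "o-3"}},
    {"offset": 3, "timestamp": 7 * DAY, "event": {"type": "OrderCancelled", "order_id": "o-2"}},
]

full_view = rebuild_index(records)                       # replay before anything expires
retained = expire(records, now=8 * DAY, retention_seconds=7 * DAY)
late_view = rebuild_index(retained)                      # replay after offset 0 expired
```

The second rebuild no longer contains `o-1`, so a full rebuild from the stream needs a retention period long enough to cover the history you intend to replay.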
Compacted Topics
For topics where you only care about the latest value per key (not the full history), Kafka supports log compaction:
```
Before compaction:
  offset 1: key=user:42, value={name:"Alice"}
  offset 2: key=user:43, value={name:"Bob"}
  offset 3: key=user:42, value={name:"Alice Smith"}   ← newer for user:42
  offset 4: key=user:43, value=null                   ← tombstone (delete)

After compaction:
  offset 3: key=user:42, value={name:"Alice Smith"}   ← latest value retained
  (user:43 deleted once the tombstone is processed)
```

Used for: Kafka Connect source connectors (CDC snapshots), Kafka Streams changelog topics, configuration distribution.
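The compaction rule can be expressed as a short function over `(offset, key, value)` tuples. This is a simplified sketch: the `compact` function is hypothetical, and a real broker compacts in the background and keeps tombstones around for a configurable time before removing them.

```python
def compact(records):
    """Keep the latest record per key, then drop keys whose latest value is a tombstone."""
    latest = {}
    for offset, key, value in records:
        latest[key] = (offset, value)        # later offsets overwrite earlier ones
    survivors = [
        (offset, key, value)
        for key, (offset, value) in latest.items()
        if value is not None                 # None marks a tombstone (delete)
    ]
    return sorted(survivors, key=lambda record: record[0])   # keep offset order


records = [
    (1, "user:42", {"name": "Alice"}),
    (2, "user:43", {"name": "Bob"}),
    (3, "user:42", {"name": "Alice Smith"}),
    (4, "user:43", None),
]
compacted = compact(records)
```

Offsets are preserved rather than renumbered, so consumers that stored an offset before compaction can keep using it.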
Head-to-Head Comparison
| Dimension | Message Queue (RabbitMQ/SQS) | Event Stream (Kafka/Kinesis) |
|---|---|---|
| Mental model | Task inbox — work items consumed and deleted | Append-only log — events retained and replayed |
| Consumer semantics | Competing consumers (one takes the message) | Consumer groups (each group gets all messages) |
| Retention | Until ACK’d | Time/size-based or indefinite |
| Replay | No | Yes (reset offset) |
| Ordering | Per-queue FIFO | Per-partition FIFO |
| Fan-out | Requires exchange fanout / SNS → SQS | Native via consumer groups |
| Throughput | Thousands–tens of thousands msg/s | Millions msg/s (per cluster) |
| Latency | Sub-millisecond to low milliseconds (RabbitMQ); typically tens of milliseconds (SQS) | Low milliseconds (Kafka batches writes for throughput) |
| Message size | Flexible (RabbitMQ: configurable; SQS: 256KB) | Default 1MB (configurable) |
| Complexity | Low (RabbitMQ) / Very low (SQS) | High (cluster ops, partitioning, rebalancing) |
When to Use a Queue
Queues excel at task distribution — work that should be done exactly once by one worker.
| Use Case | Why Queue |
|---|---|
| Background jobs (send email, generate PDF, resize image) | One worker processes one job. Job is done after ACK. No replay needed. |
| Rate limiting / throttling | Queue buffers bursts; consumers pull at their own pace |
| Request-reply (RPC over messaging) | RabbitMQ’s reply-to + correlation-id pattern; temporary response queue |
| Delayed processing | SQS delay queues (up to 15 min); RabbitMQ dead-letter exchange with TTL |
| Work that must not duplicate | A message is held by one consumer at a time, so two workers do not normally pick up the same task; keep handlers idempotent because redelivery is still possible |
When to Use a Stream
Streams excel at event distribution — facts about what happened that multiple systems need to react to.
| Use Case | Why Stream |
|---|---|
| Audit log / event sourcing | Immutable, ordered record of everything that happened. Replay to rebuild state. |
| Fan-out to multiple systems | Order created → analytics, search indexer, notification service, fraud detection — each reads independently |
| Real-time analytics | Kafka Streams / Flink consume events for windowed aggregations, anomaly detection |
| Change Data Capture (CDC) | Debezium → Kafka → multiple downstream consumers (cache invalidation, search sync, data lake) |
| Inter-service communication (event-driven architecture) | Services publish domain events; other services consume what they need |
Hybrid Patterns
Most production systems use both — Kafka for the durable event backbone, queues for specific work distribution.
```mermaid
flowchart TD
    subgraph Event Backbone
        P[Order Service] -->|OrderCreated| K[Kafka Topic: orders]
    end
    K --> A[Analytics Consumer Group<br/>Writes to data warehouse]
    K --> S[Search Consumer Group<br/>Updates Elasticsearch]
    K --> Bridge[Queue Bridge Consumer]
    subgraph Task Distribution
        Bridge -->|one task per order| SQS[SQS Queue: send-confirmation]
        SQS --> W1[Email Worker 1]
        SQS --> W2[Email Worker 2]
    end
```

Pattern: Kafka → SQS bridge
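Kafka retains the event and fans it out to all interested consumer groups. One of those consumers is a “bridge” that enqueues a task into SQS for work that needs single-consumer semantics (sending a confirmation email). The email workers pull from SQS, and the visibility timeout keeps each task with one worker at a time. Because a standard queue can still deliver a message twice, use a FIFO queue with a deduplication ID or an idempotent send when a duplicate email is unacceptable.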
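The bridge itself is a small loop. The sketch below stands in for it with plain Python structures: a list plays the Kafka partition, a `deque` plays the SQS queue, and `run_bridge`, the task fields, and the `dedup_id` format are all hypothetical names chosen for illustration.

```python
from collections import deque


def run_bridge(events, committed_offset, task_queue):
    """Turn each OrderCreated event after committed_offset into one email task.

    Returns the new committed offset. The offset advances only after the task
    is enqueued, so a crash in between causes a repeat enqueue, not a lost task.
    """
    for offset in range(committed_offset + 1, len(events)):
        event = events[offset]
        if event["type"] == "OrderCreated":
            task_queue.append({
                "task": "send-confirmation",
                "order_id": event["order_id"],
                # Hypothetical dedup key so the queue or the workers can drop repeats.
                "dedup_id": f"confirm-{event['order_id']}",
            })
        committed_offset = offset
    return committed_offset


events = [
    {"type": "OrderCreated", "order_id": "o-1"},
    {"type": "OrderShipped", "order_id": "o-1"},
    {"type": "OrderCreated", "order_id": "o-2"},
]
tasks = deque()
committed = run_bridge(events, -1, tasks)          # first pass over the partition
committed_again = run_bridge(events, committed, tasks)   # nothing new to bridge
```

Deriving the dedup key from the order ID rather than from the offset means a repeated enqueue after a crash produces the same key, which is what lets a FIFO queue or an idempotent worker discard it.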
Why not just use Kafka for everything?
- Kafka consumers within a group already get per-partition single delivery — but rebalancing can cause duplicates (at-least-once)
- SQS visibility timeout + DLQ provides simpler retry/failure semantics for task processing
- SQS FIFO deduplicates repeated sends within a 5-minute window with little or no application code
- Operational simplicity: SQS is fully managed with no cluster to run
Why not just use SQS for everything?
- No replay — once consumed, the message is gone
- No fan-out without SNS → multiple SQS queues (complex topology)
- No consumer groups — each system needs its own queue with duplicated messages
- No ordering guarantees in standard SQS; FIFO has throughput limits
Delivery Guarantees Across Both
| System | Default Guarantee | Exactly-Once Option |
|---|---|---|
| RabbitMQ | At-least-once (publisher confirms + consumer ACK) | No native exactly-once; use idempotent consumers |
| SQS Standard | At-least-once (may deliver duplicates) | No |
| SQS FIFO | Exactly-once (dedup within 5-min window) | Yes (built-in) |
| Kafka | At-least-once (default) | Idempotent producer + transactions (within Kafka) |
| Kinesis | At-least-once | No native exactly-once; use idempotent consumers |
Regardless of the system, design consumers to be idempotent. Network failures, rebalancing, and retries can always cause duplicates at the application level — even when the broker provides exactly-once semantics internally.
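One common way to make a consumer idempotent is to remember which message IDs have already been handled and skip the side effect on a repeat. The sketch below assumes each message carries a stable `id` field; the class name and message shape are hypothetical, and the in-memory set stands in for a durable store.

```python
class IdempotentEmailSender:
    """Apply each message ID's side effect at most once, even on redelivery."""

    def __init__(self):
        self._processed = set()   # a real system would use a durable store
        self.sent = []            # stands in for the actual email send

    def handle(self, message):
        message_id = message["id"]
        if message_id in self._processed:
            return False          # duplicate delivery: skip the side effect
        self.sent.append(message["to"])
        self._processed.add(message_id)
        return True


sender = IdempotentEmailSender()
message = {"id": "confirm-o-1", "to": "user:42"}
first_attempt = sender.handle(message)
redelivery = sender.handle(message)    # broker retried after a lost ACK
```

In this sketch the send and the bookkeeping are two separate steps, so a crash between them can still cause one repeat; closing that gap requires recording the ID and performing the effect atomically, or making the effect itself safe to repeat.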
Interview framing: “For the notification fan-out, I’d use Kafka — one OrderCreated event is consumed independently by analytics, search, and notifications. For the actual email sending, I’d bridge from Kafka to SQS — that gives us single-consumer delivery with dead-letter queues for failed sends. Kafka handles the durable event log and fan-out; SQS handles the task distribution with simpler retry semantics.”