Message Queues vs Event Streams

Message queues and event streams both decouple producers from consumers, but they have fundamentally different semantics. A message queue distributes work — each message goes to exactly one consumer and is deleted after acknowledgement. An event stream is a persistent log — messages are retained, and multiple independent consumers can each read the entire stream at their own pace.

Choosing the wrong one causes architectural pain: a queue where you need fan-out means duplicating messages to N queues; a stream where you need single-consumer task processing means building consumer-group coordination that a queue gives you for free.
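
The semantic split can be made concrete with a tiny in-memory sketch (illustrative only: `Queue` and `Stream` here are my own toy classes, not any broker's API):

```python
from collections import deque

class Queue:
    """Work distribution: a message is consumed once, then gone."""
    def __init__(self):
        self._messages = deque()

    def enqueue(self, msg):
        self._messages.append(msg)

    def dequeue(self):
        # Delivered to exactly one consumer; delete-on-ACK is
        # modeled as delete-on-read for brevity.
        return self._messages.popleft() if self._messages else None

class Stream:
    """Persistent log: every consumer group reads the full history
    at its own pace via an independent cursor."""
    def __init__(self):
        self._log = []          # append-only, never deleted
        self._offsets = {}      # one cursor per consumer group

    def publish(self, event):
        self._log.append(event)

    def read(self, group):
        offset = self._offsets.get(group, 0)
        if offset >= len(self._log):
            return None
        self._offsets[group] = offset + 1
        return self._log[offset]

q = Queue()
q.enqueue("resize img:99")
q.dequeue()                      # consumed by one worker
assert q.dequeue() is None       # gone for everyone else

s = Stream()
s.publish("OrderCreated")
assert s.read("analytics") == "OrderCreated"
assert s.read("search") == "OrderCreated"    # independent fan-out
```

The rest of this page is essentially these two data structures with durability, partitioning, and failure handling layered on top.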

Message Queue (RabbitMQ, SQS)

A message queue is designed for work distribution: a producer enqueues a task, and exactly one consumer dequeues and processes it.

sequenceDiagram
    participant P as Producer
    participant Q as Queue (RabbitMQ / SQS)
    participant C1 as Consumer 1
    participant C2 as Consumer 2
    participant C3 as Consumer 3

    P->>Q: Enqueue: "send email to user:42"
    P->>Q: Enqueue: "send email to user:43"
    P->>Q: Enqueue: "resize image img:99"

    Q->>C1: Deliver: "send email to user:42"
    Q->>C2: Deliver: "send email to user:43"
    Q->>C3: Deliver: "resize image img:99"

    C1->>Q: ACK ✓ (message deleted from queue)
    C2->>Q: ACK ✓ (message deleted from queue)
    C3->>Q: NACK ✗ (message re-queued for retry)

    Q->>C1: Redeliver: "resize image img:99"
    C1->>Q: ACK ✓

Key Properties

| Property | Behavior |
|---|---|
| Delivery | Each message delivered to exactly one consumer (competing consumers pattern) |
| Retention | Message deleted after consumer ACK |
| Ordering | FIFO within a single queue (best-effort in SQS standard; strict in SQS FIFO) |
| Replay | Not possible — message is gone after ACK |
| Scaling | Add more consumers to process faster (horizontal work distribution) |
| Back-pressure | Queue depth grows when consumers are slow — natural buffer |

RabbitMQ Specifics

RabbitMQ adds exchanges between producers and queues, supporting multiple routing patterns:

flowchart LR
    P[Producer] --> E[Exchange]

    subgraph "Exchange Types"
        E -->|Direct| Q1[Queue: email]
        E -->|Direct| Q2[Queue: sms]
        E -->|Fanout| Q3[Queue: audit-1]
        E -->|Fanout| Q4[Queue: audit-2]
        E -->|Topic: order.*| Q5[Queue: order-processing]
    end

    Q1 --> C1[Email Worker]
    Q2 --> C2[SMS Worker]
    Q3 --> C3[Audit Logger 1]
    Q4 --> C4[Audit Logger 2]
    Q5 --> C5[Order Processor]

| Exchange Type | Routing | Use Case |
|---|---|---|
| Direct | Routes to queue whose binding key exactly matches the routing key | Task routing by type (email, sms, push) |
| Fanout | Broadcasts to all bound queues (ignores routing key) | Notifications to all services |
| Topic | Pattern matching on routing key (`order.*`, `#.error`) | Event filtering by category |
| Headers | Routes based on message header attributes | Complex routing logic |

RabbitMQ also supports publisher confirms (producer gets ACK when message is durably written) and consumer prefetch (limit how many unacked messages a consumer holds — prevents one slow consumer from hoarding all messages).
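
Prefetch can be sketched as a cap on in-flight deliveries per consumer. This is a toy model, not the pika API (the real knob is `channel.basic_qos(prefetch_count=N)`):

```python
from collections import deque

class Broker:
    """Toy model: deliver to a consumer only while its count of
    unacknowledged messages is below the prefetch limit."""
    def __init__(self, prefetch=2):
        self.queue = deque()
        self.prefetch = prefetch
        self.unacked = {}        # consumer -> in-flight message count

    def deliver(self, consumer):
        if self.unacked.get(consumer, 0) >= self.prefetch:
            return None          # consumer is at its prefetch cap
        if not self.queue:
            return None
        self.unacked[consumer] = self.unacked.get(consumer, 0) + 1
        return self.queue.popleft()

    def ack(self, consumer):
        self.unacked[consumer] -= 1

b = Broker(prefetch=2)
b.queue.extend(["t1", "t2", "t3"])
assert b.deliver("slow") == "t1"
assert b.deliver("slow") == "t2"
assert b.deliver("slow") is None    # capped at 2 unacked messages
assert b.deliver("fast") == "t3"    # t3 stays available to other consumers
```

Without the cap, the first consumer to connect would buffer everything, starving the others even if it processes slowly.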

SQS Specifics

| Feature | SQS Standard | SQS FIFO |
|---|---|---|
| Throughput | Unlimited | 300 msg/s (3,000 with batching) |
| Ordering | Best-effort (may reorder) | Strict FIFO per message group |
| Deduplication | None (at-least-once) | Content-based or explicit dedup ID (exactly-once within 5-min window) |
| Visibility timeout | Message hidden from other consumers while being processed; returned to queue on timeout | Same |
| Dead-letter queue | After N failed attempts, message moved to DLQ for investigation | Same |
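
The visibility-timeout lifecycle can be sketched in a few lines (a toy model of the SQS behavior, not the boto3 API; names are my own):

```python
import time

class VisibilityQueue:
    """Toy model of the SQS visibility timeout: a received message is
    hidden, not deleted; it reappears if not deleted before the timeout."""
    def __init__(self, timeout=30.0):
        self.timeout = timeout
        self.messages = []               # [body, invisible_until]

    def send(self, body):
        self.messages.append([body, 0.0])

    def receive(self, now=None):
        now = time.time() if now is None else now
        for m in self.messages:
            if m[1] <= now:              # currently visible
                m[1] = now + self.timeout  # hide from other consumers
                return m[0]
        return None

    def delete(self, body):
        """The consumer's ACK: only now is the message really gone."""
        self.messages = [m for m in self.messages if m[0] != body]

q = VisibilityQueue(timeout=30)
q.send("task-1")
assert q.receive(now=0) == "task-1"    # worker A takes it
assert q.receive(now=10) is None       # hidden from worker B meanwhile
assert q.receive(now=31) == "task-1"   # A never deleted it: redelivered
```

This is why a processing time longer than the visibility timeout silently causes duplicate processing: the broker assumes the worker died.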

SQS FIFO’s message group ID allows parallel processing of independent groups while maintaining order within each group:

Message Group: "user:42" → messages for user 42 processed in order
Message Group: "user:43" → messages for user 43 processed in order (independently)

Both groups process in parallel, each group strictly ordered.
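
The message-group rule boils down to sticky routing: every message with the same group ID goes to the same ordered lane. A rough sketch (my own simplification, using Python's `hash` for routing):

```python
from collections import defaultdict

def assign_lanes(messages, num_workers):
    """Route every message with the same group ID to the same worker,
    preserving arrival order within each group (toy model of SQS FIFO
    message groups)."""
    lanes = defaultdict(list)
    for group_id, body in messages:
        worker = hash(group_id) % num_workers   # sticky per-group routing
        lanes[worker].append((group_id, body))
    return lanes

msgs = [
    ("user:42", "created"), ("user:43", "created"),
    ("user:42", "paid"),    ("user:43", "cancelled"),
]
lanes = assign_lanes(msgs, num_workers=4)

# Any single group's messages stay in order inside whichever lane got them:
for lane in lanes.values():
    u42 = [body for g, body in lane if g == "user:42"]
    assert u42 in ([], ["created", "paid"])
```

Kafka's key-based partitioning (below) is the same idea with partitions as the lanes.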

Event Stream (Kafka, Kinesis)

An event stream is a persistent, ordered, append-only log. Messages are not deleted after consumption — they’re retained for a configurable period (or forever). Multiple consumers can read the same data independently.

sequenceDiagram
    participant P as Producer
    participant K as Kafka Topic (3 partitions)
    participant CG1 as Consumer Group A (Analytics)
    participant CG2 as Consumer Group B (Search Index)
    participant CG3 as Consumer Group C (Notifications)

    P->>K: Publish: OrderCreated (key=user:42)
    Note over K: Appended to partition 1<br/>offset 1042

    par Independent consumers — each reads at own pace
        K->>CG1: OrderCreated (offset 1042)
        K->>CG2: OrderCreated (offset 1042)
        K->>CG3: OrderCreated (offset 1042)
    end

    CG1->>CG1: Commit offset 1042
    Note over CG2: CG2 is slow — still processing offset 1040
    CG3->>CG3: Commit offset 1042

    Note over K: Message retained for 7 days<br/>regardless of consumer progress

Key Properties

| Property | Behavior |
|---|---|
| Delivery | Each consumer group gets every message; within a group, each partition is consumed by one member |
| Retention | Configurable: time-based (7 days default) or size-based; can be infinite (compacted topics) |
| Ordering | Strict within a partition; no ordering across partitions |
| Replay | Reset consumer offset to any point → re-process historical events |
| Scaling | Add partitions (parallelism = min(partitions, consumers in group)) |
| Back-pressure | Consumer lag grows (offset falls behind head); Kafka stores messages regardless |

Kafka Partitioning and Consumer Groups

flowchart TD
    subgraph "Topic: orders (3 partitions)"
        P0[Partition 0<br/>offset 0-1500]
        P1[Partition 1<br/>offset 0-1200]
        P2[Partition 2<br/>offset 0-1350]
    end

    subgraph "Consumer Group A (Analytics) — 3 members"
        A1[Consumer A1] -.->|assigned| P0
        A2[Consumer A2] -.->|assigned| P1
        A3[Consumer A3] -.->|assigned| P2
    end

    subgraph "Consumer Group B (Search) — 2 members"
        B1[Consumer B1] -.->|assigned| P0
        B1 -.->|assigned| P1
        B2[Consumer B2] -.->|assigned| P2
    end

Within a consumer group: Each partition is assigned to exactly one consumer. If there are more consumers than partitions, some consumers sit idle. If a consumer dies, its partitions are rebalanced to surviving consumers.

Across consumer groups: Each group maintains its own offset per partition. Group A’s progress is completely independent of Group B’s. This is how Kafka achieves fan-out without duplicating messages.
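
Both rules can be sketched in a few lines (a simplified stand-in for Kafka's built-in assignors):

```python
def assign_partitions(partitions, consumers):
    """Round-robin: each partition goes to exactly one consumer in the
    group. Extra consumers get nothing; a smaller group gets multiple
    partitions per member."""
    if not consumers:
        return {}
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

parts = [0, 1, 2]

a = assign_partitions(parts, ["A1", "A2", "A3"])
assert a == {"A1": [0], "A2": [1], "A3": [2]}   # one partition each

b = assign_partitions(parts, ["B1", "B2"])
assert b == {"B1": [0, 2], "B2": [1]}           # B1 covers two partitions

idle = assign_partitions(parts, ["C1", "C2", "C3", "C4"])
assert idle["C4"] == []     # more consumers than partitions: one sits idle
```

A rebalance is just re-running this assignment over the surviving members.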

Kafka Retention and Replay

Topic: orders, retention = 7 days

Partition 0:
  offset 0 ─────── offset 500 ─────── offset 1500
  │                 │                  │
  (7 days ago)      (3 days ago)       (now, head)
  ↑                                    ↑
  expired, will be    Consumer Group A: offset 1490 (nearly caught up)
  garbage collected   Consumer Group B: offset 800 (3 days behind — replaying)

Replay use cases:

  • Bug fix: replay events through a corrected consumer to rebuild a search index
  • New service: a new consumer group starts from offset 0, processing all historical events
  • Disaster recovery: rebuild a downstream database from the event stream
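
Replay is just moving a group's cursor backward over a retained log. A sketch with an in-memory partition (not the Kafka API; the real mechanisms are `Consumer.seek` or `kafka-consumer-groups.sh --reset-offsets`):

```python
class Partition:
    def __init__(self):
        self.log = []            # retained events, never deleted here
        self.offsets = {}        # group -> next offset to read

    def append(self, event):
        self.log.append(event)

    def poll(self, group):
        """Return everything from the group's cursor to the head."""
        off = self.offsets.get(group, 0)
        batch = self.log[off:]
        self.offsets[group] = len(self.log)
        return batch

    def seek(self, group, offset):
        self.offsets[group] = offset   # this is all "replay" is

p = Partition()
for e in ["e0", "e1", "e2"]:
    p.append(e)

assert p.poll("search") == ["e0", "e1", "e2"]
assert p.poll("search") == []            # caught up

p.seek("search", 0)                      # bug fixed: rebuild the index
assert p.poll("search") == ["e0", "e1", "e2"]
```

A brand-new consumer group is the degenerate case: its cursor defaults to 0 (or to the head, depending on configuration) and it reads all retained history.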

Compacted Topics

For topics where you only care about the latest value per key (not the full history), Kafka supports log compaction:

Before compaction:
  offset 1: key=user:42, value={name:"Alice"}
  offset 2: key=user:43, value={name:"Bob"}
  offset 3: key=user:42, value={name:"Alice Smith"}   ← newer for user:42
  offset 4: key=user:43, value=null                    ← tombstone (delete)

After compaction:
  offset 3: key=user:42, value={name:"Alice Smith"}    ← latest value retained
  (user:43 deleted — tombstone processed)

Used for: Kafka Connect source connectors (CDC snapshots), Kafka Streams changelog topics, configuration distribution.
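
The keep-latest-per-key rule, applied to the example above (a sketch; real compaction runs in the background segment by segment, and retains tombstones for a grace period before dropping the key):

```python
def compact(log):
    """Keep only the highest-offset record per key; drop keys whose
    latest record is a tombstone (value=None)."""
    latest = {}
    for offset, key, value in log:
        latest[key] = (offset, value)    # later records overwrite earlier
    return sorted(
        (offset, key, value)
        for key, (offset, value) in latest.items()
        if value is not None             # tombstone: key removed entirely
    )

log = [
    (1, "user:42", {"name": "Alice"}),
    (2, "user:43", {"name": "Bob"}),
    (3, "user:42", {"name": "Alice Smith"}),
    (4, "user:43", None),                # tombstone for user:43
]
assert compact(log) == [(3, "user:42", {"name": "Alice Smith"})]
```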

Head-to-Head Comparison

| Dimension | Message Queue (RabbitMQ/SQS) | Event Stream (Kafka/Kinesis) |
|---|---|---|
| Mental model | Task inbox — work items consumed and deleted | Append-only log — events retained and replayed |
| Consumer semantics | Competing consumers (one takes the message) | Consumer groups (each group gets all messages) |
| Retention | Until ACK’d | Time/size-based or indefinite |
| Replay | No | Yes (reset offset) |
| Ordering | Per-queue FIFO | Per-partition FIFO |
| Fan-out | Requires exchange fanout / SNS → SQS | Native via consumer groups |
| Throughput | Thousands–tens of thousands msg/s | Millions msg/s (per cluster) |
| Latency | Sub-millisecond (RabbitMQ) | Low milliseconds (Kafka batches for throughput) |
| Message size | Flexible (RabbitMQ: configurable; SQS: 256 KB) | Default 1 MB (configurable) |
| Complexity | Low (RabbitMQ) / very low (SQS) | High (cluster ops, partitioning, rebalancing) |

When to Use a Queue

Queues excel at task distribution — work that should be done exactly once by one worker.

| Use Case | Why Queue |
|---|---|
| Background jobs (send email, generate PDF, resize image) | One worker processes one job. Job is done after ACK. No replay needed. |
| Rate limiting / throttling | Queue buffers bursts; consumers pull at their own pace |
| Request-reply (RPC over messaging) | RabbitMQ’s reply-to + correlation-id pattern; temporary response queue |
| Delayed processing | SQS delay queues (up to 15 min); RabbitMQ dead-letter exchange with TTL |
| Work that must not duplicate | Queue’s single-consumer delivery prevents two workers processing the same task |

When to Use a Stream

Streams excel at event distribution — facts about what happened that multiple systems need to react to.

| Use Case | Why Stream |
|---|---|
| Audit log / event sourcing | Immutable, ordered record of everything that happened. Replay to rebuild state. |
| Fan-out to multiple systems | Order created → analytics, search indexer, notification service, fraud detection — each reads independently |
| Real-time analytics | Kafka Streams / Flink consume events for windowed aggregations, anomaly detection |
| Change Data Capture (CDC) | Debezium → Kafka → multiple downstream consumers (cache invalidation, search sync, data lake) |
| Inter-service communication (event-driven architecture) | Services publish domain events; other services consume what they need |

Hybrid Patterns

Most production systems use both — Kafka for the durable event backbone, queues for specific work distribution.

flowchart TD
    subgraph Event Backbone
        P[Order Service] -->|OrderCreated| K[Kafka Topic: orders]
    end

    K --> A[Analytics Consumer Group<br/>Writes to data warehouse]
    K --> S[Search Consumer Group<br/>Updates Elasticsearch]
    K --> Bridge[Queue Bridge Consumer]

    subgraph Task Distribution
        Bridge -->|one task per order| SQS[SQS Queue: send-confirmation]
        SQS --> W1[Email Worker 1]
        SQS --> W2[Email Worker 2]
    end

Pattern: Kafka → SQS bridge

Kafka retains the event and fans it out to all interested consumer groups. One of those consumers is a “bridge” that enqueues a task into SQS for work that needs single-consumer delivery semantics (sending a confirmation email). The email workers pull from SQS, where the visibility timeout keeps two workers from processing the same send concurrently and the DLQ captures repeated failures.
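
The whole pattern fits in a toy end-to-end sketch (in-memory stand-ins for the Kafka topic and SQS queue; a production bridge would be a Kafka consumer calling `sqs.send_message`):

```python
from collections import deque

event_log = []                 # stand-in for the Kafka topic (retained)
group_offsets = {}             # per-consumer-group cursors
task_queue = deque()           # stand-in for the SQS queue (consume-and-delete)

def publish(event):
    event_log.append(event)

def poll(group):
    """Each group reads from its own cursor to the head."""
    off = group_offsets.get(group, 0)
    batch = event_log[off:]
    group_offsets[group] = len(event_log)
    return batch

publish({"type": "OrderCreated", "order_id": 7})

# Fan-out: every group sees the event...
analytics = poll("analytics")
search = poll("search")

# ...including the bridge, which turns it into a single-consumer task.
for event in poll("bridge"):
    task_queue.append(("send-confirmation", event["order_id"]))

assert analytics == search == [{"type": "OrderCreated", "order_id": 7}]
assert task_queue.popleft() == ("send-confirmation", 7)
assert not task_queue          # one task, one worker, then gone
```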

Why not just use Kafka for everything?

  • Kafka consumers within a group already get per-partition single delivery — but rebalancing can cause duplicates (at-least-once)
  • SQS visibility timeout + DLQ provides simpler retry/failure semantics for task processing
  • SQS FIFO provides exactly-once within a 5-minute window with zero application code
  • Operational simplicity: SQS is fully managed with no cluster to run

Why not just use SQS for everything?

  • No replay — once consumed, the message is gone
  • No fan-out without SNS → multiple SQS queues (complex topology)
  • No consumer groups — each system needs its own queue with duplicated messages
  • No ordering guarantees in standard SQS; FIFO has throughput limits

Delivery Guarantees Across Both

| System | Default Guarantee | Exactly-Once Option |
|---|---|---|
| RabbitMQ | At-least-once (publisher confirms + consumer ACK) | No native exactly-once; use idempotent consumers |
| SQS Standard | At-least-once (may deliver duplicates) | No |
| SQS FIFO | Exactly-once (dedup within 5-min window) | Yes (built-in) |
| Kafka | At-least-once (default) | Idempotent producer + transactions (within Kafka) |
| Kinesis | At-least-once | No native exactly-once; use idempotent consumers |

Regardless of the system, design consumers to be idempotent. Network failures, rebalancing, and retries can always cause duplicates at the application level — even when the broker provides exactly-once semantics internally.
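
A minimal idempotency guard: derive a stable ID per message and skip IDs already processed. Here the store is an in-memory set; in production a database unique constraint or Redis `SETNX` plays that role:

```python
processed = set()   # in production: a durable store shared by all workers
sent = []

def handle_email(message):
    """Idempotent consumer: calling it twice with the same message
    performs the side effect only once."""
    msg_id = message["id"]           # stable ID chosen by the producer
    if msg_id in processed:
        return "skipped-duplicate"
    sent.append(message["to"])       # the side effect we must not repeat
    processed.add(msg_id)
    return "sent"

msg = {"id": "order-7-confirmation", "to": "user42@example.com"}
assert handle_email(msg) == "sent"
assert handle_email(msg) == "skipped-duplicate"   # redelivery is harmless
assert sent == ["user42@example.com"]
```

Note the sketch ignores the race between check and side effect; a real implementation makes that pair atomic (e.g. the unique-constraint insert is the check).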

ℹ️ Interview framing: “For the notification fan-out, I’d use Kafka — one OrderCreated event is consumed independently by analytics, search, and notifications. For the actual email sending, I’d bridge from Kafka to SQS — that gives us single-consumer delivery with dead-letter queues for failed sends. Kafka handles the durable event log and fan-out; SQS handles the task distribution with simpler retry semantics.”