Key-Value Stores (Redis)
Redis is an in-memory data structure store. It is not just a cache — it is a shared, persistent, structured state layer that multiple services can read and write concurrently. Understanding what Redis actually is (versus what people assume it is) is the difference between using it well and misusing it.
Why Redis is Fast
Redis processes commands on a single thread in an event loop. This eliminates lock contention entirely — no mutex overhead, no deadlocks, no context switching between threads per request. Network I/O is handled via epoll / kqueue (non-blocking I/O multiplexing), so a single thread serves thousands of clients simultaneously.
Client A ──┐
Client B ──┤──► Event loop (single thread) ──► In-memory data structure ──► Response
Client C ──┘ processes one command
at a time, atomicallyAll single-command operations are inherently atomic — no two commands interleave. This is why INCR, SETNX, and LPUSH are safe for concurrent access without client-side locking.
Pipelining: Multiple commands can be batched into one network round-trip. Without pipelining, 10 commands at 1ms RTT = 10ms. With pipelining, 10 commands = 1ms + execution time. Critical for high-throughput paths.
# Without pipelining: 3 round trips
SET key1 val1
SET key2 val2
SET key3 val3
# With pipelining: 1 round trip
PIPELINE
SET key1 val1
SET key2 val2
SET key3 val3
EXECUTEData Structures
Redis exposes typed data structures, not a generic blob store. Choosing the right structure eliminates application-level logic that would otherwise require a separate read-modify-write cycle.
| Type | Encoding (internal) | Key operations | Use case |
|---|---|---|---|
| String | Raw bytes, int, embstr | GET, SET, INCR, EXPIRE, SETNX | Cache, counters, distributed locks, feature flags |
| List | Listpack (small) / Quicklist | LPUSH, RPUSH, LPOP, LRANGE, BLPOP | Message queue (FIFO/LIFO), activity feed, job queue |
| Set | Listpack / Hash table | SADD, SREM, SISMEMBER, SUNION, SINTER | Unique visitor tracking, tag systems, friend graph intersection |
| Sorted Set (ZSET) | Listpack / Skip list + hash table | ZADD, ZRANGE, ZRANK, ZRANGEBYSCORE | Leaderboards, priority queues, rate limiting windows, geo index |
| Hash | Listpack / Hash table | HGET, HSET, HMGET, HINCRBY | User profile (field-per-attribute), object storage |
| Bitmap | String (bit array) | SETBIT, GETBIT, BITCOUNT, BITOP | Daily active user tracking (1 bit per user ID), feature flags |
| HyperLogLog | Probabilistic structure | PFADD, PFCOUNT | Unique count estimation with ±0.81% error, O(1) space (~12 KB) |
| Stream | Radix tree of listpacks | XADD, XREAD, XREADGROUP, XACK | Durable message log with consumer groups; Kafka-lite |
| Geo | Sorted Set (geohash as score) | GEOADD, GEODIST, GEORADIUS | Points-of-interest nearby, delivery radius |
Sorted Sets in Depth
Sorted sets are the most versatile structure. Each member has an associated float score; members are always iterated in score order.
# Leaderboard: score = game points
ZADD leaderboard 9850 "alice"
ZADD leaderboard 8200 "bob"
ZADD leaderboard 9850 "carol" # tie with alice
ZREVRANGE leaderboard 0 9 WITHSCORES # top 10, highest score first
ZRANK leaderboard "bob" # rank of bob (0-indexed)
ZRANGEBYSCORE leaderboard 8000 9000 # members between score 8000–9000# Sliding window rate limiter: score = timestamp (unix ms)
ZADD user:123:requests 1700000000123 "req_abc"
ZREMRANGEBYSCORE user:123:requests 0 (now - window_ms)
ZCARD user:123:requests # current count in windowStreams in Depth
Streams are a persistent, append-only log with consumer group support. Unlike pub/sub (fire-and-forget), stream messages survive consumer restarts.
# Producer
XADD orders * user_id 42 item_id 99 qty 1
↑ auto-generated ID: timestamp-sequence
# Consumer group (multiple consumers, each gets a unique message)
XGROUP CREATE orders fulfillment $ MKSTREAM
XREADGROUP GROUP fulfillment worker1 COUNT 10 STREAMS orders >
XACK orders fulfillment <message-id> # mark processedStream vs Pub/Sub: Pub/Sub is fire-and-forget — if no subscriber is listening, the message is lost. Streams persist messages; a consumer that restarts reads from where it left off. Use XADD key MAXLEN ~ N ... to cap a stream at approximately N entries — Redis trims old messages as new ones arrive, preventing unbounded memory growth.
Persistence
Redis is in-memory — a restart loses all data unless persistence is configured.
RDB (Redis Database Snapshot)
Periodically forks the process and writes the entire dataset to a compact binary file (dump.rdb).
# redis.conf
save 900 1 # snapshot if at least 1 key changed in 900s
save 300 10 # snapshot if at least 10 keys changed in 300s
save 60 10000 # snapshot if at least 10000 keys changed in 60sFork + copy-on-write: The parent continues serving requests. The child has a point-in-time view of memory (copy-on-write pages). The child writes the snapshot without blocking the parent — but a write-heavy workload doubles memory usage temporarily during the fork.
| RDB | |
|---|---|
| Recovery time | Fast (load binary file) |
| Data loss on crash | Up to the last snapshot interval (minutes) |
| Disk size | Compact |
| Performance impact | Fork overhead; brief pause on very large datasets |
AOF (Append-Only File)
Every write command is appended to a log file (appendonly.aof). On restart, Redis replays the log to reconstruct state.
# redis.conf
appendonly yes
appendfsync everysec # flush to disk every second (default, recommended)
# appendfsync always # flush every command — max durability, lowest throughput
# appendfsync no # OS decides when to flush — fastest, highest data loss riskAOF rewrite: The log grows without bound. BGREWRITEAOF compacts it by rewriting the minimal set of commands to reconstruct current state (e.g., 1000 INCR key commands become SET key 1000).
| AOF | |
|---|---|
| Recovery time | Slower (replay every command) |
| Data loss on crash | At most 1 second (appendfsync everysec) |
| Disk size | Larger (command log, compacted by rewrite) |
| Performance impact | Continuous write to append log |
Hybrid Mode (Recommended)
Redis 4.0+. RDB snapshot as the base; AOF records only changes since the last snapshot. Combines fast recovery (load RDB) with minimal data loss (replay short AOF delta).
aof-use-rdb-preamble yes # enable hybrid modeHigh Availability: Redis Sentinel
Sentinel is an autonomous HA system — it monitors, notifies, and automatically fails over when a primary becomes unavailable.
┌─────────────┐
┌──────►│ Sentinel 1 │◄──────┐
│ └─────────────┘ │
│ │
┌───────────┴──┐ quorum (≥2 agree) ┌───┴──────────┐
│ Sentinel 2 │──────────────────────│ Sentinel 3 │
└───────────┬──┘ └──┬───────────┘
│ ┌─────────────┐ │
└────►│ Primary │◄───────┘
└──────┬──────┘
writes│
┌─────────┴──────────┐
┌───┴───┐ ┌────┴────┐
│ Rep1 │ │ Rep2 │
└───────┘ └─────────┘Failover process:
sequenceDiagram
participant S1 as Sentinel 1
participant S2 as Sentinel 2
participant S3 as Sentinel 3
participant P as Primary
participant R1 as Replica 1
participant C as Client
Note over P: Primary becomes unreachable
S1->>P: PING (no response)
S2->>P: PING (no response)
S3->>P: PING (no response)
S1->>S2: is-master-down-by-addr?
S2-->>S1: yes
S1->>S3: is-master-down-by-addr?
S3-->>S1: yes
Note over S1,S3: Quorum (2 of 3) — primary is down
S1->>S2: Request Sentinel leader vote
S2-->>S1: vote granted
Note over S1: Elected Sentinel leader
S1->>R1: SLAVEOF NO ONE (promote to primary)
R1-->>S1: OK — now primary
S1->>S3: Reconfigure: replicate from R1
C->>S1: SENTINEL get-master-addr-by-name mymaster
S1-->>C: R1's IP:port
C->>R1: Connect to new primaryClient requirement: Clients must be Sentinel-aware — they ask Sentinel for the current primary rather than connecting to a hardcoded IP.
Sentinel is not horizontal scale — there is still one primary handling all writes. Sentinel provides automatic failover, not sharding.
Horizontal Scale: Redis Cluster
Redis Cluster partitions the keyspace across multiple primary nodes using hash slots.
16384 hash slots distributed across N primary nodes:
Node A: slots 0–5460 (+ 1 or more replicas)
Node B: slots 5461–10922 (+ 1 or more replicas)
Node C: slots 10923–16383 (+ 1 or more replicas)Key-to-slot mapping: slot = CRC16(key) % 16384. Every client computes this locally. If a request lands on the wrong node, the node returns a MOVED redirection.
sequenceDiagram
participant C as Client
participant A as Node A (slots 0–5460)
participant B as Node B (slots 5461–10922)
C->>A: GET user:42
Note over A: CRC16("user:42") % 16384 = 7638
slot 7638 is not mine
A->>C: MOVED 7638 node-b:6379
C->>B: GET user:42 (retry on correct node)
B->>C: "alice"
Note over C: Caches slot 7638 → Node B
Future requests go directly to BHash tags: To ensure related keys land on the same slot (required for multi-key commands), use {tag}:
{user:42}:profile ─┐
{user:42}:sessions ─┤─ all hash to the same slot (CRC16 of "user:42")
{user:42}:cart ─┘
MGET {user:42}:profile {user:42}:sessions # valid — same slot
MGET user:profile:42 user:sessions:42 # may fail — different slotsHot key problem: A single slot cannot be distributed further. If one key (e.g., a viral product page) receives 100k req/s, it saturates one node. Mitigations: local in-process cache, read replicas with READONLY commands, key sharding with a suffix (product:99:{shard_0}, product:99:{shard_1}).
Use Cases and Patterns
Distributed Lock
Atomic acquisition with expiry prevents deadlocks if the lock holder crashes.
# Acquire: SET key value NX EX ttl_seconds
SET lock:order:42 "worker-uuid-abc" NX EX 30
# NX: only set if key does not exist
# EX: auto-expire after 30s even if worker crashes
# Release: compare-and-delete (Lua for atomicity)
if redis.call("GET", KEYS[1]) == ARGV[1] then
return redis.call("DEL", KEYS[1])
endWhy check value on release: Without the check, worker A could acquire the lock, take longer than TTL, the lock expires, worker B acquires it, then worker A releases worker B’s lock.
Redlock: For multi-node safety, acquire the lock on N/2+1 independent Redis instances within the TTL window. Controversial (Martin Kleppmann argues clock drift undermines the guarantee); in practice, single-node locks suffice for most systems if the worst case (losing the lock on node failure) is acceptable.
Rate Limiting
# Fixed window counter
INCR rate:user:42:2024010115 # key = user + time bucket (hour)
EXPIRE rate:user:42:2024010115 3600
# Sliding window: sorted set (see Sorted Sets above)Session Store
SETEX session:token_abc 3600 '{"user_id":42,"role":"admin"}'
# SETEX = SET + EXPIRE in one command
GET session:token_abc
DEL session:token_abc # logoutPub/Sub
# Subscriber
SUBSCRIBE notifications:user:42
# Publisher (different connection)
PUBLISH notifications:user:42 '{"type":"order_shipped","order":99}'Pub/Sub limitations: Fire-and-forget. A subscriber that disconnects misses all messages sent while offline. For guaranteed delivery, use Streams with consumer groups instead.
Cache-Aside Pattern
# Read
value = redis.GET(key)
if value is None:
value = db.query(...)
redis.SETEX(key, ttl, value)
return value
# Write (invalidate on update)
db.update(...)
redis.DEL(key) # invalidate; next read repopulates from DBCache stampede: When a hot key expires, many concurrent requests all miss and hit the DB simultaneously. Mitigations: staggered TTL with random jitter, probabilistic early recomputation, or locking the first miss (only one request fetches from DB; others wait).
Memory Management
Redis stores everything in RAM. When maxmemory is reached, Redis applies an eviction policy rather than returning errors (unless noeviction is set).
maxmemory 4gb
maxmemory-policy allkeys-lfu| Policy | What gets evicted | When to use |
|---|---|---|
noeviction | Nothing — return error | Persistent data store; cannot afford data loss |
allkeys-lru | Least recently used across all keys | General cache — evict cold data regardless of TTL |
volatile-lru | LRU from keys with TTL set | Mix of persistent and cached data; preserve no-TTL keys |
allkeys-lfu | Least frequently used across all keys | Workloads with long-lived hot keys that LRU would evict after a quiet period |
volatile-lfu | LFU from keys with TTL set | Same as volatile-lru but frequency-based |
allkeys-random | Random across all keys | Uniform access pattern (rarely the right choice) |
volatile-random | Random from keys with TTL | Rarely used |
volatile-ttl | Key with shortest remaining TTL | Evict soonest-to-expire keys first |
LRU vs LFU: LRU evicts the key not accessed for the longest time — a key accessed heavily for a week but quiet for 10 minutes could be evicted. LFU tracks access frequency over time and is more resistant to bursty patterns. Prefer allkeys-lfu for caches with skewed access distributions.
Redis memory overhead: Every key has ~50–80 bytes of overhead (pointer, encoding metadata, expiry). A million small keys adds ~50–80 MB of overhead before storing any values. Monitor INFO memory → mem_fragmentation_ratio; values above 1.5 indicate memory fragmentation that MEMORY PURGE or a restart can reclaim.