Unique ID Generation

Every row in your database needs an ID. In a single-server world, AUTO_INCREMENT works perfectly — the database hands out 1, 2, 3, 4, … and guarantees uniqueness. Now your system is distributed across 20 database shards, 50 application servers, and 3 regions. There is no single counter to increment. Two servers inserting simultaneously can’t coordinate fast enough to avoid collisions without a bottleneck. You need a scheme that generates globally unique IDs across all nodes, ideally without any coordination — and depending on your access patterns, those IDs may also need to be sortable by time.

Why Auto-Increment Breaks in Distributed Systems

Single database:
  Server A inserts → DB returns id=1001
  Server B inserts → DB returns id=1002
  ✓ Guaranteed unique, sequential, fast

Sharded database (3 shards):
  Server A → Shard 1 → id=1001
  Server B → Shard 2 → id=1001  ← collision!

  "Solution": each shard uses a different step
    Shard 1: 1, 4, 7, 10, ...
    Shard 2: 2, 5, 8, 11, ...
    Shard 3: 3, 6, 9, 12, ...
  Problem: adding Shard 4 requires re-numbering all shards

The fundamental tension: uniqueness requires coordination, but coordination limits throughput and introduces a single point of failure. Each ID generation scheme makes a different trade-off along this axis.

UUID v4: Pure Randomness

A UUID (Universally Unique Identifier) version 4 is 128 bits of randomness, formatted as a 36-character hex string with dashes.

Format: 550e8400-e29b-41d4-a716-446655440000
        xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx

        4 = version marker (v4)
        y = variant marker (8, 9, a, or b)

Effective randomness: 122 bits (6 bits reserved for version + variant)
Possible values: 2¹²² ≈ 5.3 × 10³⁶

import uuid

# Generate a UUID v4 — no coordination needed
id = uuid.uuid4()
print(id)   # e.g., "a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d"
print(len(str(id)))  # 36 characters

Collision probability: With 122 random bits, you’d need to generate ~2.7 × 10¹⁸ UUIDs (2.7 quintillion) before having a 50% chance of a single collision (birthday paradox). In practice, collisions are not a concern.

The B-Tree Index Problem

UUIDs are random — they have no ordering relationship with time or insertion order. When used as a primary key in a database with a B-tree index, this causes page splits and fragmentation:

B-tree with sequential IDs (auto-increment):
  New rows always append to the rightmost leaf page
  → sequential I/O, high page fill factor, efficient

  [1,2,3,4,5] [6,7,8,9,10] [11,12,13,14,15] [16,17,← next insert]

B-tree with random UUIDs:
  New rows insert into random leaf pages across the tree
  → random I/O, frequent page splits, ~50% page fill factor

  [a1b.., c3d.., f7e..] ← insert "b2c.." here → page split!

Impact: Random UUIDs as primary keys can cause 2–5× slower write performance on large tables compared to sequential keys, due to cache misses and page splits. Read performance also suffers because related rows (e.g., recently inserted) are scattered across the index rather than being co-located.

Property	UUID v4
Size	128 bits (36 chars as string, 16 bytes binary)
Coordination	None — generate locally
Sortable by time	No
Index performance	Poor — random inserts cause B-tree fragmentation
Collision risk	Negligible (122 bits of randomness)
Human readability	Low — long hex string

Snowflake ID: Twitter’s Solution

Twitter needed IDs that were unique, sortable by time, fit in a 64-bit integer (for efficient storage and indexing), and could be generated without central coordination at thousands of IDs per second. The result was Snowflake.

Bit Layout

64-bit Snowflake ID:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
├─┼───────────────────────────────────────────┼─────────┼───────────┤
│0│         41-bit timestamp (ms)             │ 10-bit  │  12-bit   │
│ │      (milliseconds since epoch)           │machine  │ sequence  │
│ │                                           │   ID    │  number   │
└─┴───────────────────────────────────────────┴─────────┴───────────┘

Bit 0:      Unused sign bit (always 0 → positive integer)
Bits 1–41:  Milliseconds since custom epoch (not Unix epoch)
            2⁴¹ ms ≈ 69 years from custom epoch
Bits 42–51: Machine/worker ID (0–1023 → up to 1024 workers)
Bits 52–63: Sequence number (0–4095 → up to 4096 IDs per ms per worker)

import time
import threading

class SnowflakeGenerator:
    """Twitter-style 64-bit unique ID generator."""

    # Custom epoch: Jan 1, 2020 00:00:00 UTC
    EPOCH = 1577836800000

    MACHINE_ID_BITS = 10
    SEQUENCE_BITS = 12

    MAX_MACHINE_ID = (1 << MACHINE_ID_BITS) - 1   # 1023
    MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1         # 4095

    MACHINE_ID_SHIFT = SEQUENCE_BITS                # 12
    TIMESTAMP_SHIFT = SEQUENCE_BITS + MACHINE_ID_BITS  # 22

    def __init__(self, machine_id: int):
        if machine_id < 0 or machine_id > self.MAX_MACHINE_ID:
            raise ValueError(f"Machine ID must be 0–{self.MAX_MACHINE_ID}")
        self.machine_id = machine_id
        self.sequence = 0
        self.last_timestamp = -1
        self.lock = threading.Lock()

    def generate(self) -> int:
        with self.lock:
            timestamp = self._current_time()

            if timestamp == self.last_timestamp:
                # Same millisecond — increment sequence
                self.sequence = (self.sequence + 1) & self.MAX_SEQUENCE
                if self.sequence == 0:
                    # Sequence exhausted — wait for next millisecond
                    timestamp = self._wait_next_ms(timestamp)
            elif timestamp > self.last_timestamp:
                self.sequence = 0
            else:
                # Clock moved backward — refuse to generate
                raise ClockMovedBackwardError(
                    f"Clock moved backward by "
                    f"{self.last_timestamp - timestamp}ms"
                )

            self.last_timestamp = timestamp

            return (
                ((timestamp - self.EPOCH) << self.TIMESTAMP_SHIFT)
                | (self.machine_id << self.MACHINE_ID_SHIFT)
                | self.sequence
            )

    def _current_time(self) -> int:
        return int(time.time() * 1000)

    def _wait_next_ms(self, last_ts: int) -> int:
        ts = self._current_time()
        while ts <= last_ts:
            ts = self._current_time()
        return ts


class ClockMovedBackwardError(Exception):
    pass

Extracting Timestamp from a Snowflake ID

Because the timestamp is embedded in the most significant bits, you can extract the creation time from any Snowflake ID:

def extract_timestamp(snowflake_id: int, epoch=1577836800000) -> float:
    """Extract Unix timestamp from a Snowflake ID."""
    timestamp_ms = (snowflake_id >> 22) + epoch
    return timestamp_ms / 1000

# Example:
gen = SnowflakeGenerator(machine_id=42)
sid = gen.generate()
print(f"ID: {sid}")
print(f"Created at: {extract_timestamp(sid)}")
print(f"Machine: {(sid >> 12) & 0x3FF}")
print(f"Sequence: {sid & 0xFFF}")

The Clock Skew Problem

Snowflake depends on a monotonically increasing clock. If the system clock jumps backward (NTP correction, VM migration, leap second), the generator would produce IDs with timestamps earlier than previously generated IDs — violating the sort order guarantee.

t=1000ms: generate ID → timestamp=1000, sequence=0
t=1001ms: generate ID → timestamp=1001, sequence=0
         NTP correction: clock jumps back to 999ms
t=999ms:  generate ID → timestamp=999, sequence=0
          ← This ID sorts BEFORE the t=1001 ID but was created AFTER it

Mitigations:

Refuse to generate: Throw an error if clock moves backward (the implementation above does this)
Wait it out: If the backward jump is small (< 5ms), spin-wait until the clock catches up
Use monotonic clock source: Linux CLOCK_MONOTONIC never goes backward (but isn’t wall-clock time)

Property	Snowflake
Size	64 bits (fits in a `BIGINT`)
Coordination	Machine ID assignment only (one-time setup)
Sortable by time	Yes — millisecond precision, IDs sort chronologically
Index performance	Excellent — roughly sequential inserts, B-tree friendly
Throughput	4,096 IDs/ms per machine = ~4M IDs/s per machine
Clock dependency	Critical — clock skew can violate ordering

ULID: Sortable + Random

ULID (Universally Unique Lexicographically Sortable Identifier) combines a timestamp with randomness in 128 bits, encoded as a 26-character URL-safe Base32 string.

ULID structure (128 bits):

 ├────── 48 bits ──────┼────────── 80 bits ──────────┤
 │    Timestamp (ms)   │        Randomness            │
 │  Unix epoch, ms     │   Cryptographically random   │
 └─────────────────────┴──────────────────────────────┘

Encoded: 01ARZ3NDEKTSV4RRFFQ69G5FAV
         ├──────────┼────────────────┤
         10 chars   16 chars
         timestamp  randomness

Crockford Base32: 0-9 A-Z excluding I L O U (avoids ambiguity)

import os
import time

# Crockford Base32 encoding table
ENCODING = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

def generate_ulid() -> str:
    """Generate a ULID: 48-bit timestamp + 80-bit random."""
    # Timestamp: milliseconds since Unix epoch
    timestamp_ms = int(time.time() * 1000)

    # Encode 48-bit timestamp as 10 Base32 characters
    ts_chars = []
    for _ in range(10):
        ts_chars.append(ENCODING[timestamp_ms & 0x1F])
        timestamp_ms >>= 5
    ts_part = "".join(reversed(ts_chars))

    # 80 bits of randomness as 16 Base32 characters
    random_bytes = os.urandom(10)  # 10 bytes = 80 bits
    random_int = int.from_bytes(random_bytes, "big")
    rand_chars = []
    for _ in range(16):
        rand_chars.append(ENCODING[random_int & 0x1F])
        random_int >>= 5
    rand_part = "".join(reversed(rand_chars))

    return ts_part + rand_part

# Example:
ulid = generate_ulid()
print(ulid)       # e.g., "01HZXK5P9RQCV3Z8KNTDMJ5WGE"
print(len(ulid))  # 26 characters

Why ULID Over UUID?

UUIDs sorted:                        ULIDs sorted:
  0a1b2c3d-...  (random)              01ARZ3ND-... (2024-01-15 10:00:00)
  9f8e7d6c-...  (random)              01ARZ3NF-... (2024-01-15 10:00:01)
  3c4d5e6f-...  (random)              01ARZ3NG-... (2024-01-15 10:00:02)
  ← no meaningful order                ← chronological order ✓

SELECT * FROM orders ORDER BY id DESC LIMIT 10;
  UUID:  returns 10 random orders — useless
  ULID:  returns 10 most recent orders — useful

ULIDs give you the zero-coordination property of UUIDs (just generate locally with randomness) plus the sort-order property of Snowflake (chronological ordering). The trade-off: they’re 128 bits instead of 64.

Monotonicity within the same millisecond: The ULID spec recommends that if multiple ULIDs are generated in the same millisecond, the random component should be incremented rather than regenerated — this preserves sort order within a millisecond.

Property	ULID
Size	128 bits (26 chars Base32, 16 bytes binary)
Coordination	None — generate locally
Sortable by time	Yes — millisecond precision, lexicographic sort = chronological sort
Index performance	Good — roughly sequential like Snowflake, with slight randomness
Collision risk	Within same ms: 2⁸⁰ random space ≈ negligible
Clock dependency	Moderate — not sortable across clock-skewed machines

Database Sequences

The simplest approach: let the database hand out IDs.

-- PostgreSQL sequence
CREATE SEQUENCE order_id_seq START 1 INCREMENT 1;

-- Use in a table
CREATE TABLE orders (
    id BIGINT DEFAULT nextval('order_id_seq') PRIMARY KEY,
    ...
);

-- Or explicitly
INSERT INTO orders (id, ...) VALUES (nextval('order_id_seq'), ...);

Advantages: Strongly ordered, gap-free (mostly), no application logic needed, works perfectly for single-database systems.

Problems at scale:

Single sequence, 3 app servers:

  Server A: nextval() → 1001  ← round-trip to DB
  Server B: nextval() → 1002  ← round-trip to DB
  Server C: nextval() → 1003  ← round-trip to DB

  Every ID requires a DB round-trip → the sequence is a bottleneck
  At 10K inserts/s, that's 10K round-trips/s just for IDs

Sequence with Block Allocation

To reduce round-trips, each application server pre-allocates a block of IDs:

class BlockAllocator:
    """Pre-allocate ID blocks from a central sequence."""

    def __init__(self, db, block_size=1000):
        self.db = db
        self.block_size = block_size
        self.current = 0
        self.ceiling = 0

    async def next_id(self) -> int:
        if self.current >= self.ceiling:
            await self._allocate_block()
        id = self.current
        self.current += 1
        return id

    async def _allocate_block(self):
        """Atomically reserve a block of IDs by bumping a counter row.

        Backed by a one-row-per-namespace table:
            CREATE TABLE id_blocks (
                name        TEXT PRIMARY KEY,
                next_start  BIGINT NOT NULL DEFAULT 1
            );

        The UPDATE acquires a row lock; concurrent allocators serialize
        on it and each gets a distinct, contiguous block.
        """
        row = await self.db.query_one(
            """
            UPDATE id_blocks
               SET next_start = next_start + %s
             WHERE name = %s
            RETURNING next_start - %s AS block_start
            """,
            (self.block_size, "order_ids", self.block_size),
        )
        self.current = row["block_start"]
        self.ceiling = self.current + self.block_size

Now each server only contacts the DB once per 1,000 IDs. The trade-off: IDs are no longer gap-free (if a server crashes with 800 unused IDs in its block, those IDs are lost) and IDs from different servers interleave rather than being strictly sequential.

Property	DB Sequence
Size	64 bits (`BIGINT`)
Coordination	Full — centralized sequence
Sortable by time	Approximately (within a single DB)
Index performance	Excellent — sequential
Throughput	Limited by DB round-trips (block allocation helps)
Single point of failure	Yes — the sequence DB

Choosing the Right Scheme

flowchart TB
    Start([Need unique IDs]) --> Q1{Must fit in
64-bit integer?}

    Q1 -->|Yes| Q2{Need time-based
sort order?}
    Q1 -->|No| Q3{Need time-based
sort order?}

    Q2 -->|Yes| Snow[Snowflake ID
64-bit, sortable, coordinated machine IDs]
    Q2 -->|No| Seq[DB Sequence
64-bit, sequential, centralized]

    Q3 -->|Yes| ULID[ULID
128-bit, sortable, zero coordination]
    Q3 -->|No| UUID[UUID v4
128-bit, random, zero coordination]

    style Snow fill:#bfb
    style ULID fill:#bfb
    style UUID fill:#fbb
    style Seq fill:#ffb

Head-to-Head Comparison

Property	UUID v4	Snowflake	ULID	DB Sequence
Size	128 bit (36 char)	64 bit (19 digits)	128 bit (26 char)	64 bit (up to 19 digits)
Coordination	None	Machine ID registry	None	Centralized DB
Time-sortable	No	Yes (ms)	Yes (ms)	Approximately
B-tree friendly	No (random)	Yes (sequential)	Yes (mostly sequential)	Yes (sequential)
Throughput/node	Unlimited	4,096/ms	Unlimited	DB-limited
Clock dependency	None	Critical	Moderate	None
Uniqueness guarantee	Statistical (122-bit random)	Structural (machine + sequence)	Statistical (80-bit random/ms)	Structural (single counter)
Cross-region safe	Yes	Yes (with unique machine IDs)	Yes	No (single DB)
URL friendly	No (dashes, hex)	Yes (plain integer)	Yes (Base32)	Yes (plain integer)

When to Use Each

Scheme	Best for	Avoid when
UUID v4	Simple systems, no sort requirement, distributed with no coordination budget	Primary key on write-heavy tables (B-tree fragmentation)
Snowflake	High-throughput systems needing 64-bit sortable IDs (Twitter, Discord, Instagram)	You can’t manage machine ID assignment; clock skew is frequent
ULID	Need UUID-like simplicity + sort order; Snowflake’s 64-bit limit is too tight	You need IDs to fit in a `BIGINT` column
DB Sequence	Single-region, single-database, simple applications	Multi-region, sharded databases, high write throughput

What Real Systems Use

System	ID scheme	Why
Twitter	Snowflake	64-bit, sortable, fits in JavaScript `Number.MAX_SAFE_INTEGER` (53-bit) with custom epoch
Discord	Snowflake variant	64-bit, sortable by time, used for messages + channels + users
Instagram	Snowflake variant	41-bit timestamp + 13-bit shard ID + 10-bit sequence
MongoDB	ObjectId (96-bit)	32-bit timestamp + 40-bit random + 24-bit counter — sortable, no coordination
Cassandra	TimeUUID (v1)	Timestamp-based UUID — sortable, used as clustering key
Stripe	Prefixed random	`ch_1a2b3c...` — prefix indicates type, body is random; not sortable

⚠️

JavaScript integer precision matters. JavaScript Number is a 64-bit IEEE 754 float, which only has 53 bits of integer precision. A full 64-bit Snowflake ID exceeds Number.MAX_SAFE_INTEGER (2⁵³ - 1 = 9,007,199,254,740,991). Solutions: serialize IDs as strings in JSON responses, or use a custom epoch that keeps timestamps within 53 bits for the foreseeable future. Twitter and Discord both serialize Snowflake IDs as strings in their APIs.

ℹ️

Interview tip: When asked about ID generation, say: “For a distributed system needing sortable IDs, I’d use Snowflake: 41-bit millisecond timestamp, 10-bit machine ID, 12-bit sequence — gives me 4,096 IDs per millisecond per machine in a 64-bit integer that sorts chronologically and is B-tree friendly. Each machine generates IDs independently with no coordination beyond the one-time machine ID assignment. The main risk is clock skew — if the clock moves backward, I’d refuse to generate IDs until it catches up. If I don’t need 64-bit compactness, ULID is simpler — 48-bit timestamp plus 80 bits of randomness, no machine ID management needed, and it’s lexicographically sortable. I’d avoid UUID v4 as a primary key on write-heavy tables because random inserts cause B-tree page splits.”

Test Your Understanding

Your Snowflake generator throws ClockMovedBackwardError during a routine NTP sync that adjusts the clock back by 200ms. Your service is generating 2,000 IDs/second and crashes. Is refusing to generate IDs the right behavior?

For 200ms, no — waiting is better than crashing. The implementation should distinguish small skew (< 5ms: spin-wait) from large skew (> 5s: refuse and alert). A 200ms backward jump means the generator should spin-wait 200ms, then resume. This costs at most 200ms × 2000 IDs/s = 400 IDs delayed — trivial.

Why refusing is dangerous: If you throw an exception, every request that needs an ID fails. A 200ms NTP correction is routine and expected. The real danger is large clock jumps (minutes), which indicate VM migration or hardware issues. Implement a tiered response: wait for small skew, refuse for large skew, and always log/alert.

Your team wants to migrate from UUID v4 to UUID v7 (time-ordered) for primary keys. UUID v7 is 128-bit like v4 but has a 48-bit timestamp prefix. A colleague says ‘we get Snowflake benefits with no machine ID coordination.’ What’s the catch?

UUID v7 solves the B-tree fragmentation problem (timestamp prefix makes inserts roughly sequential), but there are two catches:

Still 128 bits. Twice the size of Snowflake’s 64 bits. Indexes are larger, joins are slower, and you can’t fit them in a JavaScript Number (but you couldn’t with v4 either).
No structural uniqueness guarantee within a millisecond. Two servers generating UUIDs in the same millisecond rely on 74 bits of randomness for uniqueness — statistically safe but not structurally guaranteed like Snowflake’s machine_id + sequence.

UUID v7 is the right choice when you want drop-in UUID compatibility with sorted inserts and can’t manage machine IDs. Snowflake is better when you need 64-bit compactness or structural (not statistical) uniqueness.

Twitter chose 64-bit Snowflake IDs. Why not 128-bit? They had the engineering talent to handle either.

Three concrete reasons for 64-bit:

JavaScript Number.MAX_SAFE_INTEGER is 2⁵³. Even 64-bit IDs require string serialization in JSON. 128-bit would be worse — no native integer type in most languages handles 128 bits efficiently.
Storage and index efficiency. Twitter stores billions of tweets. A BIGINT primary key is 8 bytes; a UUID is 16 bytes. That doubles index memory for every table and secondary index. At Twitter’s scale, this is terabytes of extra RAM.
Throughput. 4,096 IDs/ms per machine was sufficient for Twitter’s tweet volume. They didn’t need the larger address space of 128 bits.

The constraint drove a better design. 64-bit forced a disciplined bit layout (timestamp + machine + sequence) that gives you extractable metadata from the ID itself — something UUIDs and ULIDs don’t provide as cleanly.

Distributed Job Scheduling