Replication Implementations
A leader-based replication system needs a mechanism to get data changes from the leader to the followers. The choice of mechanism affects what you can do with the replicated data — version compatibility, cross-engine replication, change data capture, and zero-downtime upgrades all depend on it.
Statement-Based Replication
The leader logs every write statement (INSERT, UPDATE, DELETE) and sends them to each follower. The follower parses and executes each statement as if it received it from a client.
sequenceDiagram
participant L as Leader
participant F as Follower
L->>L: Execute INSERT INTO orders VALUES (...)
L->>F: Forward statement text
F->>F: Parse and execute same statement
Note over F: Result must match leader — but might notWhere it breaks:
| Problem | Example | Why it fails |
|---|---|---|
| Nondeterministic functions | NOW(), RAND(), UUID() | Each replica evaluates independently → different values |
| Auto-increment columns | INSERT with AUTO_INCREMENT | Followers must execute in the exact same order as the leader |
| Side effects | Triggers, stored procedures, UDFs | May depend on local state that differs across replicas |
Workaround: The leader can replace nondeterministic function calls with fixed return values in the statement log (MySQL did this in early versions).
Verdict: Fragile. Largely abandoned in favor of row-based replication. MySQL used statement-based replication by default before version 5.7.7.
Write-Ahead Log (WAL) Shipping
The leader’s WAL — the sequential log of every byte-level change to data pages — is streamed directly to the follower. The follower replays the same page-level changes to reconstruct an identical on-disk state.
sequenceDiagram
participant L as Leader
participant W as WAL (disk)
participant F as Follower
L->>W: Write WAL record (page 42, offset 128, old→new bytes)
L->>F: Stream WAL record
F->>F: Apply byte-level change to local page 42
Note over L,F: Physical replication — follower is byte-identical to leaderPostgreSQL streaming replication uses this approach. The standby connects to the primary and continuously receives WAL records:
# postgresql.conf on primary
wal_level = replica
max_wal_senders = 5
# recovery.conf on standby (or standby.signal in PG 12+)
primary_conninfo = 'host=primary-host port=5432'Limitation: Storage Engine Coupling
WAL records describe which bytes changed in which disk blocks. This ties replication to the exact storage format — the leader and follower must run:
- The same storage engine
- The same database version (or a compatible one)
This makes zero-downtime rolling upgrades difficult: you cannot run the leader on v15 and a follower on v16 if the on-disk page format changed between versions. The standard approach is to upgrade the follower first, promote it, then upgrade the old leader — but this requires the WAL format to be backward-compatible between those two versions.
Logical (Row-Based) Log Replication
Instead of shipping physical byte changes, the leader writes a logical log — a description of what changed at the row level, decoupled from the storage engine format.
Log Content
| Operation | What the log records |
|---|---|
INSERT | New values of all columns |
DELETE | Enough to identify the row (primary key, or all column values if no PK) |
UPDATE | Row identifier + new values of changed columns |
COMMIT | Marker indicating a transaction’s set of changes is complete |
sequenceDiagram
participant L as Leader
participant Log as Logical Log
participant F as Follower
participant CDC as External Consumer
L->>Log: Row-level change record (INSERT, id=7, name='Alice', ...)
Log->>F: Replicate via logical decoding
Log->>CDC: Stream to Kafka / data warehouse
Note over F,CDC: Same log serves replication AND change data captureMySQL binlog (in ROW format, default since 5.7.7) and PostgreSQL logical replication (via pgoutput plugin) both implement this approach:
-- PostgreSQL: create a logical replication publication
CREATE PUBLICATION my_pub FOR TABLE orders, customers;
-- On the subscriber
CREATE SUBSCRIPTION my_sub
CONNECTION 'host=primary-host dbname=mydb'
PUBLICATION my_pub;-- MySQL: verify binlog format
SHOW VARIABLES LIKE 'binlog_format'; -- should return ROWWhy This Is the Standard Today
| Advantage | Why it matters |
|---|---|
| Version independence | Leader and follower can run different DB versions — the log format is stable across releases |
| Cross-engine replication | PostgreSQL → MySQL is possible via logical decoding |
| Change data capture (CDC) | External systems (Kafka, data warehouses) consume the same log stream — this is the foundation of the Outbox Pattern |
| Selective replication | Replicate a subset of tables or columns |
| Human-readable | Row-level changes are easier to debug than byte-level diffs |
Trigger-Based Replication
When replication requirements go beyond what the database engine provides — replicating a subset of data, moving between different database types, or applying custom conflict resolution — the replication logic moves to the application layer.
Mechanisms:
| Tool | How it works |
|---|---|
| Triggers + stored procedures | A BEFORE/AFTER trigger fires on every write, logging the change to a separate audit table. An external process reads that table and applies changes to the target. |
| Log-reading tools | Oracle GoldenGate, Debezium, Maxwell read the database’s transaction log and emit change events to an external system (Kafka, another database). |
-- Example: trigger-based change capture
CREATE TABLE orders_changelog (
id SERIAL PRIMARY KEY,
order_id INT,
operation VARCHAR(10),
changed_at TIMESTAMP DEFAULT NOW(),
new_data JSONB
);
CREATE OR REPLACE FUNCTION capture_order_change() RETURNS TRIGGER AS $$
BEGIN
INSERT INTO orders_changelog (order_id, operation, new_data)
VALUES (NEW.id, TG_OP, row_to_json(NEW)::jsonb);
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER orders_audit
AFTER INSERT OR UPDATE ON orders
FOR EACH ROW EXECUTE FUNCTION capture_order_change();Trade-off: higher overhead per write (trigger execution + changelog table insert) and more moving parts to maintain. Use this only when built-in replication cannot satisfy the requirement.
Comparison
| Method | Coupling | Version Compat | CDC Support | Performance | Status |
|---|---|---|---|---|---|
| Statement-based | Low | Good | Poor | Good | Legacy — largely replaced |
| WAL shipping | High (byte-level) | Poor (same version) | None | Best (no transformation) | Standard for physical standbys |
| Logical (row-based) | Low | Good | Excellent | Good (slight overhead) | Standard for most use cases |
| Trigger-based | Low | Good | Custom | Worst (trigger overhead) | Niche — custom requirements only |
Interview tip: When discussing database replication, say: “I’d use logical replication — it decouples the replication format from the storage engine, supports cross-version upgrades, and doubles as a CDC stream. For a hot standby where byte-identical state and minimal lag matter (like a failover target), WAL shipping is better because it skips the logical decoding overhead.” This shows you understand the tradeoff, not just the mechanism.
Test Your Understanding
You upgrade PostgreSQL from v15 to v16. Physical (WAL) replication stops working between the old primary and new-version replica. Why?
WAL format is version-specific. Physical replication ships raw WAL bytes (page-level changes), which depend on the internal storage format. Different major versions have different page layouts, WAL record formats, and internal data structures. A v16 replica can’t interpret v15 WAL records.
Logical replication works cross-version because it ships logical changes (INSERT row X, UPDATE row Y) decoded from the WAL, independent of the internal storage format. This is why logical replication is the standard tool for zero-downtime major version upgrades: set up logical replication from v15 to v16, let it catch up, then failover.
You use logical replication to feed Elasticsearch from PostgreSQL. A large UPDATE touching 1 million rows runs on the primary. The replica (and ES) lag spikes to 30 minutes. Why?
Logical decoding overhead. Each of the 1 million row updates must be: (1) decoded from the WAL into a logical change event, (2) serialized to the replication protocol, (3) sent to the subscriber, and (4) applied one at a time on the subscriber.
Physical replication would ship the WAL bytes directly with minimal processing. Logical decoding adds CPU overhead per row and serialization cost. The subscriber also applies changes sequentially by default (single-threaded apply), creating a bottleneck.
Fixes: Batch large updates into smaller transactions. Use parallel apply workers (PostgreSQL 16+ supports parallel logical replication apply). Or for the ES use case, use CDC (Debezium) which is optimized for high-throughput change streaming.