Vector Clocks: Knowing When Events Are Truly Concurrent

Series: System Design · Distributed Systems — Pillar 8 of 8
Systems Design
| # | Post | What it covers |
|---|---|---|
| 00 | Distributed Systems: What Happens When Machines Disagree | Twenty concepts covering network partitions, consensus, clocks, distributed transactions, CDC, erasure coding, and observability. The final pillar. |
| 01 | Network Partitions: The Failure Mode You Can't Design Away | Network partitions are inevitable. Learn what happens when nodes can't communicate, how systems choose between availability and consistency, and what that means in practice. |
| 02 | Split-Brain: When Two Nodes Both Think They're the Leader | Split-brain occurs when two nodes both believe they're the primary. Learn how it happens, why it causes data corruption, and how STONITH and fencing prevent it. |
| 03 | Heartbeats: How Nodes Know Their Peers Are Alive | Heartbeats let nodes detect peer failures. Learn how timeouts, phi accrual failure detectors, and the tradeoff between false positives and detection speed work. |
| 04 | Leader Election: Agreeing on Who's in Charge | Leader election coordinates which node acts as primary. Learn the bully algorithm, Raft-based election, and why exactly-one-leader guarantees are hard to achieve. |
| 05 | Consensus Algorithms: Agreeing on a Value Across Failures | Consensus lets distributed nodes agree on a value despite failures. Learn what FLP impossibility means, what Paxos and Raft provide, and where consensus is used. |
| 06 | Quorum: How Many Nodes Must Agree? | Quorum determines how many nodes must agree for an operation to succeed. Learn how R + W > N ensures consistency in distributed databases like Cassandra and DynamoDB. |
| 07 | Paxos: The Algorithm That Started It All | Paxos is the foundational distributed consensus algorithm. Learn how its two phases work, why it's hard to implement, and what systems use it in production. |
| 08 | Raft: Consensus Made Understandable | Raft makes distributed consensus understandable. Learn how leader election, log replication, and safety work in the algorithm that powers etcd, CockroachDB, and TiKV. |
| 09 | Gossip Protocol: Decentralised Cluster Communication | Gossip protocols propagate information across a cluster without a central coordinator. Learn how epidemic spreading works and where it's used in production. |
| 10 | Logical Clocks: When Physical Time Isn't Enough | Physical clocks drift and can't establish event order in distributed systems. Logical clocks track causality instead. Learn why this matters and how it works. |
| 11 | Lamport Timestamps: Ordering Events Without a Global Clock | Lamport timestamps assign logical counters to events to establish causal order in distributed systems. Learn how they work and what they can and can't tell you. |
| 12 | Vector Clocks: Knowing When Events Are Truly Concurrent ← you are here | Vector clocks detect causality and concurrency in distributed systems. Learn how they work, how Dynamo uses them for conflict detection, and their limitations. |
| 13 | Distributed Transactions: When One Machine Isn't Enough | Distributed transactions are hard. Learn why cross-service atomicity is expensive, when to use it, and when eventual consistency is the right alternative. |
| 14 | Two-Phase Commit: Coordinating a Distributed Decision | 2PC ensures distributed atomicity through prepare and commit phases. Learn how it works, the coordinator failure problem, and why it's rarely used in modern systems. |
| 15 | Three-Phase Commit: Solving 2PC's Blocking Problem | 3PC adds a pre-commit phase to eliminate 2PC's blocking problem. Learn how it works, what assumptions it requires, and why it's rarely used in production. |
| 16 | Delivery Semantics: What Does "Delivered" Actually Mean? | Message delivery guarantees define system reliability. Learn what at-most-once, at-least-once, and exactly-once mean, what they cost, and when each is appropriate. |
| 17 | Change Data Capture: Streaming Your Database in Real Time | CDC streams database changes in real time by reading the write-ahead log. Learn how Debezium works, what CDC enables, and when to use it. |
| 18 | Erasure Coding: Fault Tolerance Without Full Replication | Erasure coding stores data across nodes using math, not full replication. Learn how Reed-Solomon works, how S3 uses it, and when it beats 3x replication. |
| 19 | Merkle Trees: Efficiently Finding What's Different | Merkle trees efficiently detect which parts of a large dataset differ between nodes. Learn how Bitcoin, Cassandra, and Git use them for verification and anti-entropy. |
| 20 | Observability: Understanding Your System at Runtime | Logs, metrics, and distributed traces are how you understand a system at runtime. Learn what each pillar provides, the tools involved, and how they work together. |
| 21 | Distributed Systems: Wrap-Up | A recap of all 20 distributed systems concepts and the complete URL shortener architecture spanning all 8 pillars. The final post in the series. |
Vector Clocks: Knowing When Events Are Truly Concurrent
The problem
Lamport timestamps tell you: if A → B, then ts(A) < ts(B). What they don't tell you: if ts(A) < ts(B), does that mean A caused B, or were they concurrent?
Consider two users simultaneously editing the same link's destination URL on different servers. Lamport timestamps will order them (one will have a lower timestamp than the other), but this ordering is arbitrary — neither write caused the other. They're concurrent. Silently picking the "higher timestamp" write and discarding the other loses data without the application knowing a conflict occurred.
Vector clocks solve this: they can distinguish between "A happened-before B" and "A and B are concurrent and both happened independently." This distinction is the basis of conflict detection in distributed databases.
The core idea
A vector clock is a list of counters, one per process, that tracks each process's knowledge of every other process's logical time. Comparing two vector clocks tells you definitively: A happened-before B, B happened-before A, or A and B are concurrent.
The analogy: tracking who heard what from whom
Three gossips — Alice, Bob, and Carol — each keep a tally of how many things each person has told them:
- Alice's tally: [Alice: 3, Bob: 2, Carol: 1] — "I've made 3 statements, heard 2 from Bob, 1 from Carol"
- Bob's tally: [Alice: 2, Bob: 4, Carol: 2]
When Alice shares her tally with Bob, Bob updates each entry to the maximum of their two tallies. Bob now knows everything Alice knows.
If Bob's tally for a dimension is higher than Alice's for the same dimension, Bob has information Alice doesn't. If both are lower in some dimensions, their states are incomparable — they've heard different things and both have newer information in some respects. That's concurrency.
How vector clocks work
Each process i maintains a vector V of length N (one entry per process). V[j] = the number of events process i knows about from process j.
On a local event at process i: V[i] += 1
On sending a message from process i: V[i] += 1, send message with current V
On receiving a message at process i with vector W: V[j] = max(V[j], W[j]) for all j, then V[i] += 1
Comparing vector clocks
Vector clock A happened-before B (A → B) iff: A[i] ≤ B[i] for all i, and A[j] < B[j] for at least one j.
Vectors A and B are concurrent (A ‖ B) iff: neither A → B nor B → A. This means A has a higher counter in some dimension and B has a higher counter in another.
Process A Process B Process C
V_A=[0,0,0] V_B=[0,0,0] V_C=[0,0,0]
A has event: V_A=[1,0,0]
A sends to B with V=[1,0,0]
B receives: V_B = max([0,0,0],[1,0,0]) + B++ = [1,1,0]
B has event: V_B=[1,2,0]
A has event (concurrent, no communication): V_A=[2,0,0]
Now compare V_A=[2,0,0] and V_B=[1,2,0]:
V_A[0]=2 > V_B[0]=1 → A has something B doesn't
V_A[1]=0 < V_B[1]=2 → B has something A doesn't
→ CONCURRENT: neither happened-before the other ✓
Application: Dynamo and conflict detection
Amazon Dynamo (and Riak, which is an open-source Dynamo implementation) uses version vectors (a variant of vector clocks) for conflict detection on writes.
When a client reads a key, Dynamo returns the value along with its version vector (a "context"). When the client writes back, it includes this context. Dynamo uses the context to determine if the write is a successor to the current value (no conflict) or concurrent with it (conflict).
Initial: key="x7Kp2", value="https://old.com", VC=[A:1, B:0]
Client 1 reads (gets VC=[A:1, B:0]), updates to "https://v2.com"
Writes with context VC=[A:1, B:0] → stored on Server A
Server A: value="https://v2.com", VC=[A:2, B:0]
Client 2 (concurrently, read the old value) updates to "https://v3.com"
Writes with context VC=[A:1, B:0] → stored on Server B
Server B: value="https://v3.com", VC=[A:1, B:1]
Reconciliation:
[A:2, B:0] vs [A:1, B:1]: concurrent (A has A:2 > B:0, B has B:1 > A:0)
→ CONFLICT: surface both values to the application for resolution
In Dynamo, conflicting versions are returned to the next reader as a list ("siblings"). The application (or the client library) resolves the conflict and writes back a merged version.
The vector clock size problem
A vector clock has one entry per process. With 100 processes, each vector clock is 100 entries. At scale, this becomes expensive to store and transmit.
Dynamo addresses this with dotted version vectors (a more compact representation) or version vectors (track client IDs rather than server IDs). Riak and newer Dynamo implementations use these to avoid unbounded clock growth.
Vector clocks vs Lamport timestamps
| Lamport Timestamps | Vector Clocks | |
|---|---|---|
| Detects A → B | Yes | Yes |
| Detects concurrency | No | Yes |
| Size | 1 integer | N integers (one per process) |
| Use case | Total order, simple causality tracking | Conflict detection, concurrent write detection |
The one thing to remember
Vector clocks track per-process knowledge so that comparing two events tells you definitively whether one happened-before the other or they're concurrent. A happened-before B means A's every counter ≤ B's corresponding counter (and at least one is strictly less). Concurrent means each has a higher counter in at least one dimension — neither subsumes the other. This is the mechanism behind conflict detection in Dynamo and Riak: concurrent writes are surfaced to the application for resolution, not silently discarded.
← Previous: Lamport Timestamps — the simplest logical clock: assigning monotonic integers to events to capture causal order.
→ Next: Distributed Transactions — ensuring atomicity across multiple nodes or services when a single ACID transaction isn't possible.




