Event Sourcing: The Ledger, Not the Balance

Series: System Design · Architecture Patterns — Pillar 7 of 8

# | Post | What it covers
00 | Architecture Patterns: How Systems Are Structured | Twenty patterns covering monoliths, microservices, events, resilience, deployment, and data processing. How to structure systems that survive growth.
01 | Monolithic Architecture: The Default That Gets Abandoned Too Early | Monoliths are fast to build and easy to operate. Learn when they're the right choice, when they break down, and how to know the difference.
02 | Microservices: The Architecture You Earn, Not Choose | Microservices enable independent scaling and team autonomy, but at significant cost. Learn what you actually get, what you pay, and when it's worth it.
03 | Serverless: Pay for What You Use, Not What You Provision | Serverless scales to zero and charges per invocation. Learn where it shines, where it fails, and how to design around cold starts and vendor lock-in.
04 | Event-Driven Architecture: Decoupling Through Events | Event-driven systems communicate via events rather than direct calls. Learn how producers, consumers, and event brokers work, and the consistency tradeoffs involved.
05 | Message Queues: Decoupling Produce from Consume | Message queues decouple producers and consumers, enable load levelling, and provide durability. Learn how they work and when to use Kafka vs SQS vs RabbitMQ.
06 | Pub/Sub: Broadcasting Events to Multiple Consumers | Pub/sub decouples publishers from subscribers through topics. Learn how it differs from message queues and when to use Kafka, SNS, or Google Pub/Sub.
07 | CQRS: When Reads and Writes Need Different Models | CQRS separates writes from reads so each can be optimised independently. Learn how it works, when it's worth the complexity, and when it isn't.
08 | Event Sourcing: The Ledger, Not the Balance ← you are here | Event sourcing stores state as a sequence of events. Learn how it works, what you get (audit log, time travel), and what it costs (complexity, schema evolution).
09 | The Saga Pattern: Distributed Transactions Without Locks | The Saga pattern manages distributed transactions across services using compensating transactions. Learn choreography vs orchestration and when to use each.
10 | The Outbox Pattern: Atomic Writes and Event Publishing | The Outbox pattern solves the dual-write problem: publishing an event and writing to a database atomically. Learn how it works using CDC or polling.
11 | The Circuit Breaker: Stopping Cascading Failures | Circuit breakers prevent cascading failures by fast-failing calls to unhealthy dependencies. Learn the three states, how to configure them, and where to apply them.
12 | The Bulkhead Pattern: Containing Failures Through Resource Isolation | Bulkheads isolate thread pools and connections per dependency so one failure can't exhaust resources needed by others. Learn how to apply them in practice.
13 | The Sidecar Pattern: Cross-Cutting Concerns Without Code Changes | The sidecar pattern deploys a helper process alongside each service for logging, metrics, TLS, and service discovery, without modifying the service itself.
14 | Service Mesh: A Programmable Network for Microservices | A service mesh handles service-to-service traffic, mTLS, circuit breaking, and observability via a fleet of sidecar proxies. Learn how it works and when to use it.
15 | Service Discovery: Finding Services in a Dynamic Environment | Service discovery lets services find each other in dynamic environments. Learn client-side vs server-side discovery, health checks, and DNS vs registry approaches.
16 | The Strangler Fig: Replacing a Legacy System Without Burning It Down | The Strangler Fig replaces a legacy system incrementally by routing specific functionality to new implementations while the old system keeps running.
17 | Backend for Frontend: One API Per Client Type | BFF creates dedicated API backends per client type. Learn why one general API struggles to serve mobile and web well, and how BFF solves it.
18 | ETL Pipelines: Moving Data from Operations to Analytics | ETL moves data from operational systems into analytical stores. Learn how pipelines work, what ELT is, and how to design reliable data movement at scale.
19 | Batch vs Stream Processing: How Fresh Do Your Answers Need to Be? | Batch processing accumulates data and then processes it in bulk; streaming processes each event as it arrives. Learn the tradeoffs and when each is right.
20 | MapReduce: Processing Petabytes in Parallel | MapReduce processes massive datasets in parallel by splitting work into map and reduce phases. Learn how it works and why Spark has largely replaced it.
21 | Architecture Patterns: Wrap-Up | A recap of all 20 architecture patterns across decomposition, async communication, data patterns, resilience, and data processing. How they connect.

The problem

Your URL shortener's links table has a destination_url column. When a user updates it, you run UPDATE links SET destination_url = 'new_url' WHERE id = 'x7Kp2'. The old value is gone. The database contains the current state; history is discarded.

A user calls support: "I changed my link's destination last Tuesday, but clicks from last month are being reported at the wrong URL. What URL was my link pointing to three weeks ago?" You can't answer — the database only knows what the link points to now.

A second scenario: a botched migration script corrupts destination URLs for a thousand links. You can restore from last night's backup, but you'll lose a day of legitimate changes. What you need is a selective rollback to the state before the migration ran: undo only the corrupted records without touching anything else.

Both scenarios require knowing the history of changes, not just the current state.


The core idea

In event sourcing, the application doesn't store the current state of an entity — it stores the sequence of events that produced that state. Current state is derived by replaying the event log from the beginning (or from a snapshot). The event log is the primary source of truth; the database table is a derived projection.


The analogy: a financial ledger

A bank account has a balance: $3,340.00. A conventional database stores this number and updates it on every transaction. History is implicit in transaction records (if you're lucky).

A ledger stores every transaction: +$5,000 (payroll), -$1,200 (rent), -$340 (groceries), -$120 (utilities). The current balance is always the sum of all transactions: here, $3,340. The ledger never overwrites entries; it only appends. You can reconstruct the balance at any point in time by summing transactions up to that date.

Event sourcing applies this model to all state in a system. The event log is the ledger. Current state is the running total.
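The ledger model is small enough to sketch in a few lines of Python. This is only an illustration using the four transactions from the example; the dates and the `balance_as_of` helper are assumptions made up for the sketch.

```python
from datetime import date
from decimal import Decimal

# Append-only ledger of (date, amount, memo). The dates are illustrative assumptions.
LEDGER = [
    (date(2024, 1, 1), Decimal("5000"), "payroll"),
    (date(2024, 1, 3), Decimal("-1200"), "rent"),
    (date(2024, 1, 10), Decimal("-340"), "groceries"),
    (date(2024, 1, 15), Decimal("-120"), "utilities"),
]

def balance_as_of(ledger, as_of):
    """Sum every entry dated on or before `as_of`."""
    return sum((amount for when, amount, _ in ledger if when <= as_of), Decimal("0"))

print(balance_as_of(LEDGER, date(2024, 1, 31)))  # 3340
print(balance_as_of(LEDGER, date(2024, 1, 5)))   # 3800
```

Nothing is ever updated in place: asking for an earlier balance is the same computation with an earlier cut-off date.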


How event sourcing works

The event log

Event log for link x7Kp2:
  1. LinkCreated    { id: "x7Kp2", user_id: 123, dest: "https://original.com", at: t1 }
  2. DestinationUpdated { id: "x7Kp2", new_dest: "https://v2.com", at: t2 }
  3. DestinationUpdated { id: "x7Kp2", new_dest: "https://final.com", at: t3 }
  4. TagsAdded      { id: "x7Kp2", tags: ["campaign"], at: t4 }

To get the current state, replay the events:

apply(LinkCreated) → { id: "x7Kp2", dest: "https://original.com", tags: [] }
apply(DestinationUpdated) → { id: "x7Kp2", dest: "https://v2.com", tags: [] }
apply(DestinationUpdated) → { id: "x7Kp2", dest: "https://final.com", tags: [] }
apply(TagsAdded) → { id: "x7Kp2", dest: "https://final.com", tags: ["campaign"] }
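The replay above is a left fold over the event list. A minimal Python sketch follows, assuming events are plain dictionaries with a `type` field; that representation is an assumption for illustration, and the placeholder timestamps t1 to t4 are omitted.

```python
from functools import reduce

# Event names and fields mirror the log above; the dict-based shape is an assumption.
EVENTS = [
    {"type": "LinkCreated", "id": "x7Kp2", "user_id": 123, "dest": "https://original.com"},
    {"type": "DestinationUpdated", "id": "x7Kp2", "new_dest": "https://v2.com"},
    {"type": "DestinationUpdated", "id": "x7Kp2", "new_dest": "https://final.com"},
    {"type": "TagsAdded", "id": "x7Kp2", "tags": ["campaign"]},
]

def apply(state, event):
    """Return the next state without mutating the previous one."""
    kind = event["type"]
    if kind == "LinkCreated":
        return {"id": event["id"], "dest": event["dest"], "tags": []}
    if kind == "DestinationUpdated":
        return {**state, "dest": event["new_dest"]}
    if kind == "TagsAdded":
        return {**state, "tags": state["tags"] + event["tags"]}
    raise ValueError(f"unknown event type: {kind}")

def replay(events, initial=None):
    """Fold the events into a state, oldest event first."""
    return reduce(apply, events, initial)

current = replay(EVENTS)
print(current)
# {'id': 'x7Kp2', 'dest': 'https://final.com', 'tags': ['campaign']}
```

Because `apply` is a pure function of the previous state and one event, replaying a prefix of the list (for example `replay(EVENTS[:2])`) yields the state as it was after those events.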

Snapshots

Replaying all events from the beginning is expensive for entities with long histories. Snapshots shorten the replay: periodically capture the current state and store it alongside the log. On load, start from the most recent snapshot and replay only the events after it. The log itself stays complete; the snapshot is just a cached starting point.

Events 1–1000 → Snapshot at event 1000: { full state }
Events 1001–1200 (new events)

On load: start from snapshot, apply events 1001–1200
→ Same result as replaying all 1200 events, but only 200 events are applied (6x fewer)
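The snapshot load path can be sketched as follows. This is a toy under stated assumptions: the state carries a `version` equal to the sequence number of the last applied event, and the log is an in-memory list, where a real event store would fetch only the tail after the snapshot.

```python
def apply(state, event):
    """Toy reducer: track the last destination and how many events were applied."""
    return {"version": state["version"] + 1, "dest": event["new_dest"]}

def load(snapshot, events):
    """Start from `snapshot` and apply only events newer than its version.

    `events` is a list of (sequence_number, event) pairs. Returns the final
    state and the number of events that had to be applied.
    """
    state = dict(snapshot)
    applied = 0
    for seq, event in events:
        if seq > snapshot["version"]:
            state = apply(state, event)
            applied += 1
    return state, applied

# 1,200 hypothetical DestinationUpdated events, numbered from 1.
log = [(n, {"new_dest": f"https://example.com/v{n}"}) for n in range(1, 1201)]

empty = {"version": 0, "dest": None}
full_state, full_count = load(empty, log)

# Snapshot taken after event 1000, as in the example above.
snapshot, _ = load(empty, log[:1000])
fast_state, fast_count = load(snapshot, log)

assert fast_state == full_state
print(full_count, fast_count)  # 1200 200
```

The saving scales with how recent the snapshot is, so snapshot frequency is a trade-off between write-side overhead and load latency.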

What event sourcing buys you

Complete audit log. Every state change is recorded as an immutable event. You can answer any "what was the state at time T?" query. Support, compliance, and debugging all benefit.

Temporal queries. "What did the link point to on Tuesday?" — replay events up to Tuesday. "What was the user's billing tier last month?" — replay subscription events up to last month.
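A temporal query is a replay that stops early. The sketch below assumes each event carries an `at` timestamp and that events are stored in append order; the concrete dates are hypothetical stand-ins for t1 to t3.

```python
from datetime import datetime

# Hypothetical timestamps stand in for t1..t3 from the event log shown earlier.
EVENTS = [
    {"type": "LinkCreated", "dest": "https://original.com", "at": datetime(2024, 3, 1, 9, 0)},
    {"type": "DestinationUpdated", "new_dest": "https://v2.com", "at": datetime(2024, 3, 12, 14, 30)},
    {"type": "DestinationUpdated", "new_dest": "https://final.com", "at": datetime(2024, 3, 19, 8, 15)},
]

def destination_at(events, as_of):
    """Return the destination the link had at `as_of`, or None if it did not exist yet."""
    dest = None
    for event in events:  # assumed to be in append (chronological) order
        if event["at"] > as_of:
            break
        if event["type"] == "LinkCreated":
            dest = event["dest"]
        elif event["type"] == "DestinationUpdated":
            dest = event["new_dest"]
    return dest

print(destination_at(EVENTS, datetime(2024, 3, 15)))  # https://v2.com
print(destination_at(EVENTS, datetime(2024, 2, 1)))   # None
```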

Event replay. Introduced a bug in a projection (read model)? Fix the projection code and replay all events from the beginning to rebuild it correctly. This is the "time machine" capability of event sourcing.

Natural event stream. The event log is already the correct input for event-driven systems — no need to construct events from database change detection.

Temporal decoupling. A new consumer service can bootstrap by replaying historical events. It doesn't need a data migration — just a replay from the beginning of the log.
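Rebuilding a projection and bootstrapping a new consumer are the same operation: run a function over the whole log. In the sketch below, the projection (a count of destination changes per link) and its bug are invented for illustration.

```python
from collections import Counter

# A shared, append-only log covering several links (hypothetical data).
LOG = [
    {"type": "LinkCreated", "id": "x7Kp2", "dest": "https://original.com"},
    {"type": "DestinationUpdated", "id": "x7Kp2", "new_dest": "https://v2.com"},
    {"type": "LinkCreated", "id": "b9Qm4", "dest": "https://example.org"},
    {"type": "DestinationUpdated", "id": "x7Kp2", "new_dest": "https://final.com"},
]

def project_buggy(log):
    """Counts destination changes per link, but wrongly counts every event."""
    counts = Counter()
    for event in log:
        counts[event["id"]] += 1
    return counts

def project_fixed(log):
    """Counts only DestinationUpdated events."""
    counts = Counter()
    for event in log:
        if event["type"] == "DestinationUpdated":
            counts[event["id"]] += 1
    return counts

# The log is untouched; only the derived view is thrown away and rebuilt.
print(dict(project_buggy(LOG)))  # {'x7Kp2': 3, 'b9Qm4': 1}
print(dict(project_fixed(LOG)))  # {'x7Kp2': 2}
```

No data repair is needed after the fix, because the incorrect numbers lived only in the derived view and never in the log.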

What event sourcing costs you

Query complexity. "Get all links created by user 123 in the last 30 days" is a simple SQL query against a table. Against an event log, it requires either a full log scan or maintaining a CQRS read model (which is why event sourcing and CQRS are almost always used together).
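To make the read-model option concrete, here is a rough sketch of a projection that answers the "links created by user 123 in the last 30 days" query. The `LinksByUser` class, its methods, and the sample events are hypothetical; a production read model would normally live in a database table rather than in memory.

```python
from collections import defaultdict
from datetime import datetime, timedelta

class LinksByUser:
    """Hypothetical read model: user_id -> list of (created_at, link_id)."""

    def __init__(self):
        self.by_user = defaultdict(list)

    def handle(self, event):
        # Only LinkCreated matters for this view; other events are ignored.
        if event["type"] == "LinkCreated":
            self.by_user[event["user_id"]].append((event["at"], event["id"]))

    def created_since(self, user_id, cutoff):
        """Return the IDs of links the user created at or after `cutoff`."""
        return [link_id for at, link_id in self.by_user.get(user_id, []) if at >= cutoff]

now = datetime(2024, 4, 1)  # fixed "now" so the example is reproducible
view = LinksByUser()
for event in [
    {"type": "LinkCreated", "id": "old01", "user_id": 123, "at": now - timedelta(days=90)},
    {"type": "LinkCreated", "id": "x7Kp2", "user_id": 123, "at": now - timedelta(days=10)},
    {"type": "LinkCreated", "id": "zz999", "user_id": 456, "at": now - timedelta(days=2)},
    {"type": "DestinationUpdated", "id": "x7Kp2", "new_dest": "https://v2.com", "at": now},
]:
    view.handle(event)

print(view.created_since(123, now - timedelta(days=30)))  # ['x7Kp2']
```

The query itself becomes cheap again, but the view is one more component to keep in sync with the log.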

Schema evolution. Events are immutable records in an append-only log. You cannot change the schema of an old event. When business requirements change the meaning of DestinationUpdated, old events have the old structure and new events have the new one. Your application code must handle all historical versions forever. This is a significant long-term burden.
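One common way to contain this burden is upcasting: old event versions are converted to the newest shape when they are read, so the rest of the code handles a single version. The sketch below is an assumption-laden illustration; the `version` field and the `reason` attribute of the newer shape are invented for the example.

```python
def upcast(event):
    """Convert any stored DestinationUpdated version to the latest (v2) shape.

    Assumed versions (hypothetical):
      v1: {"new_dest": ...}                               # no version field
      v2: {"version": 2, "new_dest": ..., "reason": ...}
    """
    version = event.get("version", 1)
    if version == 1:
        # Stored events are never rewritten; the conversion happens in memory on read.
        return {**event, "version": 2, "reason": "unknown"}
    if version == 2:
        return event
    raise ValueError(f"unsupported DestinationUpdated version: {version}")

stored = [
    {"type": "DestinationUpdated", "new_dest": "https://v2.com"},
    {"type": "DestinationUpdated", "version": 2, "new_dest": "https://final.com", "reason": "rebrand"},
]

for event in map(upcast, stored):
    print(event["version"], event["new_dest"], event["reason"])
# 2 https://v2.com unknown
# 2 https://final.com rebrand
```

The cost does not disappear: every new version adds another branch to `upcast`, and that function has to be kept for as long as old events exist.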

Replay performance. For entities with thousands of events, even with snapshots, replay adds latency on entity load. Optimising this requires careful snapshot strategy and caching.

Conceptual complexity. Developers unfamiliar with event sourcing find it disorienting. Instead of "update the record", you "append an event and derive the state." This changes how you think about every state mutation.


When to use event sourcing

Strong fit:

  • Domain requires full audit trail (financial systems, compliance, healthcare)
  • Temporal queries are first-class requirements ("what was the state at time T?")
  • Complex business logic that benefits from a clear event history (order processing, subscription billing state machines)
  • You're already building event-driven architecture and CQRS

Weak fit:

  • Simple CRUD application with no audit requirements
  • Team unfamiliar with the pattern (learning curve is real)
  • Schema evolution is frequent and unpredictable (event versioning overhead is high)
  • You want to try it because it sounds interesting — without a genuine business requirement for audit or temporal queries

The one thing to remember

Event sourcing stores the history of changes rather than the current state; the current state is always a derived projection of the event log. This provides complete audit history, temporal queries, and event replay at the cost of significant query complexity (usually addressed with a CQRS read model), schema evolution discipline, and a conceptual shift in how state mutations are modelled. Apply it when the audit trail and temporal capabilities are genuine business requirements, not architectural preference.


← Previous: CQRS — separating the write model from the read model so each can be optimised independently.

→ Next: Saga — managing multi-step distributed transactions where each step touches a different service, and failure of any step requires compensating the previous ones.
