Series: System Design · Data & Storage — Pillar 4 of 8

Materialized Views: Pre-Computing Expensive Queries

Systems Design

#	Post	What it covers
00	Data & Storage: Where Everything Lives	Where data lives shapes everything about a system. Nineteen concepts covering databases, indexing, sharding, replication, and the data structures underneath. (161 chars)
01	SQL vs NoSQL: Choosing the Right Database	SQL vs NoSQL isn't a simple choice. Learn what each type optimises for, when to use relational databases, and when NoSQL is the right call.
02	Database Indexing: The Highest-Leverage Performance Tool	Indexes are the highest-leverage database performance tool. Learn how they work, what they cost, and how to decide when to add one.
03	B-Trees & B+ Trees: The Data Structure Behind Database Indexes	Almost every database index is built on a B-tree or B+ tree. Learn how they work, why they're fast, and what this means for your queries.
04	LSM Trees: Why Some Databases Are Built for Writes	LSM trees power Cassandra, RocksDB, and LevelDB. Learn how they achieve massive write throughput and what they trade off to get it.
05	Denormalisation: Trading Storage for Speed	Denormalisation trades storage for read speed by pre-computing joins. Learn when it helps, when it hurts, and how to do it safely.
06	Database Sharding: Scaling Beyond a Single Node	Sharding splits a database across multiple nodes. Learn how it works, the strategies available, and the significant tradeoffs it introduces.
07	Data Partitioning: Choosing How to Divide Your Data	Range, hash, and list partitioning each make different tradeoffs. Learn how to divide data effectively for queries, maintenance, and scale.
08	Consistent Hashing: Minimising Resharding Pain	Consistent hashing minimises data movement when nodes are added or removed. Learn how it works and why it's fundamental to distributed systems.
09	Replication & Read Replicas: Scaling Reads and Surviving Failures	Replication copies data across nodes for fault tolerance and read scaling. Learn how primary-replica setups work and when to use them.
10	Object Storage: Unlimited Scale for Large Binary Data	Object storage handles large binary files at unlimited scale. Learn how it works, why it replaced file servers, and when to use it.
11	Block vs File vs Object Storage: Three Models, Three Use Cases	Three storage models, three different use cases. Learn what block, file, and object storage optimise for and how to choose between them.
12	Distributed File Systems: File Storage Across Many Machines	Distributed file systems spread file storage across many machines. Learn how HDFS, Ceph, and GlusterFS work and when to use them.
13	Time Series Databases: Built for Metrics and Events	Time series databases handle append-heavy metric data far better than SQL. Learn how they work and when to use InfluxDB, Prometheus, or TimescaleDB.
14	Vector Databases: Semantic Search and AI Memory	Vector databases power semantic search, recommendations, and LLM memory. Learn how embeddings work, what ANN search is, and when to use one.
15	Full-Text Search Engines: Beyond SQL LIKE	Full-text search needs more than SQL LIKE. Learn how inverted indexes, relevance ranking, and Elasticsearch make text search fast and powerful.
16	Materialized Views: Pre-Computing Expensive Queries ← you are here	Materialized views cache expensive query results as physical tables. Learn how they work, when to refresh them, and when to use them vs other approaches.
17	Query Optimisation: From Slow to Fast	Slow queries aren't always fixed by adding indexes. Learn how to read EXPLAIN output, understand query plans, and systematically make queries fast.
18	Connection Pooling: Managing the Hidden Bottleneck	Opening a database connection per request doesn't scale. Learn how connection pooling works, what PgBouncer does, and how to size your pool correctly.
19	Data & Storage: Wrap-Up	A recap of all 19 data storage concepts: SQL, NoSQL, indexing, sharding, replication, specialised databases, and how they connect in a real system.

Materialized views sit between denormalisation (schema-level redundancy) and application caching (external to the database). They're the right tool when:

The computation is SQL (a view is already the right abstraction)
The source data is in the same database
Slight staleness is acceptable (minutes to hours, not seconds)
You want the database to manage the result, not application code

The tradeoffs

Staleness. A materialized view is a snapshot. Users querying the dashboard between refreshes see data that's up to refresh_interval old. For most analytics use cases, 5–15 minute lag is acceptable. For "did my last click just register?" style queries, it isn't.

Storage cost. The materialized view stores the result set on disk — the same data that exists in the source tables, in pre-aggregated form. On large datasets, materialized views can be hundreds of GB.

Refresh cost. Full refresh re-runs the defining query. If the defining query is expensive, the refresh is expensive too — potentially as expensive as the original query, but amortised across all reads since the last refresh.

Schema coupling. Materialized views depend on their source tables. Changes to the source table schema (renaming columns, adding/removing tables from a join) require updating and refreshing the view.

Refresh contention. Without CONCURRENTLY, a refresh locks the view for reads. For dashboards with many concurrent users, this can cause visible latency spikes during refresh windows.

The one thing to remember

A materialized view is a pre-printed report: the expensive computation happens once, on a schedule, and fast reads serve the result. The tradeoff is freshness — you're serving data from the last refresh, not the latest write. When your most expensive, most frequent queries aggregate large datasets for analytics or reporting, and your users can accept data that's minutes (not seconds) old, a materialized view is often the cleanest, lowest-effort solution.

← Previous: Full-Text Search Engines: Beyond SQL LIKE — Full-text search needs more than SQL LIKE. Learn how inverted indexes, relevance ranking, and Elasticsearch make text...

→ Next: Query Optimisation: From Slow to Fast — Slow queries aren't always fixed by adding indexes. Learn how to read EXPLAIN output, understand query plans, and sys...

Materialized Views: Pre-Computing Expensive Queries

Materialized Views: Pre-Computing Expensive Queries

Systems Design

The tradeoffs

The one thing to remember

Comments

Systems Design

More from this blog

Data & Storage: Wrap-Up

Connection Pooling: Managing the Hidden Bottleneck

Query Optimisation: From Slow to Fast

Full-Text Search Engines: Beyond SQL LIKE

Command Palette

Materialized Views: Pre-Computing Expensive Queries

Systems Design

The tradeoffs

The one thing to remember

Comments

Systems Design

More from this blog