Skip to main content

Command Palette

Search for a command to run...

Materialized Views: Pre-Computing Expensive Queries

Updated
6 min read
Materialized Views: Pre-Computing Expensive Queries

Series: System Design · Data & Storage — Pillar 4 of 8


Materialized Views: Pre-Computing Expensive Queries

Systems Design

# Post What it covers
00 Data & Storage: Where Everything Lives Where data lives shapes everything about a system. Nineteen concepts covering databases, indexing, sharding, replication, and the data structures underneath. (161 chars)
01 SQL vs NoSQL: Choosing the Right Database SQL vs NoSQL isn't a simple choice. Learn what each type optimises for, when to use relational databases, and when NoSQL is the right call.
02 Database Indexing: The Highest-Leverage Performance Tool Indexes are the highest-leverage database performance tool. Learn how they work, what they cost, and how to decide when to add one.
03 B-Trees & B+ Trees: The Data Structure Behind Database Indexes Almost every database index is built on a B-tree or B+ tree. Learn how they work, why they're fast, and what this means for your queries.
04 LSM Trees: Why Some Databases Are Built for Writes LSM trees power Cassandra, RocksDB, and LevelDB. Learn how they achieve massive write throughput and what they trade off to get it.
05 Denormalisation: Trading Storage for Speed Denormalisation trades storage for read speed by pre-computing joins. Learn when it helps, when it hurts, and how to do it safely.
06 Database Sharding: Scaling Beyond a Single Node Sharding splits a database across multiple nodes. Learn how it works, the strategies available, and the significant tradeoffs it introduces.
07 Data Partitioning: Choosing How to Divide Your Data Range, hash, and list partitioning each make different tradeoffs. Learn how to divide data effectively for queries, maintenance, and scale.
08 Consistent Hashing: Minimising Resharding Pain Consistent hashing minimises data movement when nodes are added or removed. Learn how it works and why it's fundamental to distributed systems.
09 Replication & Read Replicas: Scaling Reads and Surviving Failures Replication copies data across nodes for fault tolerance and read scaling. Learn how primary-replica setups work and when to use them.
10 Object Storage: Unlimited Scale for Large Binary Data Object storage handles large binary files at unlimited scale. Learn how it works, why it replaced file servers, and when to use it.
11 Block vs File vs Object Storage: Three Models, Three Use Cases Three storage models, three different use cases. Learn what block, file, and object storage optimise for and how to choose between them.
12 Distributed File Systems: File Storage Across Many Machines Distributed file systems spread file storage across many machines. Learn how HDFS, Ceph, and GlusterFS work and when to use them.
13 Time Series Databases: Built for Metrics and Events Time series databases handle append-heavy metric data far better than SQL. Learn how they work and when to use InfluxDB, Prometheus, or TimescaleDB.
14 Vector Databases: Semantic Search and AI Memory Vector databases power semantic search, recommendations, and LLM memory. Learn how embeddings work, what ANN search is, and when to use one.
15 Full-Text Search Engines: Beyond SQL LIKE Full-text search needs more than SQL LIKE. Learn how inverted indexes, relevance ranking, and Elasticsearch make text search fast and powerful.
16 Materialized Views: Pre-Computing Expensive Queries ← you are here Materialized views cache expensive query results as physical tables. Learn how they work, when to refresh them, and when to use them vs other approaches.
17 Query Optimisation: From Slow to Fast Slow queries aren't always fixed by adding indexes. Learn how to read EXPLAIN output, understand query plans, and systematically make queries fast.
18 Connection Pooling: Managing the Hidden Bottleneck Opening a database connection per request doesn't scale. Learn how connection pooling works, what PgBouncer does, and how to size your pool correctly.
19 Data & Storage: Wrap-Up A recap of all 19 data storage concepts: SQL, NoSQL, indexing, sharding, replication, specialised databases, and how they connect in a real system.

Materialized views sit between denormalisation (schema-level redundancy) and application caching (external to the database). They're the right tool when:

  • The computation is SQL (a view is already the right abstraction)

  • The source data is in the same database

  • Slight staleness is acceptable (minutes to hours, not seconds)

  • You want the database to manage the result, not application code


The tradeoffs

Staleness. A materialized view is a snapshot. Users querying the dashboard between refreshes see data that's up to refresh_interval old. For most analytics use cases, 5–15 minute lag is acceptable. For "did my last click just register?" style queries, it isn't.

Storage cost. The materialized view stores the result set on disk — the same data that exists in the source tables, in pre-aggregated form. On large datasets, materialized views can be hundreds of GB.

Refresh cost. Full refresh re-runs the defining query. If the defining query is expensive, the refresh is expensive too — potentially as expensive as the original query, but amortised across all reads since the last refresh.

Schema coupling. Materialized views depend on their source tables. Changes to the source table schema (renaming columns, adding/removing tables from a join) require updating and refreshing the view.

Refresh contention. Without CONCURRENTLY, a refresh locks the view for reads. For dashboards with many concurrent users, this can cause visible latency spikes during refresh windows.


The one thing to remember

A materialized view is a pre-printed report: the expensive computation happens once, on a schedule, and fast reads serve the result. The tradeoff is freshness — you're serving data from the last refresh, not the latest write. When your most expensive, most frequent queries aggregate large datasets for analytics or reporting, and your users can accept data that's minutes (not seconds) old, a materialized view is often the cleanest, lowest-effort solution.


← Previous: Full-Text Search Engines: Beyond SQL LIKE — Full-text search needs more than SQL LIKE. Learn how inverted indexes, relevance ranking, and Elasticsearch make text...

→ Next: Query Optimisation: From Slow to Fast — Slow queries aren't always fixed by adding indexes. Learn how to read EXPLAIN output, understand query plans, and sys...

Systems Design

Part 1 of 50

Understanding these system design concepts is essential for architects, developers, and engineers to create scalable, reliable, and maintainable software systems that meet the needs of businesses.