Systems Design

#	Post	What it covers
00	Networking & Protocols: How Bytes Actually Travel	Before you can design systems that scale, you need to understand how bytes actually travel. Eight concepts every backend engineer must know.
01	The OSI Model: The Map Every Engineer Needs	The OSI model isn't just interview theory: it's the map that tells you exactly where in the stack a network problem lives. Here's how to use it.
02	TCP vs UDP: Reliability vs Speed at the Transport Layer	TCP guarantees delivery. UDP doesn't look back. Understanding why each exists, and when to reach for each, is fundamental to network design.
03	HTTP vs HTTPS: The Language of the Web and Its Secure Version	301 Moved Permanently, 302 Found, 304 Not Modified
04	TLS/SSL: How HTTPS Actually Works Under the Hood	TLS is what puts the S in HTTPS. Here's how the handshake works, what a certificate actually contains, and why TLS 1.3 matters for performance.
05	DNS: The Phone Book That Runs the Internet	DNS is the phone book of the internet, and one of the most misunderstood layers in the stack. Here's how it works and how it fails.
06	DNS Load Balancing: Traffic Distribution at the Name Layer	DNS load balancing distributes traffic before a single packet reaches your servers. Here's how it works, where it excels, and where it falls short.
07	Anycast Routing: One Address, Everywhere at Once	One IP address, dozens of locations, zero client configuration. Anycast is how the fastest global infrastructure works; here's the mechanism behind it.
08	CDN: Moving Content Closer to the People Who Need It ← you are here	A CDN isn't just a cache in front of your server. Here's how content delivery networks work, when they help, and when they add complexity for nothing.
09	Networking & Protocols: Wrap-Up	A complete recap of the eight core networking concepts (OSI, TCP, HTTP, TLS, DNS, CDN) and how they connect into a complete picture.

CDN: Moving Content Closer to the People Who Need It

The problem

Your URL shortener has grown. It's handling millions of redirects a day, with users across six continents. Your origin servers are in two regions — Sydney and Frankfurt — and they're performing well. But your analytics show a consistent pattern: median redirect latency for users in Southeast Asia is 280ms, for users in South America it's 310ms, and for users in sub-Saharan Africa it's 420ms.

These users aren't far from servers that could serve them. Singapore, São Paulo, and Johannesburg all have major data centres. The problem is you don't have servers there — and running application servers in eight more regions just for redirect lookups seems disproportionate.

This is precisely the problem CDNs exist to solve. You don't need application servers in every region. You need a cache close to users that can serve the most common responses without touching your origin at all.

The core idea

A Content Delivery Network (CDN) is a geographically distributed network of servers, called edge nodes or Points of Presence (PoPs), that cache and serve content from locations close to users. Requests that can be served from a nearby edge cache never reach the origin server. Requests that can't are forwarded to origin; the response is cached at the edge for future requests and then returned to the user.

CDNs do three things well: they reduce latency by putting content physically close to users, they reduce origin load by absorbing requests at the edge, and they provide resilience against traffic spikes and some classes of attack by distributing load across a large global network.

The analogy: a warehouse network for an online retailer

An online retailer based in Melbourne could ship every order from their Melbourne warehouse. Orders to Perth take two days. Orders to Tokyo take a week. Orders to London take two weeks.

Or they could maintain regional fulfilment centres: a warehouse in Perth, one in Singapore, one in the UK. Customers in those regions get next-day or same-day delivery. Most orders ship from the nearest warehouse. The Melbourne headquarters is only involved when an item isn't stocked regionally — a special order fulfilled from the main warehouse and used to restock the regional one.

The CDN is the network of regional warehouses. Your origin server is the Melbourne headquarters. The cached response is the stock on the shelf. A cache miss is a special order.

How a CDN works

The request lifecycle

User (Singapore)
  │
  │  1. DNS resolves cdn.sho.rt to nearest CDN edge (Anycast)
  ▼
CDN Edge Node (Singapore PoP)
  │
  ├── Cache HIT:  Cached redirect found → return immediately (typically 5–20ms)
  │
  └── Cache MISS: Not cached → forward to origin
                       │
                       ▼
               Origin Server (Sydney)
                       │
                       │  Response returned + cached at edge
                       ▼
               CDN Edge Node (Singapore PoP)
                       │
                       ▼
               User receives response

Cache hit: the edge node has a cached copy of the response. It returns it directly — no origin involved, no cross-continental round trip. Latency is determined by the distance to the edge node, not the origin. For a user in Singapore hitting a Singapore PoP, this is typically 5–20ms.

Cache miss: the edge node doesn't have the response, or its cached copy has expired. It forwards the request to origin, receives the response, caches it according to the Cache-Control headers on the response, and returns it to the user. This request takes the full round trip to origin — but future users hitting the same edge node will get a cache hit.
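The hit and miss paths above can be modelled in a few lines. This is a minimal sketch, not any CDN's actual implementation: `EdgeNode`, `CachedResponse`, and the `fetch_from_origin` callback are hypothetical names, and the origin is assumed to return a body together with a TTL in seconds.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class CachedResponse:
    body: str
    expires_at: float  # clock value (seconds) after which the entry is stale


class EdgeNode:
    """Toy model of one PoP: serve from cache, fall back to origin on a miss."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], Tuple[str, int]],  # returns (body, ttl_seconds)
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._clock = clock
        self._cache: Dict[str, CachedResponse] = {}

    def handle(self, url: str) -> Tuple[str, str]:
        """Return ("HIT" or "MISS", body) for a request to this edge node."""
        now = self._clock()
        entry = self._cache.get(url)
        if entry is not None and entry.expires_at > now:
            return "HIT", entry.body
        # Missing or expired: pay the round trip to origin, then store the
        # response so later requests to this node can be served locally.
        body, ttl = self._fetch(url)
        self._cache[url] = CachedResponse(body, now + ttl)
        return "MISS", body
```

The first request for a URL at a given node is a miss, later requests within the TTL are hits, and the first request after expiry is a miss again.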

Cache hit ratio

The cache hit ratio, the percentage of requests served from edge cache without hitting origin, is the primary measure of CDN effectiveness. A 95% cache hit ratio means 95% of requests never reach your origin servers; origin handles only the remaining 5%.
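To make the ratio concrete, here is a small worked calculation. The traffic volume and latency figures are illustrative assumptions (10 million requests a day, 10ms for an edge hit, 280ms for a miss that travels to origin), not measurements.

```python
def origin_requests(total_requests: int, hit_ratio: float) -> int:
    """Requests that still reach origin at a given cache hit ratio."""
    return round(total_requests * (1.0 - hit_ratio))


def mean_latency_ms(hit_ratio: float, edge_ms: float, origin_ms: float) -> float:
    """Average latency when hits cost edge_ms and misses cost origin_ms."""
    return hit_ratio * edge_ms + (1.0 - hit_ratio) * origin_ms


# Assumed figures, for illustration only.
daily_requests = 10_000_000
print(origin_requests(daily_requests, 0.95))
print(mean_latency_ms(0.95, edge_ms=10.0, origin_ms=280.0))
```

With these assumed numbers, a 95% hit ratio leaves 500,000 requests a day for origin and a mean latency of roughly 23.5ms. Dropping to 80% quadruples origin traffic, which is why small changes in hit ratio matter.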

Cache hit ratio is driven by:

Content cacheability. Static assets (images, CSS, JS, fonts) are highly cacheable. Personalised responses (a feed showing your specific followed accounts) are not. For a URL shortener, redirect responses are highly cacheable, because the mapping from short code to destination URL almost never changes.

TTL configuration. Short TTLs mean more cache misses (frequent revalidation with origin). Long TTLs mean higher hit ratios but slower propagation of content changes. The right TTL depends on how frequently your content changes and how quickly changes need to reach users.
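How an edge cache might turn response headers into a TTL can be sketched as follows. This is a simplified reading of `Cache-Control` for a shared cache: `s-maxage` takes precedence over `max-age`, and `no-store`, `private`, and `no-cache` are all treated as "do not serve from cache without going to origin". The function name `shared_cache_ttl` is hypothetical, and real CDNs layer their own defaults and overrides on top.

```python
from typing import Dict


def shared_cache_ttl(cache_control: str) -> int:
    """Seconds a shared cache may reuse a response; 0 means go to origin."""
    directives: Dict[str, str] = {}
    # Naive split: does not handle commas inside quoted directive values.
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip().strip('"')

    if directives.keys() & {"no-store", "private", "no-cache"}:
        return 0
    # For shared caches such as CDN edges, s-maxage overrides max-age.
    for name in ("s-maxage", "max-age"):
        if name in directives:
            try:
                return max(0, int(directives[name]))
            except ValueError:
                return 0
    return 0  # no explicit lifetime: this sketch does not guess one
```

Sending `max-age=60, s-maxage=600` is a common way to let edges cache for longer than browsers do.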

Traffic concentration. Popular content achieves high hit ratios naturally — many users requesting the same thing keeps the cache warm. Long-tail content — many unique items each requested rarely — has lower hit ratios. For a URL shortener with viral short codes, popular links will have near-100% cache hit ratios; obscure ones may miss on every request.

Cache key design. The cache key determines what counts as "the same request." By default, the full URL is the cache key. If your responses vary by Accept-Encoding header (compressed vs uncompressed), you need Vary: Accept-Encoding in your response — otherwise compressed and uncompressed responses will collide in cache. Getting cache keys right is one of the less glamorous but genuinely important parts of CDN configuration.
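A cache key can be thought of as a function from a request to a string. The sketch below is one possible normalisation, not a provider default: it lowercases the host, sorts query parameters, drops a hypothetical list of tracking parameters (`IGNORED_PARAMS`), and collapses `Accept-Encoding` into a few buckets so that equivalent requests share one cache entry while compressed and uncompressed variants stay separate.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Assumption: these parameters never change the response, so they are ignored.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}


def normalise_encoding(accept_encoding: str) -> str:
    """Collapse Accept-Encoding into one bucket (q-values are ignored here)."""
    offered = {part.split(";")[0].strip().lower() for part in accept_encoding.split(",")}
    for candidate in ("br", "gzip"):
        if candidate in offered:
            return candidate
    return "identity"


def cache_key(url: str, accept_encoding: str = "") -> str:
    parts = urlsplit(url)
    query = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in IGNORED_PARAMS
    )
    key = f"{parts.netloc.lower()}{parts.path or '/'}"
    if query:
        key += "?" + urlencode(query)
    return f"{key}|enc={normalise_encoding(accept_encoding)}"
```

Two requests that differ only in parameter order, host casing, or tracking parameters map to the same key, which raises the hit ratio without serving anyone the wrong variant.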

Edge computing

Modern CDNs offer more than caching: they support running code at the edge. Cloudflare Workers, Amazon CloudFront Functions, and Fastly Compute (formerly Compute@Edge) let you run code, typically JavaScript or WebAssembly depending on the platform, at CDN edge nodes globally.

For the URL shortener, this is transformative: instead of caching a redirect response and serving it from edge, you can run the redirect lookup logic itself at the edge. If the short code mappings are replicated to edge nodes, lookups happen locally and a request for a replicated code never needs to reach origin, even when no cached response exists. The cost is replication lag: a newly created or updated code may be missing or stale at some edges for a short time.

Edge computing blurs the line between CDN and distributed application infrastructure. It's increasingly the architecture for latency-sensitive global services where the origin-forward model of traditional CDNs adds too much latency even for cache misses.
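The edge-side redirect lookup described above might look roughly like this. Real edge runtimes typically run JavaScript or WebAssembly; Python is used here only to keep the sketch simple. `SHORT_CODES` stands in for a replicated key-value store, `edge_redirect` is a hypothetical handler name, and the 301 status and one-hour `max-age` are assumptions rather than requirements.

```python
from typing import Dict, Tuple

# Stand-in for mappings replicated to every edge node (hypothetical data).
SHORT_CODES: Dict[str, str] = {
    "abc123": "https://example.com/some/long/path",
}


def edge_redirect(
    path: str, mappings: Dict[str, str] = SHORT_CODES
) -> Tuple[int, Dict[str, str]]:
    """Resolve a short code locally and return (status, headers)."""
    code = path.strip("/")
    target = mappings.get(code)
    if target is None:
        # Unknown at this edge. A real deployment might forward to origin
        # instead, since replication of new codes can lag.
        return 404, {"Cache-Control": "no-store"}
    return 301, {"Location": target, "Cache-Control": "public, max-age=3600"}
```

Calling `edge_redirect("/abc123")` returns a 301 with the destination in `Location`; no origin request is involved at any point.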

CDN features beyond caching

Modern CDNs have evolved far beyond static asset delivery:

TLS termination. CDN edge nodes terminate TLS connections locally. A full TLS handshake costs one round trip in TLS 1.3 and two in TLS 1.2; with edge termination those round trips stay within the user's region, typically a few milliseconds each, rather than crossing continents to the origin. This alone can reduce connection setup time by hundreds of milliseconds for distant users.

HTTP/2 and HTTP/3. CDN edge nodes support modern HTTP versions even when origin servers don't. The edge speaks HTTP/2 or HTTP/3 to clients and HTTP/1.1 to origin — protocol translation as a free benefit of CDN adoption.

DDoS protection. CDN networks are large — Cloudflare's network capacity exceeds 200 Tbps. Volumetric DDoS attacks that would overwhelm an origin server are absorbed and mitigated at the edge before they reach origin. This is one of the strongest arguments for CDN adoption for any public-facing service.

WAF (Web Application Firewall). Most CDN providers offer WAF capabilities — inspecting HTTP requests at the edge and blocking malicious patterns (SQL injection, XSS, credential stuffing) before they reach origin servers.

Image optimisation. Automatic resizing, format conversion (WebP, AVIF), and compression applied at the edge, reducing payload sizes without origin-side processing.

Analytics and observability. CDN providers offer detailed request-level logs and metrics at the edge — useful for traffic analysis, bot detection, and performance monitoring before requests reach your own infrastructure.
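The TLS termination saving mentioned above can be estimated with simple arithmetic: a new connection needs one round trip for the TCP handshake plus the TLS handshake round trips before the first request can be sent. The round-trip times below are assumed values for illustration, and the model ignores the edge-to-origin connection, which a CDN usually keeps open and reuses.

```python
def setup_time_ms(rtt_ms: float, tls_round_trips: int = 1) -> float:
    """Time before the first request byte: TCP handshake + TLS handshake.

    tls_round_trips is 1 for a full TLS 1.3 handshake and 2 for TLS 1.2.
    """
    return rtt_ms * (1 + tls_round_trips)


# Assumed round-trip times: 280ms to a distant origin, 10ms to a nearby edge.
to_origin = setup_time_ms(280.0)  # 560.0
to_edge = setup_time_ms(10.0)     # 20.0
print(to_origin - to_edge)        # 540.0 ms saved on connection setup
```

Under these assumptions, terminating TLS at the edge saves about 540ms per new connection with TLS 1.3, and more with TLS 1.2.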

Cache invalidation

Cache invalidation is one of the genuinely hard problems in CDN management. When content at origin changes — a URL redirect is updated, an image is replaced, a JavaScript file is redeployed — cached copies at edge nodes continue to be served until their TTL expires.

Strategies for managing this:

TTL-based expiry. The simplest approach: set TTL according to how frequently content changes. Static assets with content-hashed filenames (app.a3f9c2.js) can have TTLs of a year — the hash changes when the content changes, so the URL is new. Mutable content (/latest-prices) should have a short TTL or no-cache.

Purge APIs. CDN providers expose APIs to explicitly invalidate cached content by URL or pattern. When you update a redirect in the URL shortener, you call the CDN purge API for that short code's URL. Edge nodes discard their cached copy, typically within seconds, and fetch fresh from origin on the next request. Most providers support purging by URL, by tag, or by prefix pattern.

Cache versioning (URL fingerprinting). Embed a version or hash in the URL — /static/app.v42.js or /static/app.a3f9c2.js. Deploying a new version uses a new URL, so the old cached version and the new one coexist without conflict. Old URLs gradually expire from cache. This is the standard approach for static assets in CI/CD pipelines.
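URL fingerprinting is usually done by a build tool, but the idea fits in a few lines. This sketch derives the filename from a SHA-256 digest of the file content; the function name `fingerprinted_name`, the 8-character digest length, and the header constant are assumptions for illustration.

```python
import hashlib
from pathlib import PurePosixPath

# Safe only because the URL changes whenever the content changes.
IMMUTABLE_HEADERS = {"Cache-Control": "public, max-age=31536000, immutable"}


def fingerprinted_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a content hash before the extension: app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))
```

Identical content always yields the same name, so unchanged files keep their cached copies across deploys, while any change produces a new URL that no cache has seen.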

Surrogate keys (cache tags). An advanced pattern where responses are tagged with logical identifiers: a product page might be tagged with product:42 and category:shoes. When product 42 is updated, a single purge call invalidates every response tagged product:42 across all edge nodes, regardless of individual URLs. Fastly (surrogate keys) and Cloudflare (cache tags) support this, though availability can depend on your plan; it's particularly powerful for content-heavy sites with complex dependency relationships.
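Purging by URL and purging by tag can both be modelled with an in-memory cache plus a tag index. This is a toy illustration of the idea, not any provider's API: `TaggedCache` and its methods are hypothetical, and a real CDN performs the same bookkeeping across many edge nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set


class TaggedCache:
    """Cache whose entries can be invalidated by URL or by logical tag."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}
        self._urls_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def put(self, url: str, body: str, tags: Iterable[str] = ()) -> None:
        self._entries[url] = body
        for tag in tags:
            self._urls_by_tag[tag].add(url)

    def get(self, url: str) -> Optional[str]:
        return self._entries.get(url)

    def purge_url(self, url: str) -> int:
        """Drop one entry; returns how many entries were removed (0 or 1)."""
        return 1 if self._entries.pop(url, None) is not None else 0

    def purge_tag(self, tag: str) -> int:
        """Drop every entry stored with this tag; returns the number removed."""
        # Simplification: other tags may still list a purged URL. That is
        # harmless here because purge_url ignores URLs that are already gone.
        urls = self._urls_by_tag.pop(tag, set())
        return sum(self.purge_url(url) for url in urls)
```

If a product page and a category page are both stored with the tag `product:42`, one `purge_tag("product:42")` call removes both, while entries for other products stay cached.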

When CDNs help — and when they don't

CDNs work brilliantly for:

Static assets with high reuse (images, CSS, JS, fonts, videos)
Highly cacheable dynamic responses (redirect lookups, product pages, public API responses)
Globally distributed user bases where origin proximity matters
Traffic spike absorption (product launches, viral content, news events)
Any service where DDoS protection is a concern

CDNs add complexity without benefit for:

Highly personalised responses that can't be shared across users (a social feed, an authenticated dashboard)
Low-latency real-time data that can't tolerate even a short TTL (live auction bids, financial prices)
Services with purely regional user bases where the origin is already geographically close to all users
Very small-scale services where the operational overhead of CDN configuration isn't justified

In the URL shortener: CDN is a near-perfect fit. Redirect responses are public, identical for all users requesting the same short code, and rarely change after creation. A 3600-second TTL on redirect responses means popular links are served almost entirely from edge cache globally. The origin only handles new short code creations and the small fraction of cache misses. The CDN absorbs the vast majority of redirect traffic, making the service globally fast with no additional application servers.

The tradeoffs

Cache coherence vs performance. Higher TTLs mean better performance and lower origin load — but longer windows where edge caches serve stale content after an origin update. Lower TTLs mean fresher content but higher origin load and more cache misses. The right answer is almost always to use long TTLs combined with an active purge strategy for content that changes: serve everything from cache until a change happens, then purge precisely.

Vendor lock-in. CDN-specific features — edge functions, cache tags, WAF rules — are not portable across providers. Deep investment in Cloudflare Workers or CloudFront Functions creates switching costs. For commodity caching use cases, this is manageable. For edge computing use cases, evaluate the lock-in carefully before building.

Cost at scale. CDN pricing is typically based on data transfer and requests. At very high traffic volumes, CDN costs can become significant. Teams sometimes discover that serving large media files through a CDN costs more than expected and optimise by checking which assets actually benefit from edge caching, for example leaving rarely requested files on object storage (S3, GCS) and reserving the CDN for popular content with high hit ratios.

Origin shield. When a CDN has many edge PoPs and a cache miss on each goes directly to origin, a popular piece of content experiencing cache misses simultaneously across hundreds of PoPs can create a thundering herd at origin. Origin shield — a single intermediate caching layer between edge nodes and origin — ensures that cache misses from all edge nodes funnel through one location, with only one request reaching the actual origin server. Most CDN providers offer this as a configuration option and it's worth enabling for high-traffic origins.
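The effect of an origin shield can be shown with a small simulation. This is a sequential toy model with hypothetical class names (`CountingOrigin`, `CacheLayer`): every PoP starts cold and requests the same URL once. Real traffic is concurrent, so a real shield also has to collapse simultaneous misses into one origin request, which this sketch does not model.

```python
from typing import Callable, Dict


class CountingOrigin:
    """Origin server that counts how many requests reach it."""

    def __init__(self) -> None:
        self.requests = 0

    def fetch(self, url: str) -> str:
        self.requests += 1
        return f"content for {url}"


class CacheLayer:
    """A cache (edge PoP or shield) in front of some upstream fetch function."""

    def __init__(self, upstream: Callable[[str], str]) -> None:
        self._upstream = upstream
        self._store: Dict[str, str] = {}

    def fetch(self, url: str) -> str:
        if url not in self._store:
            self._store[url] = self._upstream(url)
        return self._store[url]


def origin_hits(pop_count: int, use_shield: bool) -> int:
    """Origin requests when every cold PoP fetches the same URL once."""
    origin = CountingOrigin()
    upstream = CacheLayer(origin.fetch).fetch if use_shield else origin.fetch
    pops = [CacheLayer(upstream) for _ in range(pop_count)]
    for pop in pops:
        pop.fetch("/abc")
    return origin.requests
```

With 200 cold PoPs, `origin_hits(200, use_shield=False)` returns 200 and `origin_hits(200, use_shield=True)` returns 1: the shield turns a miss per PoP into a single miss for the whole network.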

The one thing to remember

A CDN's value is proportional to your cache hit ratio, and your cache hit ratio depends on how carefully you've designed your caching strategy. Adding a CDN in front of uncacheable responses gives you DDoS protection and TLS termination, but every request still pays the full trip to origin. Adding a CDN in front of well-designed cacheable responses with appropriate TTLs and a purge strategy gives you protection, fast connection setup, and low latency together. The CDN is the last mile; the content strategy is what makes it effective.

← Previous: Anycast Routing — DNS load balancing uses different IPs for different clients. Anycast goes further: the same IP address is announced from multiple locations simultaneously, and the network itself routes each client to the nearest one. It's the technique behind some of the fastest global infrastructure in the world.

→ Next: Networking & Protocols — Wrap-up — all eight networking concepts pulled together, showing how they connect across the stack and what they set up for the pillars ahead.
