DNS Load Balancing: Traffic Distribution at the Name Layer

Series: System Design · Networking & Protocols — Pillar 2 of 8
| # | Post | What it covers |
|---|---|---|
| 00 | Networking & Protocols: How Bytes Actually Travel | Before you can design systems that scale, you need to understand how bytes actually travel. Eight concepts every backend engineer must know. |
| 01 | The OSI Model: The Map Every Engineer Needs | The OSI model isn't just interview theory; it's the map that tells you exactly where in the stack a network problem lives. Here's how to use it. |
| 02 | TCP vs UDP: Reliability vs Speed at the Transport Layer | TCP guarantees delivery. UDP doesn't look back. Understanding why each exists, and when to reach for each, is fundamental to network design. |
| 03 | HTTP vs HTTPS: The Language of the Web and Its Secure Version | 301 Moved Permanently, 302 Found, 304 Not Modified |
| 04 | TLS/SSL: How HTTPS Actually Works Under the Hood | TLS is what puts the S in HTTPS. Here's how the handshake works, what a certificate actually contains, and why TLS 1.3 matters for performance. |
| 05 | DNS: The Phone Book That Runs the Internet | DNS is the phone book of the internet, and one of the most misunderstood layers in the stack. Here's how it works and how it fails. |
| 06 | DNS Load Balancing: Traffic Distribution at the Name Layer ← you are here | DNS load balancing distributes traffic before a single packet reaches your servers. Here's how it works, where it excels, and where it falls short. |
| 07 | Anycast Routing: One Address, Everywhere at Once | One IP address, dozens of locations, zero client configuration. Anycast is how the fastest global infrastructure works. Here's the mechanism behind it. |
| 08 | CDN: Moving Content Closer to the People Who Need It | A CDN isn't just a cache in front of your server. Here's how content delivery networks work, when they help, and when they add complexity for nothing. |
| 09 | Networking & Protocols: Wrap-Up | A complete recap of the eight core networking concepts (OSI, TCP, HTTP, TLS, DNS, CDN) and how they connect into a complete picture. |
A client that received an IP address before it was removed from DNS will continue using that IP until the connection drops or the client re-resolves. DNS cannot recall an IP address it has already handed out.
DNS failover
A specific application of health checking: a primary record is returned under normal operation; if the primary fails health checks, a secondary record is returned automatically.
Primary: api.example.com → 203.0.113.1 (primary data centre)
Failover: api.example.com → 203.0.113.99 (DR site, activated on primary failure)
This is commonly used for disaster recovery — the DR site runs cold or warm, receives no traffic under normal operation, and is activated automatically if the primary becomes unreachable. Route 53's failover routing policy implements this pattern directly.
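As a rough illustration of the failover decision, here is a minimal Python sketch. It is not how Route 53 or any particular provider implements the policy: the `FailoverRecord` class, the three-failure threshold, and the immediate failback on the first passing check are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class FailoverRecord:
    """Hypothetical model of a primary/secondary DNS failover record."""
    name: str
    primary_ip: str
    secondary_ip: str
    failure_threshold: int = 3   # assumed: consecutive failed checks before failover
    consecutive_failures: int = 0

    def record_check(self, passed: bool) -> None:
        """Feed in the result of one health check against the primary."""
        self.consecutive_failures = 0 if passed else self.consecutive_failures + 1

    def answer(self) -> str:
        """Return the IP that would be handed out for the next query."""
        if self.consecutive_failures >= self.failure_threshold:
            return self.secondary_ip
        return self.primary_ip


api = FailoverRecord("api.example.com", "203.0.113.1", "203.0.113.99")
print(api.answer())              # primary while checks pass
for _ in range(3):
    api.record_check(passed=False)
print(api.answer())              # DR site after three consecutive failures
api.record_check(passed=True)
print(api.answer())              # primary again once a check passes
```

The sketch only decides what new queries receive. Clients that already cached the primary's IP keep using it until their cached answer expires.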
In the URL shortener: GeoDNS directs users to their nearest regional server for low-latency redirects. Health checking removes a region from DNS responses if its servers fail. Weighted DNS enables gradual rollout of application changes across regions. All three work together, layered on top of the DNS resolution mechanism from the previous post.
The fundamental limitations of DNS load balancing
Understanding where DNS load balancing falls short is as important as knowing where it works:
No session affinity. DNS returns an IP. The client connects to that IP for some period, then may re-resolve and get a different IP. There is no mechanism at the DNS layer to ensure a returning client always reaches the same server. For stateless services (like our URL shortener) this is irrelevant. For stateful applications, it's a meaningful constraint.
Caching undermines dynamic distribution. DNS load balancing depends on clients frequently re-resolving to pick up changes. Caching — both at the OS level and the recursive resolver level — means many clients hold the same IP for minutes or hours. The actual traffic distribution across servers often doesn't reflect the DNS records as closely as theory suggests.
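To make the caching effect concrete, here is a small deterministic sketch in Python. It assumes a simplified model: the authoritative server rotates evenly across three IPs, each recursive resolver looks the name up once per TTL window, and every client behind that resolver shares the cached answer. The resolver names and client counts are invented for illustration.

```python
from itertools import cycle


def traffic_per_server(servers, clients_per_resolver):
    """Count clients landing on each server during one TTL window."""
    rotation = cycle(servers)              # even round-robin at the DNS layer
    load = {server: 0 for server in servers}
    for clients in clients_per_resolver.values():
        cached_ip = next(rotation)         # one lookup per resolver per TTL window
        load[cached_ip] += clients         # every client behind it reuses that answer
    return load


servers = ["203.0.113.1", "203.0.113.2", "203.0.113.3"]
resolvers = {"large_isp": 9000, "mobile_carrier": 800, "office": 200}  # hypothetical

load = traffic_per_server(servers, resolvers)
total = sum(load.values())
for server, clients in load.items():
    print(f"{server}: {clients} clients ({clients / total:.0%})")
```

DNS handed out each IP exactly once, yet under these assumptions one server receives 90% of the clients. The rotation is even across queries, not across users.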
No connection-level awareness. A load balancer can see that server A has 500 active connections and server B has 50, and route the next request to server B. DNS cannot. It distributes queries, not connections — and the relationship between DNS queries and actual server load is loose at best.
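The contrast can be sketched in a few lines of Python. Both functions are illustrative stand-ins rather than any product's algorithm: `least_connections` represents a load balancer that can see live connection counts, and `dns_round_robin` represents a DNS server that only knows which query number it is answering.

```python
def least_connections(active_connections: dict[str, int]) -> str:
    """Load balancer view: pick the server with the fewest live connections."""
    return min(active_connections, key=active_connections.get)


def dns_round_robin(servers: list[str], query_number: int) -> str:
    """DNS view: rotate through the records with no knowledge of server load."""
    return servers[query_number % len(servers)]


active = {"server_a": 500, "server_b": 50}

print(least_connections(active))
print([dns_round_robin(list(active), n) for n in range(4)])
```

The load balancer keeps choosing server_b until the counts even out. The DNS rotation sends every other query to server_a even though it is already carrying ten times the connections.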
TTL-bound failover. Even with aggressive health checking, DNS failover has a floor determined by TTL. A server can fail, be detected as unhealthy within 10 seconds, and be removed from DNS, but clients that cached the IP will continue sending traffic to it for up to the TTL duration. With a 60-second TTL, that is up to 60 seconds of failed requests after the record is removed, on top of the 10 seconds it took to detect the failure, and longer for any resolver or client that ignores the TTL.
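The arithmetic is simple enough to write down. The helper below is a back-of-the-envelope sketch, not a measurement: it assumes a client cached the record just before it was removed and that every resolver honours the TTL, so the result is a rough upper bound for well-behaved clients. The function name is made up for this example.

```python
def worst_case_stale_seconds(detection_seconds: float, ttl_seconds: float) -> float:
    """Upper bound on how long a TTL-respecting client may keep hitting a dead server.

    Assumes the client cached the record just before it was removed from DNS.
    """
    return detection_seconds + ttl_seconds


# Detection time of 10 seconds, as in the example above, across a few TTL choices.
for ttl in (30, 60, 300):
    window = worst_case_stale_seconds(10, ttl)
    print(f"TTL {ttl:>3}s: up to {window}s of failed requests")
```

Lowering the TTL shrinks the window but increases query volume against the DNS provider, which is the trade-off behind most TTL choices.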
DNS load balancing vs application load balancing
These aren't alternatives — they operate at different layers and serve different purposes:
User
│
▼
DNS Resolution ← DNS load balancing operates here
│ (geographic routing, round-robin, weighted)
▼
Load Balancer ← Application load balancing operates here
│ (health checks, session affinity, algorithms)
├──► Server A
├──► Server B
└──► Server C
DNS load balancing directs clients to the right load balancer or cluster — handling geographic distribution and coarse-grained traffic shaping. Application load balancing distributes requests across individual servers within a cluster — handling fine-grained routing, health-aware distribution, and connection management.
A complete architecture uses both: DNS routes a European user to the European cluster, then an application load balancer within that cluster distributes the request across individual servers.
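The two layers can be sketched together in Python. Everything named here is hypothetical: the region codes, the load balancer IPs, the server names, and the plain round-robin inside the cluster are stand-ins chosen to keep the example short.

```python
from itertools import cycle

# Layer 1: DNS maps a client's region to that region's load balancer (hypothetical IPs).
LOAD_BALANCER_BY_REGION = {"eu": "198.51.100.10", "us": "198.51.100.20"}


def dns_resolve(client_region: str, default_region: str = "us") -> str:
    """Coarse-grained step: choose a cluster, not a server."""
    return LOAD_BALANCER_BY_REGION.get(
        client_region, LOAD_BALANCER_BY_REGION[default_region]
    )


class ClusterLoadBalancer:
    """Layer 2: distribute each request across servers inside one cluster."""

    def __init__(self, servers):
        self._rotation = cycle(servers)   # round-robin stands in for a real algorithm

    def route(self) -> str:
        return next(self._rotation)


clusters = {
    "198.51.100.10": ClusterLoadBalancer(["eu-a", "eu-b", "eu-c"]),
    "198.51.100.20": ClusterLoadBalancer(["us-a", "us-b"]),
}

lb_ip = dns_resolve("eu")   # resolved once, then cached by the client
print(lb_ip, [clusters[lb_ip].route() for _ in range(4)])
```

The DNS decision is made once per resolution and then cached, while the load balancer decision is made on every request. That split is why fine-grained behaviour belongs in the second layer.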
The tradeoffs
Simplicity vs sophistication. DNS load balancing requires no additional infrastructure — just DNS records and optionally health checks on a managed provider. Application load balancers require provisioning, configuration, and maintenance. For small teams or early-stage services, DNS load balancing delivers geographic distribution with minimal operational overhead.
Speed of change. DNS load balancing changes are bounded by TTL. Application load balancer changes typically take effect within seconds, with no client-side cache to wait out. For anything that needs rapid response, such as circuit breaking or immediate failover during an incident, DNS alone is too slow.
Cost. DNS queries are cheap. Load balancers are typically billed for the hours they run plus the connections or data they process. For pure geographic routing between regions, DNS load balancing does the same job at a fraction of the cost of running a global load balancer in every region.
The one thing to remember
DNS load balancing distributes traffic before it reaches your infrastructure, which makes it cheap, scalable, and geographically powerful, but also slow to react and blind to per-server load. Use it for geographic routing and coarse traffic shaping. Use an application load balancer for everything requiring per-request awareness, fast failover, or session management. The two layers complement each other; neither replaces the other.
← Previous: DNS — TLS authenticates the server once you've connected to it. But how does your client find the right server's IP address in the first place? DNS is the system that answers that question — and its failure modes are some of the most far-reaching in all of infrastructure.
→ Next: Anycast Routing — DNS load balancing uses different IPs for different clients. Anycast goes further: the same IP address is announced from multiple locations simultaneously, and the network itself routes each client to the nearest one. It's the technique behind some of the fastest global infrastructure in the world.




