TCP vs UDP: Reliability vs Speed at the Transport Layer

| # | Post | What it covers |
|---|---|---|
| 00 | Networking & Protocols: How Bytes Actually Travel | Before you can design systems that scale, you need to understand how bytes actually travel. Eight concepts every backend engineer must know. |
| 01 | The OSI Model: The Map Every Engineer Needs | The OSI model isn't just interview theory: it's the map that tells you exactly where in the stack a network problem lives. Here's how to use it. |
| 02 | TCP vs UDP: Reliability vs Speed at the Transport Layer ← you are here | TCP guarantees delivery. UDP doesn't look back. Understanding why each exists, and when to reach for each, is fundamental to network design. |
| 03 | HTTP vs HTTPS: The Language of the Web and Its Secure Version | 301 Moved Permanently, 302 Found, 304 Not Modified |
| 04 | TLS/SSL: How HTTPS Actually Works Under the Hood | TLS is what puts the S in HTTPS. Here's how the handshake works, what a certificate actually contains, and why TLS 1.3 matters for performance. |
| 05 | DNS: The Phone Book That Runs the Internet | DNS is the phone book of the internet, and one of the most misunderstood layers in the stack. Here's how it works and how it fails. |
| 06 | DNS Load Balancing: Traffic Distribution at the Name Layer | DNS load balancing distributes traffic before a single packet reaches your servers. Here's how it works, where it excels, and where it falls short. |
| 07 | Anycast Routing: One Address, Everywhere at Once | One IP address, dozens of locations, zero client configuration. Anycast is how the fastest global infrastructure works. Here's the mechanism behind it. |
| 08 | CDN: Moving Content Closer to the People Who Need It | A CDN isn't just a cache in front of your server. Here's how content delivery networks work, when they help, and when they add complexity for nothing. |
| 09 | Networking & Protocols: Wrap-Up | A complete recap of the eight core networking concepts (OSI, TCP, HTTP, TLS, DNS, CDN) and how they connect into a complete picture. |
Head-of-line blocking: TCP's hidden cost
The live scores problem from the opening is a specific instance of a broader TCP issue called head-of-line (HOL) blocking. Because TCP guarantees ordered delivery, a single lost segment stalls everything behind it: later segments that have already arrived sit in the receive buffer until the gap is retransmitted and filled, regardless of whether the application could use them independently.
For a file download, this is correct behaviour — you need all the bytes in order. For a stream of independent messages (score updates, sensor readings, video frames), it's actively harmful: the latest, most useful data is held back waiting for an old packet that is now irrelevant.
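To make the cost concrete, here is a minimal simulation, not a real TCP stack. It assumes five updates numbered 0 to 4, with update 1 lost and its retransmission arriving at t=200 ms; the function names and timings are invented for illustration.

```python
def deliver_in_order(arrivals):
    """TCP-like receiver: hand data to the application strictly in sequence order.

    arrivals: list of (arrival_time_ms, seq) tuples.
    Returns a list of (seq, delivery_time_ms).
    """
    buffered = set()
    next_seq = 0
    delivered = []
    for t, seq in sorted(arrivals):
        buffered.add(seq)
        # Release every segment that is now contiguous with what was delivered.
        while next_seq in buffered:
            buffered.remove(next_seq)
            delivered.append((next_seq, t))
            next_seq += 1
    return delivered


def deliver_on_arrival(arrivals):
    """UDP-like receiver: hand each datagram to the application as it arrives."""
    return [(seq, t) for t, seq in sorted(arrivals)]


# Hypothetical trace: update 1 is lost, its retransmission shows up at t=200 ms.
arrivals = [(0, 0), (200, 1), (20, 2), (30, 3), (40, 4)]

ordered = deliver_in_order(arrivals)
unordered = deliver_on_arrival(arrivals)
print("in order:  ", ordered)
print("on arrival:", unordered)
```

With ordered delivery, updates 2, 3, and 4 all reach the application at t=200 ms even though they arrived by t=40 ms. With delivery on arrival, the newest update (4) is usable at t=40 ms and the stale update 1 turns up last.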
HTTP/2 over TCP suffers from this at the transport layer — even though HTTP/2 multiplexes multiple streams within a single connection, a lost TCP packet blocks all streams simultaneously. HTTP/3 solves this by moving to QUIC, a protocol built on UDP that implements its own reliability per stream, so a lost packet in one stream doesn't block others. QUIC is one of the clearest modern examples of "UDP with custom reliability logic" done right.
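The difference between one shared sequence space and per-stream ordering can be sketched with a toy model. This is not QUIC's actual packet format (real QUIC retransmits data under new packet numbers); the dictionary fields `conn_seq` and `stream_seq` and the timings are assumptions made up for the example.

```python
from collections import defaultdict


def release_in_order(events):
    """events: (arrival_ms, seq, label). Release labels strictly by seq, starting at 0."""
    waiting, next_seq, released = {}, 0, []
    for t, seq, label in sorted(events):
        waiting[seq] = label
        while next_seq in waiting:
            released.append((waiting.pop(next_seq), t))
            next_seq += 1
    return released


def shared_ordering(packets):
    """One sequence space for the whole connection (HTTP/2 over TCP style)."""
    return release_in_order([(p["t"], p["conn_seq"], p["stream"]) for p in packets])


def per_stream_ordering(packets):
    """Each stream is ordered on its own (QUIC style)."""
    by_stream = defaultdict(list)
    for p in packets:
        by_stream[p["stream"]].append((p["t"], p["stream_seq"], p["stream"]))
    released = []
    for stream in sorted(by_stream):
        released.extend(release_in_order(by_stream[stream]))
    return released


# Hypothetical trace: the first packet of stream B is lost and arrives late.
packets = [
    {"t": 0,   "conn_seq": 0, "stream": "A", "stream_seq": 0},
    {"t": 200, "conn_seq": 1, "stream": "B", "stream_seq": 0},  # retransmitted
    {"t": 20,  "conn_seq": 2, "stream": "A", "stream_seq": 1},
    {"t": 30,  "conn_seq": 3, "stream": "B", "stream_seq": 1},
]

shared = shared_ordering(packets)
independent = per_stream_ordering(packets)
print("shared:    ", shared)
print("per stream:", independent)
```

In the shared model, stream A's second chunk is held until t=200 ms because of a loss that belongs to stream B. With per-stream ordering, stream A finishes at t=20 ms and only stream B waits.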
When each is the right choice
Reach for TCP when:
- Data integrity is non-negotiable — file transfers, database queries, email, API calls
- Order matters — sequential data processing, streaming protocols where gaps corrupt state
- You're building on existing TCP-based protocols (HTTP, WebSockets, SSH, SMTP)
- Your application has no tolerance for missing data
Reach for UDP when:
- Latency matters more than completeness — live video, VoIP, online gaming, real-time telemetry
- You're building a request/response protocol with your own timeout/retry logic — DNS uses UDP for exactly this reason
- You're broadcasting to multiple receivers — UDP supports multicast natively; TCP does not
- You need to implement custom reliability semantics that TCP's one-size-fits-all approach doesn't serve — QUIC, WebRTC's data channel, and many game networking protocols do this
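The "own timeout/retry logic" pattern can be sketched over the loopback interface. This is a minimal illustration, not how any particular DNS resolver is implemented: `flaky_echo_server`, `udp_request`, the 0.2 s timeout, and the retry count are all hypothetical choices, and the server deliberately ignores the first datagram to simulate loss.

```python
import socket
import threading


def flaky_echo_server(sock, drop_first, serve):
    """Test helper: ignore the first `drop_first` datagrams, then echo `serve` of them upper-cased."""
    with sock:
        for n in range(drop_first + serve):
            data, addr = sock.recvfrom(2048)
            if n >= drop_first:
                sock.sendto(data.upper(), addr)


def udp_request(server_addr, payload, timeout=0.2, max_attempts=4):
    """Send one datagram and wait for a reply, resending on timeout.

    Returns (reply_bytes, attempts_used). Raises TimeoutError if every attempt times out.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout)
        for attempt in range(1, max_attempts + 1):
            client.sendto(payload, server_addr)
            try:
                reply, _ = client.recvfrom(2048)
                return reply, attempt
            except socket.timeout:
                continue  # no reply in time: assume loss and resend
    raise TimeoutError(f"no reply after {max_attempts} attempts")


# Bind before starting the client so the first datagram is queued, then dropped by the server.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
threading.Thread(
    target=flaky_echo_server, args=(server, 1, 1), daemon=True
).start()

reply, attempts = udp_request(server.getsockname(), b"ping")
print(reply, "after", attempts, "attempts")
```

A real protocol would also tag each request with an identifier so that a late reply to an earlier attempt can be matched or discarded; DNS does this with a 16-bit ID in the message header.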
In the URL shortener: the redirect service uses TCP (via HTTPS). Every redirect is a discrete HTTP request/response: the full payload must arrive intact and in order, and the overhead of a TCP connection is acceptable for a request that happens once per user visit. DNS lookups that resolve sho.rt to an IP address typically use UDP: a small, self-contained query and response where the overhead of a TCP handshake would be disproportionate and where the client can simply retry on timeout. DNS falls back to TCP when a response is too large for a single datagram.
The tradeoffs
TCP's reliability has a latency floor. Before any application data flows, a TCP connection spends one round trip on the handshake, and after that the sender can only keep one window of unacknowledged data in flight per round trip. On high-latency connections (cross-continental, mobile networks), this floor is significant. TLS on top of TCP adds at least one more round trip: one with TLS 1.3, two with TLS 1.2. This is one reason HTTP/3 and QUIC are built on UDP: QUIC combines the transport and cryptographic handshakes into a single round trip, and can send data immediately when resuming a previous connection.
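A back-of-the-envelope calculation shows how the floor scales with round-trip time. The sketch below counts only handshake round trips plus one request/response exchange; it ignores server processing time, packet loss, and optimisations such as TCP Fast Open. The 150 ms RTT and the function name are illustrative assumptions.

```python
# Round trips that must complete before the first request byte can be sent.
HANDSHAKE_ROUND_TRIPS = {
    "TCP only": 1,
    "TCP + TLS 1.2": 1 + 2,        # TCP handshake, then a two-round-trip TLS handshake
    "TCP + TLS 1.3": 1 + 1,        # TCP handshake, then a one-round-trip TLS handshake
    "QUIC (first connection)": 1,  # transport and crypto handshakes combined
    "QUIC (0-RTT resumption)": 0,  # request can ride along with the first packet
}


def time_to_first_response_ms(rtt_ms, setup):
    """Handshake round trips plus one round trip for the request and its response."""
    return (HANDSHAKE_ROUND_TRIPS[setup] + 1) * rtt_ms


rtt_ms = 150  # assumed cross-continental round-trip time
floor = {setup: time_to_first_response_ms(rtt_ms, setup) for setup in HANDSHAKE_ROUND_TRIPS}
for setup, ms in floor.items():
    print(f"{setup:<26} {ms} ms")
```

At a 150 ms RTT, the gap between TCP with TLS 1.2 (600 ms) and a resumed QUIC connection (150 ms) is three full round trips before the first response arrives.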
UDP's freedom requires discipline. Applications that use UDP and need any form of reliability must implement it themselves. Getting this right — handling loss, reordering, duplication, and congestion — is genuinely hard. Most applications that reach for UDP without careful design either re-implement TCP poorly or create protocols that flood networks under load.
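As a small taste of that discipline, here is a receiver-side sketch for a "latest value wins" feed such as score updates. It assumes the sender stamps each datagram with an increasing sequence number; the class name and counters are hypothetical. It covers duplication, reordering, and loss detection only, and says nothing about congestion control, which is the harder part.

```python
class LatestWinsReceiver:
    """Keep only the newest update and record what the network did to the rest."""

    def __init__(self):
        self.highest_seq = -1
        self.latest = None
        self.seen = set()  # unbounded here; a real receiver would use a sliding window
        self.duplicates = 0
        self.stale = 0

    def on_datagram(self, seq, payload):
        """Return True if the payload was accepted as the new latest value."""
        if seq in self.seen:
            self.duplicates += 1  # the network delivered this datagram twice
            return False
        self.seen.add(seq)
        if seq < self.highest_seq:
            self.stale += 1       # reordered: a newer update already arrived
            return False
        self.highest_seq = seq
        self.latest = payload
        return True

    def missing(self):
        """Sequence numbers below the highest seen that never arrived (so far)."""
        return sorted(set(range(self.highest_seq + 1)) - self.seen)


receiver = LatestWinsReceiver()
# Hypothetical arrival order: 3 is duplicated, 2 arrives late, 4 never arrives.
for seq in [0, 1, 3, 3, 2, 5]:
    receiver.on_datagram(seq, f"score update {seq}")

print(receiver.latest, receiver.duplicates, receiver.stale, receiver.missing())
```

The application always sees the newest update, and the counters make loss and reordering visible instead of silently ignoring them.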
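Port exhaustion is a TCP concern at scale. A TCP connection is identified by its source IP, source port, destination IP, and destination port. A listening server reuses one port for every connection it accepts, so the pressure falls on whichever side opens connections: each outbound connection needs its own ephemeral source port. A NAT gateway, proxy, or load balancer that opens many connections from one IP address to the same destination address and port has at most 65,535 source ports to draw from (the theoretical maximum; the usable ephemeral range is smaller), and recently closed connections hold their port in TIME_WAIT for a while, so it can run out under high load. UDP, being connectionless, doesn't have this constraint in the same way: there is no TIME_WAIT state, although NAT devices still track a port mapping per UDP flow.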
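The arithmetic behind port exhaustion can be sketched for the side that opens connections, since each outbound connection from one source IP to one destination IP and port needs its own source port. The defaults below are assumptions: the usual Linux ephemeral range (32768 to 60999) and a 60-second TIME_WAIT. Other systems and tuned hosts will differ, and the function name is made up for the example.

```python
def ephemeral_port_budget(low=32768, high=60999, time_wait_s=60):
    """Rough limits for one source IP talking to one destination IP and port.

    Returns (max_concurrent_connections, sustainable_new_connections_per_second).
    The rate assumes every closed connection holds its port for `time_wait_s`.
    """
    ports = high - low + 1
    return ports, ports / time_wait_s


max_concurrent, new_per_second = ephemeral_port_budget()
print(f"{max_concurrent} concurrent connections, about {new_per_second:.0f} new connections/s")
```

Under these assumptions the budget is 28,232 concurrent connections, or roughly 470 short-lived connections per second, per destination. That is why connection pooling and keep-alive matter on proxies and NAT gateways.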
The one thing to remember
TCP and UDP aren't better or worse — they're different contracts. TCP says: "I will get this to you, intact and in order, whatever it takes." UDP says: "I'll send this now; what you do with what arrives is up to you." The right choice depends entirely on which contract your application needs. If missing data corrupts your application's state, use TCP. If missing data is preferable to delayed data, use UDP — and be prepared to handle what the network doesn't guarantee.
→ Next: HTTP vs HTTPS — TCP is the transport; HTTP is the language your application speaks on top of it. The next post covers how HTTP works, what HTTPS adds, and why the distinction matters for every API you build or consume.




