The Strangler Fig: Replacing a Legacy System Without Burning It Down

Series: System Design · Architecture Patterns — Pillar 7 of 8
| # | Post | What it covers |
|---|---|---|
| 00 | Architecture Patterns: How Systems Are Structured | Twenty patterns covering monoliths, microservices, events, resilience, deployment, and data processing. How to structure systems that survive growth. |
| 01 | Monolithic Architecture: The Default That Gets Abandoned Too Early | Monoliths are fast to build and easy to operate. Learn when they're the right choice, when they break down, and how to know the difference. |
| 02 | Microservices: The Architecture You Earn, Not Choose | Microservices enable independent scaling and team autonomy — but at significant cost. Learn what you actually get, what you pay, and when it's worth it. |
| 03 | Serverless: Pay for What You Use, Not What You Provision | Serverless scales to zero and charges per invocation. Learn where it shines, where it fails, and how to design around cold starts and vendor lock-in. |
| 04 | Event-Driven Architecture: Decoupling Through Events | Event-driven systems communicate via events rather than direct calls. Learn how producers, consumers, and event brokers work — and the consistency tradeoffs involved. |
| 05 | Message Queues: Decoupling Produce from Consume | Message queues decouple producers and consumers, enable load levelling, and provide durability. Learn how they work and when to use Kafka vs SQS vs RabbitMQ. |
| 06 | Pub/Sub: Broadcasting Events to Multiple Consumers | Pub/sub decouples publishers from subscribers through topics. Learn how it differs from message queues and when to use Kafka, SNS, or Google Pub/Sub. |
| 07 | CQRS: When Reads and Writes Need Different Models | CQRS separates writes from reads so each can be optimised independently. Learn how it works, when it's worth the complexity, and when it isn't. |
| 08 | Event Sourcing: The Ledger, Not the Balance | Event sourcing stores state as a sequence of events. Learn how it works, what you get (audit log, time travel), and what it costs (complexity, schema evolution). |
| 09 | The Saga Pattern: Distributed Transactions Without Locks | The Saga pattern manages distributed transactions across services using compensating transactions. Learn choreography vs orchestration and when to use each. |
| 10 | The Outbox Pattern: Atomic Writes and Event Publishing | The Outbox pattern solves the dual-write problem — publishing an event and writing to a database atomically. Learn how it works using CDC or polling. |
| 11 | The Circuit Breaker: Stopping Cascading Failures | Circuit breakers prevent cascading failures by fast-failing calls to unhealthy dependencies. Learn the three states, how to configure them, and where to apply them. |
| 12 | The Bulkhead Pattern: Containing Failures Through Resource Isolation | Bulkheads isolate thread pools and connections per dependency so one failure can't exhaust resources needed by others. Learn how to apply them in practice. |
| 13 | The Sidecar Pattern: Cross-Cutting Concerns Without Code Changes | The sidecar pattern deploys a helper process alongside each service for logging, metrics, TLS, and service discovery — without modifying the service itself. |
| 14 | Service Mesh: A Programmable Network for Microservices | A service mesh handles service-to-service traffic, mTLS, circuit breaking, and observability via a fleet of sidecar proxies. Learn how it works and when to use it. |
| 15 | Service Discovery: Finding Services in a Dynamic Environment | Service discovery lets services find each other in dynamic environments. Learn client-side vs server-side discovery, health checks, and DNS vs registry approaches. |
| 16 | The Strangler Fig: Replacing a Legacy System Without Burning It Down ← you are here | The Strangler Fig replaces a legacy system incrementally by routing specific functionality to new implementations while the old system keeps running. |
| 17 | Backend for Frontend: One API Per Client Type | BFF creates dedicated API backends per client type. Learn why one general API struggles to serve mobile and web well, and how BFF solves it. |
| 18 | ETL Pipelines: Moving Data from Operations to Analytics | ETL moves data from operational systems into analytical stores. Learn how pipelines work, what ELT is, and how to design reliable data movement at scale. |
| 19 | Batch vs Stream Processing: How Fresh Do Your Answers Need to Be? | Batch processing accumulates data and then processes it in bulk; streaming processes each event as it arrives. Learn the tradeoffs and when each is right. |
| 20 | MapReduce: Processing Petabytes in Parallel | MapReduce processes massive datasets in parallel by splitting work into map and reduce phases. Learn how it works and why Spark has largely replaced it. |
| 21 | Architecture Patterns: Wrap-Up | A recap of all 20 architecture patterns across decomposition, async communication, data patterns, resilience, and data processing. How they connect. |
The problem
Your URL shortener's Ruby on Rails monolith handles redirects, link management, analytics, billing, and user accounts. It's technically sound but five years old: it was built before microservices, before the current team joined, and before you understood the domain properly. Modules are tangled, tests are brittle, deployments are slow.
The redirect engine needs to be rewritten in Go for performance. The analytics module needs to be extracted to scale independently. The billing logic needs better separation.
The obvious solution: rewrite everything from scratch. One team works on the new system for six months while the old one handles production traffic. When the new system is "ready," you flip the switch and migrate everything at once.
This plan fails more often than it succeeds. The new system accumulates scope creep. Edge cases that the old system handled implicitly are missed. The flip date keeps slipping. Meanwhile, the old system keeps receiving bug fixes that are never ported to the new one, so the two diverge. And when you finally flip, every hidden assumption in the old system surfaces simultaneously.
The Strangler Fig is the alternative: migrate incrementally, functionality by functionality, while both systems run concurrently.
The core idea
The Strangler Fig pattern (named after the strangler fig, a plant that grows around a host tree and eventually replaces it) migrates a legacy system by intercepting requests at a facade layer and routing specific requests to a new implementation, while all other requests continue to the legacy system. Over time, the new system handles more and more of the traffic; the legacy system handles less and less. Eventually, the legacy system handles nothing and can be decommissioned.
The analogy: a building renovation
You need to renovate an occupied office building — new electrical, new plumbing, new HVAC. You can't move everyone out and do a complete gut renovation — the business must continue operating.
Instead: renovate one floor at a time. Move the floor's occupants to temporary space, complete the renovation, move them back to the renovated floor. The rest of the building continues operating normally. Over time, every floor is renovated. The building is now modern without ever having been empty.
The Strangler Fig does the same: migrate one module at a time, route its traffic to the new implementation, leave everything else running. Over time, the entire system is new.
How it works
The facade / routing layer
A proxy or routing layer intercepts all incoming traffic. Initially, it routes everything to the legacy system. As new implementations are deployed, routing rules are updated to send specific paths to the new service.
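The routing decision described above can be sketched as a small rule table. This is a minimal illustration, not a real proxy: the `Facade` class and the upstream addresses are hypothetical, and a production facade would more likely be Nginx or an API gateway. It only shows the shape of the logic: an empty rule set sends everything to legacy, and each migrated path gets one rule.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical upstream addresses, for illustration only.
LEGACY = "http://legacy.internal:3000"
REDIRECTS_V2 = "http://redirects-v2.internal:8080"


@dataclass
class Facade:
    """Maps (method, path prefix) rules to upstreams; unmatched requests go to legacy."""

    rules: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_rule(self, method: str, prefix: str, upstream: str) -> None:
        self.rules.append((method.upper(), prefix, upstream))
        # Keep the longest prefixes first so the most specific rule wins.
        self.rules.sort(key=lambda rule: len(rule[1]), reverse=True)

    def upstream_for(self, method: str, path: str) -> str:
        for rule_method, prefix, upstream in self.rules:
            if rule_method == method.upper() and path.startswith(prefix):
                return upstream
        return LEGACY  # default: anything not yet migrated stays on the monolith


facade = Facade()
# With no rules, the facade is a pure pass-through to the legacy system.
print(facade.upstream_for("GET", "/r/abc123"))
# After the redirect engine is migrated, one rule moves that path.
facade.add_rule("GET", "/r/", REDIRECTS_V2)
print(facade.upstream_for("GET", "/r/abc123"))
print(facade.upstream_for("GET", "/billing/invoices"))
```

Defaulting to legacy is the important design choice here: a missing or mistyped rule degrades to the old behaviour rather than to an error.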
Migration steps
Step 1: Deploy the facade
Add a proxy in front of the legacy system. Initially, it passes all traffic through — zero behaviour change. This is the foundation; everything else builds on it.
Step 2: Identify and migrate the first module
Choose a module with clear boundaries and high value to migrate first. The redirect engine is the highest-traffic, simplest component — a good starting candidate.
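Write the new implementation and deploy it alongside the legacy system. Define the routing rule you intend to enable in the facade (GET /r/* → new Go service), but keep real traffic on the legacy system for now.
Run both in parallel first: shadow production traffic to the new service, meaning the facade sends it a copy of each request and discards its response while the user still receives the legacy response. Compare the two sets of responses to confirm the new service handles every edge case before switching.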
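Shadowing can be sketched as a wrapper around two handlers. This is a simplified, synchronous illustration under several assumptions: the `shadow` function, the `Handler` signature, and the mismatch log are all hypothetical, and a real facade would normally mirror requests asynchronously so the shadow call adds no latency. Any side effects in the shadowed service (click counting, for example) would also need to be disabled or isolated.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical handler shape: path -> (status code, Location header).
Handler = Callable[[str], Tuple[int, str]]


def shadow(legacy: Handler, candidate: Handler, mismatches: List[Dict]) -> Handler:
    """Serve from legacy; send the same request to the candidate and record any difference."""

    def handle(path: str) -> Tuple[int, str]:
        primary = legacy(path)
        try:
            shadowed = candidate(path)
            if shadowed != primary:
                mismatches.append(
                    {"path": path, "legacy": primary, "candidate": shadowed}
                )
        except Exception as exc:
            # A crash in the candidate must never affect the user-facing response.
            mismatches.append({"path": path, "legacy": primary, "error": repr(exc)})
        return primary  # the caller only ever sees the legacy answer

    return handle
```

An empty mismatch log over a representative period of traffic is the signal that the new service is ready for the switch in the next step.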
Step 3: Route traffic to the new implementation
Update the facade to serve real traffic from the new service. Monitor closely. Roll back immediately if problems emerge: the routing change is reversible, so a rollback is just another facade config update.
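One way to keep that rollback cheap is to treat each routing rule set as a version and make rollback a pointer move. The sketch below assumes a hypothetical in-memory `RoutingConfig`; a real facade would store this in its own config system, but the idea is the same.

```python
from typing import Dict, List


class RoutingConfig:
    """Keeps every published rule set so a rollback is a pointer move, not a redeploy."""

    def __init__(self) -> None:
        # Version 0 has no rules: everything goes to the legacy system.
        self._versions: List[Dict[str, str]] = [{}]
        self._active = 0

    def publish(self, rules: Dict[str, str]) -> int:
        """Activate a new rule set (path prefix -> upstream name); returns its version."""
        self._versions.append(dict(rules))
        self._active = len(self._versions) - 1
        return self._active

    def rollback(self) -> int:
        """Step back one version; version 0 is the floor."""
        if self._active > 0:
            self._active -= 1
        return self._active

    def target(self, path: str, default: str = "legacy") -> str:
        for prefix, upstream in self._versions[self._active].items():
            if path.startswith(prefix):
                return upstream
        return default


config = RoutingConfig()
config.publish({"/r/": "redirects-v2"})  # Step 3: real traffic moves
print(config.target("/r/abc123"))
config.rollback()                        # problems emerged: revert
print(config.target("/r/abc123"))
```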
Step 4: Deprecate and decommission the module from the legacy system
Once the new service is stable and has handled traffic for a validation period, remove the corresponding code from the legacy system. The legacy system shrinks.
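The "validation period" can be made explicit with a simple gate. The sketch below is one possible check, not a prescribed one: it assumes you record daily request counts for the legacy module, and the 14-day window is an arbitrary placeholder. The function name and data shape are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict


def safe_to_remove(
    legacy_hits_by_day: Dict[date, int], today: date, window_days: int = 14
) -> bool:
    """True only if the legacy module served zero requests on each of the last window_days days."""
    for offset in range(1, window_days + 1):
        day = today - timedelta(days=offset)
        if day not in legacy_hits_by_day:
            return False  # missing data: be conservative and keep the code
        if legacy_hits_by_day[day] > 0:
            return False  # something still depends on the legacy path
    return True
```

Treating missing data as "not safe" matters: a gap in metrics is more likely a monitoring problem than proof that nothing calls the old code.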
Repeat for each module.
The facade implementations
Nginx / reverse proxy: route by URL path or header. Simple, low overhead. Works for path-based decomposition.
API gateway (AWS API Gateway, Kong): route by path, method, header. Supports authentication, rate limiting, request transformation at the boundary.
Application-level facade: a thin service that makes routing decisions based on business logic (user segment, feature flag, gradual percentage rollout). More flexible but adds a maintenance burden.
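For the application-level option, a gradual percentage rollout is usually built on a stable hash so that each user consistently sees the same backend. The sketch below is a minimal version of that idea; `bucket`, `choose_backend`, and the `beta_users` override are hypothetical names, not part of any specific feature-flag product.

```python
import hashlib
from typing import FrozenSet


def bucket(user_id: str) -> int:
    """Stable bucket in [0, 100): the same user always lands in the same bucket."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def choose_backend(
    user_id: str, rollout_percent: int, beta_users: FrozenSet[str] = frozenset()
) -> str:
    """Return "new" or "legacy" for this user at the current rollout percentage."""
    if user_id in beta_users:
        return "new"  # explicit opt-in overrides the percentage
    return "new" if bucket(user_id) < rollout_percent else "legacy"
```

Because the bucket depends only on the user ID, raising the percentage from 10 to 50 keeps the original 10% on the new service and adds to them, and setting it to 0 is an immediate rollback.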
Tradeoffs
Running two systems simultaneously. During migration, both the legacy system and new implementations are running. You're maintaining two codebases, two deployment pipelines, and possibly two data models. This is temporary overhead with a defined endpoint (legacy decommission) — but it requires discipline to not let it drag on indefinitely.
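Data synchronisation. If the new service needs data from the legacy database (or vice versa), data must be synchronised during the migration window. Dual writes (writing to both systems) or event streams from the legacy system can handle this. Dual writes are not atomic, so one of the two writes can fail on its own; plan for detecting and repairing the resulting drift.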
Divergence risk. The legacy system may continue to receive bug fixes or feature additions while migration is in progress. If the new system isn't updated to match, they diverge. Freeze feature work on the legacy modules being migrated, and port any unavoidable bug fixes to the new implementation as well.
The "permanent half-migrated" failure mode. The Strangler Fig requires discipline to complete. Teams often migrate 70% of a monolith, declare it "mostly done," and stop — leaving a hybrid system that's harder to maintain than either the original monolith or a complete microservices system. Set and enforce a migration completion date.
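To make the data synchronisation tradeoff concrete, here is a rough sketch of a dual write that treats the legacy store as the source of truth. Everything in it is an assumption for illustration: the `DualWriter` class is hypothetical, both stores are modelled as dict-like objects, and the pending list lives in memory, whereas a real system would need it to be durable.

```python
from typing import Any, List, MutableMapping, Tuple


class DualWriter:
    """Writes to the legacy store first (source of truth), then best-effort to the new store."""

    def __init__(self, legacy_store: MutableMapping, new_store: MutableMapping) -> None:
        self.legacy_store = legacy_store
        self.new_store = new_store
        self.pending: List[Tuple[Any, Any]] = []  # writes the new store missed

    def write(self, key: Any, value: Any) -> None:
        # If this raises, nothing was written anywhere and the caller sees the error.
        self.legacy_store[key] = value
        try:
            self.new_store[key] = value
        except Exception:
            # The two stores now disagree; remember the write so it can be repaired.
            self.pending.append((key, value))

    def replay(self) -> int:
        """Retry missed writes using the current legacy value; returns how many remain."""
        still_pending = []
        for key, value in self.pending:
            try:
                self.new_store[key] = self.legacy_store.get(key, value)
            except Exception:
                still_pending.append((key, value))
        self.pending = still_pending
        return len(self.pending)
```

The replay step is what keeps this honest: without it, a single failed write leaves the two systems silently out of sync, which is the problem the Outbox pattern (post 10 in this series) addresses more robustly.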
The one thing to remember
The Strangler Fig replaces a legacy system incrementally by routing traffic to new implementations one module at a time, while the legacy system continues running. Each migration step is small, reversible (just update the routing rule), and independently deployable. The risk of any single step is low; the risk of a big-bang rewrite is existential. The cost is running two systems simultaneously and the discipline to complete the migration rather than leaving it permanently half-done.




