
Why We Stopped Fighting the Monolith and Started Strangling It

architecture · microservices · dotnet · devops

The monolith was 2 million lines of C# code. It had been growing for 15 years. Deployments happened quarterly because nobody trusted the test suite (4 hours to run, flaky results). New developers needed 3-4 weeks of onboarding just to be productive. And every time someone changed the order processing module, something broke in inventory management for reasons nobody could fully explain.

I’m not going to pretend we had a brilliant plan from day one. We tried to rewrite it once — that lasted about two months before everyone agreed it was a terrible idea. The system was generating revenue. We couldn’t just turn it off and rebuild.

The Strangler Fig Pattern, For Real

Everyone talks about the strangler pattern in blog posts and conference talks. Here’s what it actually looks like when you’re doing it with 12 developers, a live production system, and a client who wants new features delivered while you’re migrating.

Step 1: Figure out the boundaries. We ran DDD workshops with the team — two full days of event storming sessions where we mapped out every domain concept and how they interacted. This was messy. Developers who’d worked on the monolith for years disagreed about where the boundaries should be. Good. That disagreement is where the useful information is.

We ended up with 7 bounded contexts: Orders, Inventory, Customers, Payments, Catalog, Shipping, and Reporting.

Step 2: The API gateway trick. We put Azure API Management in front of the monolith. All traffic goes through the gateway. At first it just proxied everything straight through to the monolith, but from that point on we could redirect individual routes to new services one by one, without the client-facing API changing at all.
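
We configured this in APIM policies rather than in application code, so there is nothing of ours to paste here. To show the mechanism, here is a minimal sketch of the same idea using YARP, Microsoft's reverse-proxy library for ASP.NET Core; the route paths and service URLs are invented. One route peels off to the new Catalog service, everything else still falls through to the monolith.

    // Sketch only: we used Azure API Management, not YARP. Same mechanism, though.
    using Yarp.ReverseProxy.Configuration;

    var builder = WebApplication.CreateBuilder(args);

    var routes = new[]
    {
        // The one extracted domain gets its own backend...
        new RouteConfig
        {
            RouteId = "catalog",
            ClusterId = "catalog-service",
            Match = new RouteMatch { Path = "/api/catalog/{**rest}" }
        },
        // ...everything else still falls through to the monolith.
        new RouteConfig
        {
            RouteId = "monolith",
            ClusterId = "monolith",
            Match = new RouteMatch { Path = "/{**rest}" }
        }
    };

    var clusters = new[]
    {
        new ClusterConfig
        {
            ClusterId = "catalog-service",
            Destinations = new Dictionary<string, DestinationConfig>
            {
                ["d1"] = new DestinationConfig { Address = "http://catalog-service/" }
            }
        },
        new ClusterConfig
        {
            ClusterId = "monolith",
            Destinations = new Dictionary<string, DestinationConfig>
            {
                ["d1"] = new DestinationConfig { Address = "http://legacy-monolith/" }
            }
        }
    };

    builder.Services.AddReverseProxy().LoadFromMemory(routes, clusters);

    var app = builder.Build();
    app.MapReverseProxy();
    app.Run();

Migrating a domain then becomes a routing change at the gateway: add a route, point it at the new service, and the monolith never notices.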

Step 3: Start with the boring stuff. Everyone wanted to extract the Orders service first because it was the most complex. I pushed back. We started with Catalog — the simplest domain, lowest risk. The goal of the first extraction wasn’t to deliver value. It was to prove the pattern works and build the team’s confidence.

It took us 3 weeks to extract Catalog. We found all the problems: shared database dependencies, implicit coupling through stored procedures, a dozen places where catalog data was being accessed directly instead of through an API. Better to find those problems on the low-risk service.
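
To make "accessed directly instead of through an API" concrete, here is a hypothetical before/after for one of those spots: monolith code that used to run SQL against the catalog tables now goes through a typed HttpClient pointed at the new Catalog service. The class name, endpoint, and DTO are invented for illustration.

    using System.Net;
    using System.Net.Http.Json;

    public record CatalogItem(string Sku, string Name, decimal Price);

    public class CatalogClient
    {
        private readonly HttpClient _http;

        public CatalogClient(HttpClient http) => _http = http;

        // Replaces direct reads like:
        //   SELECT Sku, Name, Price FROM Catalog.Items WHERE Sku = @sku
        public async Task<CatalogItem?> GetItemAsync(string sku, CancellationToken ct = default)
        {
            var response = await _http.GetAsync($"/api/catalog/items/{Uri.EscapeDataString(sku)}", ct);
            if (response.StatusCode == HttpStatusCode.NotFound)
                return null;

            response.EnsureSuccessStatusCode();
            return await response.Content.ReadFromJsonAsync<CatalogItem>(cancellationToken: ct);
        }
    }

    // Registered once in the monolith's composition root, e.g.:
    //   services.AddHttpClient<CatalogClient>(c => c.BaseAddress = new Uri("http://catalog-service/"));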

The People Problem Nobody Warns You About

Three developers had been working on the monolith for 5+ years. Their entire expertise was navigating that codebase. The microservices migration was, implicitly, a threat to their value.

I split the team into three squads, each owning specific domains, and deliberately put the monolith experts across different squads. Their knowledge was essential for understanding the existing behavior we needed to preserve. Once they saw themselves as the experts who made the migration possible (not the old guard being left behind), everything went much more smoothly.

What Went Wrong

The database. We planned to give each service its own database from the start. In practice, we ended up with a shared database phase that lasted way longer than intended — about 8 months. Splitting the data was 10x harder than splitting the code because of foreign key relationships and reporting queries that joined across domains.
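
For illustration, here is one way a cross-domain foreign key gets broken up during that kind of split (a sketch with invented names, using EF Core, not our exact code): instead of joining into the Customers database, the Orders service keeps a small local snapshot of the customer fields it needs, updated from Customer events, and its queries join against that local copy.

    using System.ComponentModel.DataAnnotations;
    using Microsoft.EntityFrameworkCore;

    // Event published by the Customers service (shape is hypothetical).
    public record CustomerUpdated(Guid CustomerId, string Name, string Email);

    // Local, deliberately denormalized copy living in the Orders database.
    public class CustomerSnapshot
    {
        [Key] public Guid CustomerId { get; set; }
        public string Name { get; set; } = "";
        public string Email { get; set; } = "";
    }

    public class OrdersDbContext : DbContext
    {
        public OrdersDbContext(DbContextOptions<OrdersDbContext> options) : base(options) { }
        public DbSet<CustomerSnapshot> CustomerSnapshots => Set<CustomerSnapshot>();
    }

    // Runs whenever a CustomerUpdated event arrives; upserts the local copy so
    // order and reporting queries only ever join inside the Orders database.
    public class CustomerSnapshotProjection
    {
        private readonly OrdersDbContext _db;
        public CustomerSnapshotProjection(OrdersDbContext db) => _db = db;

        public async Task HandleAsync(CustomerUpdated evt, CancellationToken ct)
        {
            var snapshot = await _db.CustomerSnapshots.FindAsync(new object[] { evt.CustomerId }, ct);
            if (snapshot is null)
            {
                snapshot = new CustomerSnapshot { CustomerId = evt.CustomerId };
                _db.CustomerSnapshots.Add(snapshot);
            }
            snapshot.Name = evt.Name;
            snapshot.Email = evt.Email;
            await _db.SaveChangesAsync(ct);
        }
    }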

Integration testing. Our initial plan was contract-based testing between services. It was insufficient. We added end-to-end integration tests, which everyone hates, but we needed them. The RabbitMQ messaging between services was a source of subtle bugs — messages arriving in unexpected order, handlers failing silently.
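
The "failing silently" part is worth a sketch: the consumer shape that avoids it uses explicit acks, a log line on every failure, and a deduplication check so redelivered or reordered messages don't get applied twice. This is illustrative only (RabbitMQ.Client v6-style API); the queue name is invented and the in-memory dedupe set stands in for a durable store.

    using System.Text;
    using RabbitMQ.Client;
    using RabbitMQ.Client.Events;

    public class OrderPlacedConsumer
    {
        private readonly IModel _channel;
        private readonly HashSet<string> _processedMessageIds = new(); // stand-in for a durable store

        public OrderPlacedConsumer(IModel channel) => _channel = channel;

        public void Start()
        {
            var consumer = new EventingBasicConsumer(_channel);
            consumer.Received += (_, ea) =>
            {
                var messageId = ea.BasicProperties.MessageId ?? ea.DeliveryTag.ToString();
                try
                {
                    if (!_processedMessageIds.Add(messageId))
                    {
                        // Duplicate redelivery: acknowledge and move on.
                        _channel.BasicAck(ea.DeliveryTag, multiple: false);
                        return;
                    }

                    var json = Encoding.UTF8.GetString(ea.Body.ToArray());
                    Handle(json); // actual domain logic
                    _channel.BasicAck(ea.DeliveryTag, multiple: false);
                }
                catch (Exception ex)
                {
                    // Never swallow: log, then dead-letter instead of requeueing forever.
                    Console.Error.WriteLine($"OrderPlaced handler failed for {messageId}: {ex}");
                    _channel.BasicNack(ea.DeliveryTag, multiple: false, requeue: false);
                }
            };

            _channel.BasicConsume(queue: "inventory.order-placed", autoAck: false, consumer: consumer);
        }

        private void Handle(string json) { /* deserialize and apply to inventory */ }
    }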

The feature freeze that wasn’t. The client agreed to a “reduced feature velocity” during migration. In practice, they kept requesting features at the same pace and we had to deliver them while migrating. This meant some features got built in the monolith and then immediately migrated. Not ideal, but that’s enterprise software.

What Worked

Event-driven architecture. RabbitMQ for async messaging between services was the right call. Services publish domain events and other services subscribe. When an order is placed, the Inventory service hears about it without the Orders service knowing or caring about inventory. This decoupling is the whole point.
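
The publishing side is deliberately small. A sketch of roughly what it looks like, against the RabbitMQ.Client v6-style API; the exchange name, routing key, and event shape are illustrative rather than our exact contracts:

    using System.Text.Json;
    using RabbitMQ.Client;

    public record OrderPlaced(Guid OrderId, DateTime PlacedAtUtc, IReadOnlyList<OrderLine> Lines);
    public record OrderLine(string Sku, int Quantity);

    public class DomainEventPublisher
    {
        private readonly IModel _channel;

        public DomainEventPublisher(IModel channel)
        {
            _channel = channel;
            _channel.ExchangeDeclare(exchange: "domain-events", type: ExchangeType.Topic, durable: true);
        }

        public void Publish(OrderPlaced evt)
        {
            var body = JsonSerializer.SerializeToUtf8Bytes(evt);
            var props = _channel.CreateBasicProperties();
            props.MessageId = Guid.NewGuid().ToString();
            props.Persistent = true;

            _channel.BasicPublish(
                exchange: "domain-events",
                routingKey: "order.placed",
                basicProperties: props,
                body: body);
        }
    }

The Inventory service binds its own queue to the order.placed routing key and reacts on its side; Orders never references Inventory at all.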

Kubernetes on AKS. Each service deploys independently. When the Catalog service needs more capacity during a sale, it scales independently. When someone pushes a bad deploy to Payments, it rolls back without affecting anything else.

Monitoring from day one. Prometheus + Grafana for metrics, Serilog + ELK for structured logging. In a monolith, debugging is “step through the code.” In microservices, debugging is “trace the request across 5 services and figure out where it went wrong.” Without good monitoring, you’re blind.
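
The logging half is worth showing because the structure is what makes tracing possible. A minimal sketch with Serilog in an ASP.NET Core service (the sink choice and correlation-ID header name are illustrative): every line is structured JSON shipped from stdout to ELK, and every request gets a correlation ID pushed onto the log context so you can follow it across services in Kibana.

    using Serilog;
    using Serilog.Formatting.Compact;

    var builder = WebApplication.CreateBuilder(args);

    builder.Host.UseSerilog((_, config) => config
        .Enrich.FromLogContext()
        .Enrich.WithProperty("Service", "catalog-service")    // which service wrote the line
        .WriteTo.Console(new CompactJsonFormatter()));        // structured JSON on stdout, shipped to ELK

    var app = builder.Build();

    // Push a correlation ID into the log context for every request so it shows
    // up on every log line this service writes while handling that request.
    app.Use(async (ctx, next) =>
    {
        var correlationId = ctx.Request.Headers["X-Correlation-Id"].FirstOrDefault()
                            ?? Guid.NewGuid().ToString();
        using (Serilog.Context.LogContext.PushProperty("CorrelationId", correlationId))
        {
            await next();
        }
    });

    app.MapGet("/health", () => Results.Ok());
    app.Run();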

The Numbers

After 18 months:

  • Quarterly deployments → daily deployments
  • 60% fewer production incidents
  • 45% faster API response times
  • New developer onboarding went from weeks to days

Was it worth it? Absolutely. Would I do some things differently? Also absolutely. Start with the database split earlier. Invest more in integration testing from the beginning. And don’t let anyone tell you a migration like this is “just a technical exercise” — it’s at least half organizational.