Koushik Kumar Ganeeb FBCS, Principal Software Engineer at Salesforce Inc and a data engineering expert, explores our options when it comes to modernising a large, live legacy system.

Moving away from a monolithic system is not just a technical challenge; it is a series of decisions that can make or break engineering velocity for years to come. We’ve seen teams spend years identifying the right approach, only to realise halfway through that they've chosen a strategy that doesn't align with their constraints. But the reality is, there's no standard guide. What works depends on the system's complexity, the team's capabilities, and the organisation's willingness to take on risk.

The three paths forward

When an organisation is exploring what to do with a legacy monolith, there are three main options: refactor it, replatform it, or replace it entirely. Each involves different complexities and advantages, depending on how well it fits the organisation’s technical architecture.

  • Refactoring: keeping the core system architecture but restructuring it internally; breaking apart tightly coupled modules, introducing better boundaries, and maybe extracting some services. The database mostly stays the same, at least initially.
  • Replatforming: lifting the application and moving it to a new infrastructure, usually the cloud, without fundamentally changing how it functions. We may containerise everything, shift from on-site hardware to AWS, or migrate from VMs to Kubernetes. The code stays essentially the same, but where and how it runs changes dramatically.
  • Replacing: building new systems to take over the monolith's functionality, usually module by module. It's the most expensive path upfront, but sometimes it's the only way forward when the technical debt is suffocating.

When refactoring makes sense

Refactoring works best when the monolith isn't fundamentally broken; it's just become complex over the years. If the codebase is written in a well-known language and is well tested, and the architectural violations are more about blurred boundaries than deep structural rot, refactoring can give us 80% of the benefits at 30% of the cost.

This is where the incremental migration pattern comes in. We start by identifying bounded contexts, areas of the code that could logically stand alone, and create new service interfaces around them while they're still part of the monolith. Over time, we extract them, strangling the monolith one piece at a time. We can iterate, learn and bail out if something isn't working. The risk profile is manageable.
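To make the extraction step concrete, here is a minimal Python sketch of drawing a service interface around a bounded context while it still lives inside the monolith. The notification example, class names and endpoint are illustrative, not taken from any particular codebase.

```python
from abc import ABC, abstractmethod

import requests  # assumed HTTP client for talking to the extracted service


def _legacy_send_email(user_id: str, message: str) -> None:
    """Stand-in for the existing notification code buried in the monolith."""
    print(f"[monolith] emailing {user_id}: {message}")


class NotificationService(ABC):
    """Service interface drawn around a bounded context (notifications)."""

    @abstractmethod
    def send(self, user_id: str, message: str) -> None: ...


class InProcessNotificationService(NotificationService):
    """Step 1: callers use the interface, but the work still happens inside the monolith."""

    def send(self, user_id: str, message: str) -> None:
        _legacy_send_email(user_id, message)


class RemoteNotificationService(NotificationService):
    """Step 2: the same interface, now backed by the extracted service."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url

    def send(self, user_id: str, message: str) -> None:
        requests.post(f"{self.base_url}/notifications",
                      json={"user_id": user_id, "message": message},
                      timeout=5)


# Callers never know which implementation they have been given.
notifier: NotificationService = InProcessNotificationService()
notifier.send("user-42", "Your report is ready")
```

Because callers only ever see the interface, swapping the in-process implementation for the remote one is the extraction; the rest of the monolith doesn't need to change.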

But refactoring has real limits. If the monolith is written in a legacy language, lacks tests, or the entire team that built it has left, then it’s probably not possible to refactor our way to success. And if infrastructure is the roadblock, refactoring the code won't solve that problem.

Replatforming: infrastructure as the key

Replatforming is appropriate when the code is fine but the runtime environment is the bottleneck. The lift-and-shift approach is sometimes dismissed as a half-measure, but it's often the pragmatic choice: containerising the monolith and moving it to Kubernetes doesn't give us microservices, but it does give us better resource utilisation, easier scaling and a foundation for future extraction.

The catch is that replatforming doesn't fix architectural problems. If the monolith is a tangled mess of circular dependencies, moving it to the cloud just gives us a complex setup with better uptime. We still can't deploy components independently. We still have a shared database acting as a coupling point. But we can deploy faster, scale more easily, and start introducing new services alongside the monolith without waiting for data centre provisioning.

Replacing: when clean slates are worth it

Sometimes the business needs us to cut our losses. If, for example, the monolith is written in a legacy language that makes it hard to maintain, or the data model is essentially broken, replacement could be the only real option. The key is doing it incrementally, in an agile environment.

The translation layer pattern works here. We build new services with clean boundaries and clear contracts, but accept that, for a while, they'll need to interact with the old system. The translation layer sits between them, converting the messy reality of the monolith into clean interfaces that our new services can work with. Event-driven architectures work well for this.
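As an illustration, here is a minimal sketch of such a translation layer in Python; the legacy row shape, field names and status codes are invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class OrderPlaced:
    """Clean domain event that the new services consume."""
    order_id: str
    customer_id: str
    status: str
    amount_gbp: float
    occurred_at: datetime


_STATUS_CODES = {"1": "created", "2": "paid", "3": "shipped"}


def translate(row: dict) -> OrderPlaced:
    """Translation layer: maps the monolith's messy row shape onto the new contract."""
    return OrderPlaced(
        order_id=row["ORD_NO"],
        customer_id=row["CUST"],
        status=_STATUS_CODES.get(row["STS"], "unknown"),
        amount_gbp=row["AMT_PENCE"] / 100,
        occurred_at=datetime.now(timezone.utc),
    )


# Hypothetical row as it might come out of the monolith's orders table.
legacy_row = {"ORD_NO": "A-1042", "CUST": "9913", "STS": "3", "AMT_PENCE": 1999}
print(translate(legacy_row))
```

In an event-driven setup, the translated event would be published to a message broker, so the new services never see the legacy shape at all.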

The hardest part of replacement isn't the code, it's the data. We'll almost certainly need to run dual writes to both old and new systems for a period, plus reconciliation processes to catch inconsistencies, careful migration plans and rollback strategies. AI-driven tooling can help here, flagging drift between the two systems before it affects users.
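A sketch of what dual writes and reconciliation can look like, with `old_db` and `new_db` standing in for whatever data access layers the two systems actually expose:

```python
import logging

log = logging.getLogger("migration")


def save_order(order: dict, old_db, new_db) -> None:
    """Dual write: the old system stays the source of truth while the new one catches up."""
    old_db.insert(order)          # must succeed, or the whole operation fails
    try:
        new_db.insert(order)      # best effort during the migration window
    except Exception:
        log.exception("dual write to new system failed for order %s", order.get("id"))


def reconcile(old_db, new_db) -> list:
    """Periodic reconciliation: report order ids that exist in one system but not the other."""
    old_ids = {o["id"] for o in old_db.all_orders()}
    new_ids = {o["id"] for o in new_db.all_orders()}
    return sorted(old_ids ^ new_ids)   # symmetric difference = drift to investigate
```

The important property is that the old system remains the source of truth until reconciliation has been quiet for long enough to trust the new one.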

Architectural trade-offs that actually matter

Every approach involves trade-offs that sound obvious in theory but become painfully concrete in practice. 

Once we split a monolithic architecture into individual services, we lose the transactional guarantees that a single database provides. Yes, we can implement distributed transactions, but they're complex and fragile. Most teams end up embracing eventual consistency, which means designing for failure modes that never needed thinking about before.
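To make that concrete, here is a small sketch of the kind of compensating logic that replaces a database transaction once the steps span services; `inventory` and `payments` are hypothetical service clients.

```python
def place_order(order: dict, inventory, payments) -> bool:
    """Without a shared database transaction, each step needs an explicit undo."""
    inventory.reserve(order["sku"], order["qty"])
    try:
        payments.charge(order["customer_id"], order["amount"])
    except Exception:
        # The failure mode a single-database transaction used to hide:
        # the first step has already happened, so roll it back by hand.
        inventory.release(order["sku"], order["qty"])
        return False
    return True
```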

Microservices let teams deploy independently, which is excellent for velocity. But now dozens of services need running rather than a single application, and if the team isn't ready for that operational burden, moving to microservices will slow them down, not speed them up.

Microservices promise that teams can own their services end to end. But if those services all interact with the same database, nothing is really decoupled. 

Migration patterns that work 

A few patterns come up repeatedly in successful migrations, regardless of which path you're taking. 

Don't try to extract the core domain logic first. Start with peripheral capabilities that are easier to decouple. Authentication services, notification systems, and file processing are often lower-risk components.

Build new features as services. When we need new functionality, build it as a separate service from the start. It doesn't touch the monolith's codebase; it forces us to define clean interfaces, and it gives the team experience with the new architecture before we start carving up the old one.

Invest in observability first. We cannot safely decompose a system we don't understand. Before starting to extract services, capture everything. Add logging, tracing and metrics. Map out the dependencies. Profile the performance bottlenecks. This groundwork pays dividends when things inevitably go wrong during migration.
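Even something as small as this stdlib-only sketch starts to surface hot paths and call patterns; in practice you'd reach for a proper tracing library, but the principle is the same. The invoice function is a made-up stand-in.

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("monolith.profile")


def traced(func):
    """Log every call's duration so hot paths and call patterns become visible."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            log.info("%s took %.1f ms", func.__qualname__, elapsed_ms)
    return wrapper


@traced
def generate_invoice(order_id: str) -> str:
    time.sleep(0.05)               # stand-in for real work inside the monolith
    return f"invoice-for-{order_id}"


generate_invoice("A-1042")
```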

Feature flags everywhere. We need the ability to route traffic between old and new implementations, to roll back instantly when problems emerge, and to run shadow traffic for testing. Feature flags give us that control.
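A minimal sketch of that routing in Python, with a hand-rolled percentage flag; a real deployment would use a flag service and sticky, per-user bucketing rather than a random roll per request.

```python
import random

# Hypothetical flag: the fraction of requests routed to the new implementation.
FLAGS = {"orders.use_new_service": 0.10}


def flag_enabled(name: str) -> bool:
    """Percentage rollout; a real system would key on user or tenant for stickiness."""
    return random.random() < FLAGS.get(name, 0.0)


def monolith_get_order(order_id: str) -> dict:
    return {"id": order_id, "source": "monolith"}      # old path, instant rollback target


def new_order_service(order_id: str) -> dict:
    return {"id": order_id, "source": "new-service"}   # new path being proven out


def get_order(order_id: str) -> dict:
    if flag_enabled("orders.use_new_service"):
        return new_order_service(order_id)
    return monolith_get_order(order_id)


print(get_order("A-1042"))
```

Dropping the percentage back to zero is the instant rollback.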

Moving forward

If there's one lesson I've learned from watching migrations succeed and fail, it's that strategy matters more than tactics; if there’s no clarity on the reason for migrating and what success looks like, the result is a mess.

And the goal isn't to eliminate the monolith because monoliths are inherently bad. The goal is to build a system that lets our team move faster, with less risk and more flexibility to adapt. Sometimes that means microservices. Sometimes it means a well-structured modular monolith running on modern infrastructure. The architecture should serve our needs, not the other way around.