There's a reason legacy has a pejorative connotation in the industry. Legacy systems are hard to work with. They put us under constraints we didn't choose; they consume time and money on problems we never set out to solve; they complicate the path from problem to solution. So what value do they bring?
Old Code Works
In fact, they bring a great deal of value. A simplistic definition of "legacy code" could be:
Legacy code is code that delivers more in value than it costs to maintain.
This is almost so reductive as to be meaningless; it pushes a lot of the complexity into the definition. But I often find it provides a useful perspective on the legacy systems we engage with, because it motivates us to remain humble in approaching things that have been around longer than we have.
The value may not be immediately obvious, and not everyone may agree on what it is. We may even think that an organization is "wrong" in a choice to preserve a legacy system. But systems rarely become legacy systems without a legacy. If you're not confident you can describe that legacy, keep looking.
The Legacy of Working Effectively with Legacy Code
One of the books that shaped the 8th Light approach to handling legacy systems is Michael Feathers's Working Effectively with Legacy Code (2004). Feathers defends an opinionated definition of legacy code:
To me, legacy code is simply code without tests. I've gotten some grief for this definition. What do tests have to do with whether code is bad? [...] With tests, we can change the behavior of our code quickly and verifiably. Without them, we really don't know if our code is getting better or worse.
—Michael Feathers, Working Effectively with Legacy Code (emphasis added)
We strongly agree here at 8th Light; we've been proponents of a test-first approach from our founding. We've seen how, in addition to building confidence in the behavior of code, a well-structured test suite can help drive a system's implementation toward more maintainable and understandable code. When we launch new work, we use test-driven design to kickstart projects into that virtuous cycle.
The power of the Feathers book is its pragmatism. It spends no time debating the premise that the lack of tests is a problem (the quote above is from the inside flap!). The entirety of the book is about what to do when you're not in the virtuous cycle.
The majority of the book is a playbook of common problems and solutions (e.g., "I'm Changing the Same Code All Over the Place," "This Class Is Too Big and I Don't Want It to Get Any Bigger"). It concludes with a catalog of safe, mostly mechanistic changes that can be made in the absence of tests to gain a foothold, which helps work around the chicken-and-egg problem where "untested code tends to be untestable."
Nearly 20 years after publication, Working Effectively with Legacy Code is still relevant and is still the book we use to teach this topic in our apprenticeship. This book was responsible for encouraging developers to take testing more seriously. It argued that the thing to do with code you're scared of is to find ways to become less scared of it.
The book removed the excuse that a system was "untestable" by demonstrating a bunch of ways to add tests. But it also provided a safety net and somewhat defanged the idea of working on legacy systems. And so, throughout our history, we've been able to explore the positive connotations of the word "legacy" without as much fear. Our consultants even talk about their legacy as a point of pride.
The Strength of Legacy Code and the Folly of the Easy Rewrite
There's a survivorship bias in the selection of legacy code we find ourselves talking about, which can cause problems. If legacy code is definitionally code that carries its own weight, we could talk ourselves into a sunk-cost fallacy: this code must be worth it, it's been here so long!
On the other end of the scale, when we all agree that the cost of maintaining a particular system clearly exceeds the value it delivers, nobody talks about it as "legacy code" — it gets quietly forgotten.
There is actually some theoretical foundation for this survivorship-bias approach in technology. The Lindy effect theorizes that for some artifacts (technological ideas in particular) future longevity is proportional to past longevity. Ideas that have been around longer are expected to stay around longer. This is equivalent to lifetimes being Pareto-distributed, which feels like a plausible and parsimonious model for the lifetimes of software systems.
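To make the Pareto connection concrete, here is a short derivation. The symbols are my own assumptions for illustration: T is a system's total lifetime, t_m is a minimum lifetime, and alpha > 1 is the shape parameter. It reads "Lindy effect" as "expected remaining lifetime is proportional to current age," which is one common way to state it.

```latex
% Survival function of a Pareto-distributed lifetime T (assumed model)
P(T > t) = \left(\frac{t_m}{t}\right)^{\alpha}, \qquad t \ge t_m,\ \alpha > 1

% Conditioning on survival to age t gives another Pareto, now with scale t
P(T > s \mid T > t) = \frac{P(T > s)}{P(T > t)} = \left(\frac{t}{s}\right)^{\alpha}, \qquad s \ge t

% The mean of a Pareto with scale t and shape \alpha is \alpha t / (\alpha - 1)
E[T \mid T > t] = \frac{\alpha\, t}{\alpha - 1}

% So the expected remaining lifetime grows linearly with age
E[T - t \mid T > t] = \frac{t}{\alpha - 1}
```

Under these assumptions, a system that has already run for twice as long is expected to keep running for twice as long again.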
If the Lindy effect indeed applies to systems in production, we should treat the fact that a system has existed for a long time as evidence of its fitness for purpose. This feels not just realistic, but somewhat obvious. Despite the lack of tests, "that system" that's been serving customers for 15 years has a much stronger claim to robustness and fitness for purpose than anything we would put in place today, entirely by virtue of its observed history of success.
So why are rewrites so tempting? Why are teams so drawn to finishing their analysis of a working system, dismissing it as irredeemable, and replacing things from scratch?
Part of the temptation of the rewrite is that it's much easier to engineer things from a blank page than to understand and safely change an existing system. Wouldn't it be nice to focus on a well-understood problem without being held back by unrelated problems and constraints? Empirically, we know that's not what happens with rewrites, but if it were possible, it would avoid seemingly unnecessary work.
We may fail to fully account for the cost of a rewrite because we have some unknown unknowns about the existing system. This underlines the importance of the Feathers approach to untested code: the right thing to do with valuable untested code is to find the fastest way to add tests.
Rewrites also tempt legacy code maintainers because of the planning fallacy. Humans display a consistent optimism bias when planning their own work. Because rewrites are usually much larger and less well understood than maintenance of suboptimal but robust software, this bias tends to systematically affect rewrites more than maintenance.
A Plea for Epistemic Humility
It's very hard to enumerate all the value that a complex organization gets out of a complex system, even when the functionality is well-characterized. And most of the time, the functionality is not well-characterized. This presents numerous opportunities to make mistakes if we aren't working methodically.
I often use the parable of Chesterton's Fence to illustrate the importance of humility in new situations. Taking some creative liberties with the original 1929 quote, I usually explain it this way:
If you come across a fence that's getting in your way, you should figure out who put it there, and why, before advocating for its removal.
This also reminds me of the "Prime Directive" we use to remind participants in an agile software retrospective to assume good faith when talking about past events:
Regardless of what we discover, we understand and truly believe that everyone did the best job they could, given what they knew at the time, their skills and abilities, the resources available, and the situation at hand.
—Norman L. Kerth, Project Retrospectives: A Handbook for Team Reviews
This is a methodological commitment, not a philosophical one. You don't have to believe everyone always did their best, but it helps to try to always presume the best. It focuses our analysis on systems instead of people. Rather than criticizing the behavior of individuals, we're humbly looking for the systemic factors that weighed on the design of the system.
And that can take some ethnography.
Finding the Value in Legacy Systems
Start by asking people. Ask any and all with knowledge of the system — be expansive in your conception of who a "stakeholder" is. Think of all the people the system touches. If original developers are available, get all the information you can out of them.
If the stakeholders aren't directly accessible, you'll have to get more creative. Perhaps the product is an ecommerce site or consumer-facing SaaS product. In those cases, there are more passive behavioral measurements, such as analytics and logs, as well as more active methods such as A/B testing. Product owners are great subject-matter experts to query about the value the system was designed to deliver.
If you are able to take input from people who were involved in the design or development of the legacy system: fantastic! Treat it as a valuable opportunity to learn about the system. Be curious about the decisions, constraints, and pressure that produced the thing you're working with. But also be mindful that if there are ways in which the existing system doesn't serve users' needs, the input you receive may carry the same biases that produced the original.
In legacy systems that are still undergoing active development, it may be useful to look at rates of change. Which parts of the system change most frequently and least frequently? But beware that there are a number of factors that drive rapidity of change:
- Rapidly changing code could be code that's delivering a lot of value, performs a function in high demand, or gets enough attention that it is frequently tweaked. But it could just as well be code that's difficult to get right, depends on a frequently changing collaborator, or is written by developers who lean on QA to find their bugs.
- Slow-changing code could be code nobody cares about, or it could be "bedrock" that does a simple function well enough that it doesn't need to change that often.
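As a starting point for looking at rates of change, here is a minimal sketch that counts how many commits touched each file. It assumes the system lives in a Git repository; the function names are mine, chosen for illustration. It relies on `git log --name-only --pretty=format:`, which prints only the paths changed by each commit, separated by blank lines.

```python
import subprocess
from collections import Counter


def count_changes(log_text: str) -> Counter:
    """Count how many commits touched each path.

    Expects the output of `git log --name-only --pretty=format:`,
    which is one path per line with blank lines between commits.
    """
    counts = Counter()
    for line in log_text.splitlines():
        path = line.strip()
        if path:  # skip the blank separator lines
            counts[path] += 1
    return counts


def git_change_counts(repo_dir: str = ".") -> Counter:
    """Run git in repo_dir and count commits per file (assumes git is installed)."""
    result = subprocess.run(
        ["git", "log", "--name-only", "--pretty=format:"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return count_changes(result.stdout)
```

Calling `git_change_counts(".").most_common(10)` lists the ten most frequently changed paths. The counts only tell you where to look; as the list above warns, they don't say whether a hot spot is valuable or merely troublesome.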
Source-control history can be a valuable source of collateral information. Often a "why did they do it this way" question about a localized spot in the code is easily resolved by seeing the change in the context of everything else that changed with it. Be curious about the code and its authors: "why did they do it this way" should always be asked with complete sincerity, because it always has a sincere answer.
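One way to see a change in the context of everything that changed with it is to group paths by commit. The sketch below assumes a Git repository and log text produced by `git log --name-only --pretty=format:@@%s`. The `@@` prefix is a marker I picked for this example so that commit subjects can be told apart from paths; it is not a Git convention, and it would misfire on a path that starts with `@@`.

```python
from collections import Counter

COMMIT_MARK = "@@"  # hypothetical marker chosen for this sketch


def parse_commits(log_text):
    """Split log text into (subject, [paths]) pairs, one per commit."""
    commits = []
    for line in log_text.splitlines():
        if line.startswith(COMMIT_MARK):
            # A marked line starts a new commit and carries its subject.
            commits.append((line[len(COMMIT_MARK):], []))
        elif line.strip() and commits:
            # Any other non-blank line is a path changed by the current commit.
            commits[-1][1].append(line.strip())
    return commits


def changed_with(target, commits):
    """Count how often each other path changed in the same commit as target."""
    together = Counter()
    for _subject, paths in commits:
        if target in paths:
            together.update(p for p in paths if p != target)
    return together
```

Paths that repeatedly change alongside the file you're puzzling over are good candidates for where the "why" lives, and the commit subjects returned by `parse_commits` often state it outright.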
Make sure to also look upstream of the code to go beyond the substance of what changed and start identifying motivation. Issue trackers, helpdesk systems, and development roadmaps all help illuminate the ways in which parts of an organization see and use a system.
Be very hesitant to conclude "human error." At a large scale, the function of systems is a product of the organizations and processes that produce those systems. (It's not that it isn't human error. It's quite often human error! But a diagnosis of human error does nothing to change a system meant to be staffed by fallible humans.)
It's usually easier and safer to iterate on a working system than to replace it. If making the easy change doesn't feel safe, building that confidence is the first problem to solve.
It's possible to incrementally modernize legacy systems. It's possible to make code systematically better and improve its design while incrementally building an understanding of what it does. Doing so safely almost always requires comprehensive automated tests.
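A common first foothold, and one Feathers describes, is the characterization test: a test whose expected values are recorded from what the code does today rather than derived from a specification. Here is a minimal sketch. `legacy_shipping_fee` is a hypothetical stand-in for real untested code, and its rules are invented for the example.

```python
import unittest


def legacy_shipping_fee(weight_kg, express):
    """Hypothetical stand-in for untested legacy code with undocumented rules."""
    fee = 5.0 + 1.5 * weight_kg
    if express:
        fee *= 2
    if weight_kg > 20:
        fee += 10  # surcharge of unknown origin; the test pins it rather than judging it
    return round(fee, 2)


class ShippingFeeCharacterization(unittest.TestCase):
    """Expected values were recorded by running the code, not taken from a spec."""

    def test_pins_current_behavior(self):
        observed = {
            (1, False): 6.5,
            (1, True): 13.0,
            (25, False): 52.5,
            (25, True): 95.0,
        }
        for (weight, express), expected in observed.items():
            with self.subTest(weight=weight, express=express):
                self.assertEqual(legacy_shipping_fee(weight, express), expected)
```

A test like this makes no claim that the surcharge is correct. It only guarantees that a refactoring which changes the surcharge gets noticed, which is the confidence needed to start iterating.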
Legacy code is very often bad code — it is hard to understand, hard to modify, or not up to modern standards. But where it has evolved to serve a business purpose and has spent years delivering value: no matter how you may criticize it, there is always at least one sense in which it is good code.