Refactoring Legacy Applications Part 1: Legacy Code

In this 3-part series of articles we’ll look at legacy code, different refactoring approaches, and finally the Strangler Pattern. In this first part, we’re focussing on legacy code. What is it exactly?

What is legacy code

One of the hardest problems I’ve faced as a software developer is working in legacy applications. There’s a bunch of rules you don’t know, decisions you don’t agree with, missing tests, etc… Everything is working against you to slow you down and it is frustrating.

A great illustration that basically says it all is this one. I first saw it in the book Clean Code:

Cartoon showing number of wtfs per minute concept

The more WTFs per minute you have, the worse the codebase you are working in. It’s pretty much self-explanatory. But what exactly is legacy code? Is it just bad code?

It’s not something that is very strictly definable. I asked this during a tech talk, and here’s a list of some of the answers:

Wordcloud of what peoiple associate with legacy code

Some of these are obviously meant to be more humorous than others, like “Code someone else wrote“ or “Downwards spiral career“, However, let’s look at the four most common definitions:

Old code: lots of old code does not have automated testing yes, but there are exceptions out there. Not all old code is automatically legacy. And also, what is old exactly? 1 year, 2 years? A month?
Old tech: it’s not because something is using an old tech stack that it’s automatically bad. Same as above: a good indicator, but not guaranteed to be legacy.
Untested code: I’ve seen codebases with badly written tests that just created more issues instead of fixing them. Don’t recommend working in those.
Undocumented code: documentation can be terrible as well. Again, not a guarantee. Documentation can even work against you if it’s not up-to-date with your latest changes.

Definition of legacy code

So if all the above is no guarantee to classify something as legacy code, what exactly is? Well honestly it’s about gutfeeling more than any objective truths:

Legacy code is code that you’re afraid to change

Sounds simple right? Legacy code is something that makes you afraid of change because:

It breaks easily.
You have no idea what it does exactly.
There are no tests to tell you if something broke.

The actual code-related problems (no automated testing, documentation, etc…) are just the puzzle pieces.

If a feeling of dread or fear washes over you every time you need to make a change or develop a new feature, then you’re working in a legacy codebase. Congratulations?

Not everything is legacy

With the above said, not everything is legacy just because you’re feeling frustrated when working with it. This is especially true for code you’re not familiar with. Here’s two things I learned:

We overestimate the complexity of unfamiliar code.
It gets better after a few months.

When you first start working on a new project, remember that you have a lot to learn. The conventions may be different on this new project than what you’re used to, but that doesn’t automatically mean they are terrible.

Also, what seems overly complex can start to seem pretty obvious after a few months because you were forced to actually do some work in that piece of code.

So before you go around screaming about legacy code at everyone, take your time and get to know the codebase.

What can we do to fix legacy code?

In part 2 of this series we’ll go over refactoring code, and how to avoid technical debt. That and more is coming up soon. Be sure to subscribe to automatically get notified of new articles!