MATTHEW SETTER DEVELOPER AND TECHNICAL WRITER
How to Refactor a Monolithic Codebase Over Time
codeship.com - blog.codeship.com - resources.codeship.com
About the Author
Matthew Setter is a developer and technical writer. He creates web-based applications and technical content that engage developers with platforms, technologies, applications, and tools.
Codeship is a fully customizable hosted Continuous Integration and Delivery platform that helps you build, test, and deploy web applications fast and with confidence. Learn more about Codeship here.
How to Refactor a Monolithic Codebase Over Time
While so many software projects start off with the best of intentions, such as a clean architecture, clear goals, and stated objectives, not all of them do. Moreover, of the ones that do, not all of them stay that way forever. With time, feature requests, financial pressures, competing priorities, and changing developers, it is highly likely that what began as a shining example of code quality eventually becomes a monolith.
ABOUT THIS EBOOK
Monolithic codebases are not easy to maintain. In this eBook you will learn the essentials of how to refactor a monolithic codebase.
Monolithic codebases are, by their very nature, hard to maintain. This can be for any number of reasons, including functions that:
Do too much.
Know too much.
Have too many responsibilities.
Rely on global state (or rely on it too much); and
Aren't testable.
In these situations, if you change something, very often something else breaks, in a section of the application that has no apparent connection to the code that was changed. For these reasons, and a host of others, the code becomes fragile and people commonly believe that the best course of action is to rewrite the application from scratch. However, rewrites often end up as expensive failures. Not sure why? Then ask yourself the following questions:
How long would it take to recreate the existing level of functionality?
Is this less or more time than removing the technical debt in the existing application?
Can you cope with a lack of visible forward progress for that amount of time?
Do you have the resources and time to fix critical and security bugs on the old system, while rewriting it?
Will a new system be better than the existing one, or will it re-implement the same mistakes?
Will the new system be feature complete with the existing one?
Can you maintain two codebases and two development teams?
Will you repeat the lessons learned from the old system?
I'm not saying that refactoring an existing system is always the best choice. However, while not as flashy or attention-grabbing, it can often be the less costly choice and the more sane approach. With that said, I'm now going to step you through the essentials of how to refactor a monolithic codebase.
One: Do You Understand the Application?
Before you can do anything, what do you know about the application? While it's often tempting to dive in and "just get coding," that's the worst thing to do. The best thing you can do is to learn as much as you can about it instead. If your monolith is like so many others, there's likely no single, centrally organized knowledge store. Instead, information will be stored in a wide variety of disparate locations. These will likely include the following:
In the minds of previous developers, business owners, managers, project managers, and other stakeholders.
Code comments.
Code commit messages.
TODOs.
One or more README files.
Code documentation.
One or more Wikis.
A bug reporting tool.
Find as much of this information as you can and bring it together in one central location. As you're doing this, here is a series of questions to ask to help uncover as much as you can:
Why was the application created?
Who wanted it built?
Who worked on it?
What is it meant to do?
What are its key features?
What are its additional features?
What are its top bugs?
What are its supplementary bugs?
Hopefully, this list will inspire you to ask a host of follow-up questions, which will let you find out all that there is to know.
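Some of the sources listed above, such as TODOs and code comments, can be collected mechanically. The following is a minimal sketch in Python, assuming the codebase marks such notes with TODO, FIXME, HACK, or XXX. The function name find_markers, the marker list, and the file suffixes are illustrative assumptions, not part of any particular tool.

```python
import re
from pathlib import Path

# Hypothetical marker list; adjust it to the conventions your codebase uses.
MARKER_PATTERN = re.compile(r"\b(TODO|FIXME|HACK|XXX)\b[:\s]*(.*)")


def find_markers(root, suffixes=(".py", ".php", ".js")):
    """Return (path, line number, marker, note) for each marker found under root."""
    results = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and files with suffixes we are not interested in.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            match = MARKER_PATTERN.search(line)
            if match:
                note = match.group(2).strip()
                results.append((str(path), number, match.group(1), note))
    return results
```

Calling find_markers(".") from the project root returns a list that can be pasted into whatever central location you chose for the collected knowledge.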
Two: Is It Under Version Control?
With knowledge of the application acquired, is its source code stored under version control? If not, then get it under version control straight away! The last thing that you want to do is to make any changes and not be able to revert them. I strongly encourage you to use Git, but Mercurial is another excellent choice if you have an aversion to Git. I'd also encourage you to store it in a remote repository as well, whether that's GitHub, Bitbucket, GitLab, or one of the myriad other code hosting services.
Three: What Is The State Of The Test Suite?
Next, what level of code coverage is in place? Depending on the age of the application, the number of developers who've worked on it (and their skill levels), the way that those developers were employed, etc., there may be no test suite in place. If this is the case, you're going to have to put a basic test suite in place before you can begin. If you don't, you'll never be sure about the impact of the changes that you make. If code coverage is already in place, ask yourself the following questions:
What level of coverage is available?
How long does the test suite take to complete?
Does it complete, or does it exhaust available memory?
How many tests fail?
How many tests are skipped?
How many tests are out of date?
Is there a mix of unit, integration, and functional tests?
Are there sections of the codebase that have no tests?
Are there comments such as "Quick Hack to be fixed later"?
Are there comments such as "Whatever You Do DO NOT TOUCH!"?
What comments exist in the tests?
Is the test suite run with full error reporting enabled?
If done well, your test suite should help you get an understanding of how the code works, far quicker than diving into every class file. Take the time to read through and thoroughly understand the tests.
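Where no test suite exists, one common way to put a basic one in place is to write characterization tests, which record what the code does today before anything is changed. Below is a minimal sketch using Python's unittest module. The function legacy_price_cents is a hypothetical stand-in for an untested function in your monolith, and the expected values are simply what the current code returns, not necessarily what it should return.

```python
import unittest


def legacy_price_cents(quantity, unit_price_cents, customer_type):
    """Hypothetical stand-in for an untested function in the monolith."""
    total = quantity * unit_price_cents
    if customer_type == "wholesale" and quantity >= 10:
        total = total * 90 // 100  # 10% wholesale discount
    if total > 100000:
        total = total - 5000  # flat rebate on large orders
    return total


class LegacyPriceCharacterization(unittest.TestCase):
    """Pins down current behavior so later refactoring can be checked against it."""

    def test_retail_order(self):
        self.assertEqual(legacy_price_cents(2, 1999, "retail"), 3998)

    def test_wholesale_discount(self):
        self.assertEqual(legacy_price_cents(10, 2000, "wholesale"), 18000)

    def test_large_order_rebate(self):
        self.assertEqual(legacy_price_cents(100, 2000, "retail"), 195000)
```

Saved in a test module, these can be run with python -m unittest. If a later change makes one of them fail, you have either introduced a regression or discovered behavior that needs a deliberate decision.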
Four: What Static Analysis Is In Place?
Now that you've learned more about the application and have a handle on the test coverage, is static code analysis used? If you're not familiar with it, static code analysis is:
The analysis of computer software that is performed without actually executing programs. In most cases, the analysis is performed on some version of the source code, and in the other cases, some form of the object code.
By regularly running a static code analyzer over your code, such as Phan, you can help ensure that code quality is improving, not declining, and you can also trace the source of bugs back to the specific commits which introduced them. If your code isn't already using one, there are a host of third-party packages and online services which you can use, regardless of your software language(s). These include:
Pylint
Phan
RIPS
Splint
SonarQube
FlawFinder; and
Codacy
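To make the definition above concrete, here is a minimal sketch of static analysis in Python. It parses source text with the standard ast module and reports functions that declare global variables or exceed a length limit, without ever running the code being inspected. The function name audit_functions and the default limit of 50 lines are assumptions for illustration; the tools listed above perform far more thorough checks.

```python
import ast


def audit_functions(source, max_lines=50):
    """Return warnings for functions that use `global` or exceed max_lines."""
    warnings = []
    tree = ast.parse(source)  # parses the code; nothing is executed
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        length = node.end_lineno - node.lineno + 1  # requires Python 3.8+
        if length > max_lines:
            warnings.append(f"{node.name}: {length} lines (limit {max_lines})")
        # Note: this also visits nested functions, so a `global` inside a
        # nested function is reported for the outer function as well.
        for child in ast.walk(node):
            if isinstance(child, ast.Global):
                names = ", ".join(child.names)
                warnings.append(f"{node.name}: uses global state ({names})")
    return warnings


SAMPLE = '''
counter = 0

def increment():
    global counter
    counter += 1

def add(a, b):
    return a + b
'''

for warning in audit_functions(SAMPLE):
    print(warning)
```

Run against SAMPLE, the sketch flags increment for relying on global state and leaves add alone. Running a check like this on every commit is one way to see whether quality is improving or declining.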
Five: Start Refactoring
Now that your team has as much information to hand as can be expected, it is time to get started refactoring the application. So that you do it correctly, let's discuss some sage advice that I came across:
There is no perfect design, there is only a better design
Know that your code will never be "perfect," even if that were at all possible. While refactoring will help you to improve it continuously, so that it is simpler, more readable, more maintainable, and more testable than it was before, the task will never end. You may always feel that you can do better, but there comes a time when you have to accept that, at least for the time being, it is as good as it can be. At this point, you have to discipline yourself to leave it and move on to something else. Don't fall into the trap of "just making it a bit better." You've improved it. It is better than it was. Let it go and move on.
The key principle to cleaning up a complex codebase is to always refactor in the service of a feature
Refactoring can get a negative perception if the changes being made are either trivial or just for the sake of aesthetics. Sometimes this kind of refactoring is necessary and, over time, helps ensure that the quality of the codebase improves. However, if that's all that is being done, then the value is questionable. Instead, ensure that the changes being made serve a clear and effective purpose, such as creating a new feature or fixing an outstanding bug or defect.
Six: Have a Refactoring Project Plan
Now that you're fully aware of how the application works, it's time to start refactoring it. However, you have to have a plan! What goes into such a plan? This talk from Mozilla recommends five key considerations:
Break into a series of achievable tasks
Come up with a realistic timeline and resource requirements
Work on the pieces in isolation or in parallel with other projects
Staff it seriously
If working in parallel, account for dependencies
Seven: Implement The Habit Of Opportunistic Refactoring
Next, encourage your team to build a habit of doing opportunistic refactoring. If you've not heard of the term, it was coined by Martin Fowler. Here's how he describes it:
Any time someone sees some code that isn't as clear as it should be, they should take the opportunity to fix it right there and then - or at least within a few minutes. This opportunistic refactoring is referred to by Uncle Bob as following the boy-scout rule - always leave the code behind in a better state than you found it.
While not a silver bullet, by regularly doing little cleanups, the code's quality should always improve, and there should be no need to dedicate sprints to code cleanup, as it will be in a continuous state of improvement.
Eight: Use Dedicated Refactoring Tools
One of the beauties of refactoring is that you don't need specific tools to do it. This is because, as Martin Fowler says, if you "take small steps and test frequently," then you should be fine. That said, refactoring manually will be slower and take more diligence, yet it's still achievable.
However, if you're already experienced with refactoring, why not save yourself time and effort and make use of the tools that are built into, or available for, the major IDEs and text editors? Regardless of the approach you take, however, remember to take it slow, and test, test, test. Also, as you make each change, review your test suite. Does it need new tests? Have you uncovered a bug in another part of the system that is related to what you were working on? Then add one or more tests for it. Are some of your existing tests no longer relevant? Then remove them. Always ensure that your test suite stays up to date.
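As an illustration of taking small steps and testing after each one, the sketch below shows a single "extract function" step in Python. The names report_before, report_after, and format_line are hypothetical. The point is only that the old and new versions are compared on the same input before the old one is removed.

```python
def report_before(orders):
    """Original version: totals and formatting are tangled together."""
    lines = []
    total = 0
    for name, quantity, price in orders:
        subtotal = quantity * price
        total += subtotal
        lines.append(
            name + ": " + str(quantity) + " x " + str(price) + " = " + str(subtotal)
        )
    lines.append("Total: " + str(total))
    return "\n".join(lines)


def format_line(name, quantity, price):
    """Extracted step: format one order line and return it with its subtotal."""
    subtotal = quantity * price
    return f"{name}: {quantity} x {price} = {subtotal}", subtotal


def report_after(orders):
    """Refactored version: same output, with line formatting extracted."""
    lines = []
    total = 0
    for name, quantity, price in orders:
        line, subtotal = format_line(name, quantity, price)
        lines.append(line)
        total += subtotal
    lines.append(f"Total: {total}")
    return "\n".join(lines)


ORDERS = [("widget", 2, 5), ("gadget", 1, 12)]
# The refactoring step is only accepted if the behavior is unchanged.
assert report_after(ORDERS) == report_before(ORDERS)
```

Once the comparison passes, report_before can be deleted and the next small step can begin. The IDE refactoring tools mentioned above automate this kind of extraction, but the check against existing behavior is still yours to run.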
Where To From Here?
While this article hasn't delved too deeply into the specifics of refactoring, it has shown a series of eight principles that you can follow to approach the task correctly. It may be very tempting to rewrite your application from scratch, but I'd caution you to hasten slowly to that conclusion. Yes, it's the trendy solution, but it's not always the correct one. Take your time, learn about your application, ensure that it's worth doing first, and if you believe that's correct, then do so. Otherwise, step through the advice that I've presented, and make the most of that treasure trove of an application, maintain continuity of service, and remove the technical debt, one line at a time.
More Codeship Resources.
EBOOKS
Breaking up your Monolith into Microservices. In this eBook you will learn about the basics of "decomposing" a monolith into microservices.
Dockerizing Ruby Apps and Effectively Testing them. In this eBook you will learn how to dockerize Ruby applications and how to test them.
Continuous Integration and Continuous Delivery with Docker. In this eBook we take a look at how to set up a CD Pipeline with Docker and containers.
About Codeship
Codeship is a hosted Continuous Integration service that fits all your needs. Codeship Basic provides pre-installed dependencies and a simple setup UI that let you incorporate CI and CD in only minutes. Codeship Pro has native Docker support and gives you full control of your CI and CD setup while providing the convenience of a hosted solution.
Codeship Basic
A simple out-of-the-box Continuous Integration service that just works.
Starting at $0/month.
Works out of the box
Preinstalled CI dependencies
Optimized hosted infrastructure
Quick & simple setup

Codeship Pro
A fully customizable hosted Continuous Integration service.
Starting at $0/month.
Customizability & Full Autonomy
Local CLI tool
Dedicated single-tenant instances
Deploy anywhere