Episode #6: Paying Back Technical Debt With Help From Metrics
Previously on Tales From a Journey Around Metrics
We briefly interrupted our tale by taking a look at the 5 categories of engineering metrics. Prior to that, in Episode #5, Mojtaba discovered how even bad metrics, paired with a decent process, could help engineering teams ask good questions.
As software engineers, we are either paying back technical debt or creating it! Setting aside the discussion of where technical debt is advantageous, many teams struggle with paying back technical debt accrued over time.
At my last company, we had a development team that was particularly struggling with a large monolithic application that had been in development for two decades. The engineers and engineering management struggled to allot time to tackle its technical debt, given the need to continue shipping new features. “Technical Debt” was a recurring topic at many retrospective meetings.
Why Should We Pay Back Technical Debt?
In presenting the argument that it was time to pay down technical debt, the engineering team focused on how difficult the codebase was to extend with new features, how painful it was to set up and debug, and how fixing one thing caused bugs in unanticipated areas of the code. We said that if engineers were given enough time, we would be able to clean it up, modularize it, break it up, etc.
We found that stating the problem in this manner was not effective. It focused on activity (cleaning up, modularizing, breaking up the codebase, etc.) instead of on outcome. As such, it was hard for the business to see the value of paying back technical debt, while it could more easily draw a line from new features to revenue, or at least to customer impact.
Metrics as a Tool To Show Impact of Technical Debt
Since we had started to measure some metrics, we found that we were now able to frame the technical debt problem in terms of its impact on customers and developers. For example:
- Feature Lead Time: technical debt created manual steps, extra work, and slowdowns in getting features into production, which resulted in long lead times.
- Onboarding Lead Time: it was taking new engineers on the team a long time to get set up and be able to commit their first simple fix to production.
- Mean Time To Recover: technical debt made it difficult to debug certain incidents, causing our time to recover to balloon.
- Velocity: technical debt sometimes impacted our team’s velocity.
- Service performance: some of our services had high resource usage, high latency, and low reliability/uptime - and we could trace this back to technical debt.
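To make a couple of these metrics concrete, here is a minimal sketch of how Feature Lead Time and Mean Time To Recover might be computed from timestamps. The record shapes, sample data, and function names are assumptions for illustration only; they are not the tooling our team actually used, and choosing the median for lead time is just one reasonable option.

```python
from datetime import datetime
from statistics import mean, median


def hours_between(start: str, end: str) -> float:
    """Elapsed hours between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 3600


# Hypothetical sample data: (first commit, deployed to production)
features = [
    ("2023-03-01T09:00", "2023-03-08T09:00"),
    ("2023-03-02T09:00", "2023-03-16T09:00"),
    ("2023-03-05T09:00", "2023-03-10T09:00"),
]

# Hypothetical sample data: (incident detected, incident resolved)
incidents = [
    ("2023-03-04T10:00", "2023-03-04T12:00"),
    ("2023-03-11T22:00", "2023-03-12T04:00"),
]


def feature_lead_time_days(records) -> float:
    """Median days from first commit to production (assumed definition)."""
    return median(hours_between(s, e) for s, e in records) / 24


def mean_time_to_recover_hours(records) -> float:
    """Mean hours from detection to resolution (assumed definition)."""
    return mean(hours_between(s, e) for s, e in records)


print(feature_lead_time_days(features))
print(mean_time_to_recover_hours(incidents))
```

The median keeps one unusually slow feature from dominating the lead-time number, while a plain mean for recovery time matches the metric's name. Either choice could reasonably be swapped depending on what a team wants to highlight.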
It was now possible to talk to our product partners about improving these metrics and getting buy-in to do so during the team’s regular processes.
The team was no longer asking for time to modularize, to clean up, to break off services, to pay back technical debt. The team was now asking for time to improve customer experience through metrics. Furthermore, the team also had metrics to show progress related to investment (of time) in paying back technical debt.
Related to technical debt payback, we learned that:
- It is more difficult to argue for paying technical debt back in terms of activity (code cleanup, breaking off services, modularizing, etc.).
- When expressed purely anecdotally and qualitatively, the impact of technical debt on engineering teams can seem diminished, especially when it competes with, and is compared to, the impact of new features.
- It can be very powerful to express the impact of technical debt in terms of metrics that product and business can more easily understand (leaving the activity as a ‘how’).
- Having metrics can also help with judging when it is time to pay technical debt back and when it is better to postpone further.
- Metrics can also help chart the efficacy of the ‘how’: did modularizing help improve the metric?
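As a rough illustration of the last point, the sketch below compares a metric before and after a piece of technical debt work. The sample lead times and the function name are hypothetical, and a simple median comparison like this cannot by itself prove that the work caused the change.

```python
from statistics import median


def relative_change(before, after) -> float:
    """Fractional change in the median of a metric (negative means it went down)."""
    baseline = median(before)
    return (median(after) - baseline) / baseline


# Hypothetical feature lead times in days, before and after modularizing
lead_time_before = [10, 12, 14]
lead_time_after = [6, 9, 8]

change = relative_change(lead_time_before, lead_time_after)
print(f"Median lead time changed by {change:.0%}")
```

For a metric where lower is better, such as lead time, a negative result suggests the ‘how’ is paying off; a result near zero is a prompt to revisit the approach.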
On the Next Episode
Tune in to the next episode where the engineering team still struggled to allot time for some forms of paying down technical debt, even with metrics.