Episode #7: Indirectly Measuring Impact of Technical Debt

[ guest-post  metrics  series-managing-metrics  ]

The One Where We Used Metrics to Increase the Autonomy of Development Teams

Previously on Tales From a Journey Around Metrics

We last interrupted our regular Managing Metrics episodes by digging into engineering service level metrics. Before that, in Episode 6, Mojtaba covered how metrics can help measure the impact of technical debt and its payback. However, it is not always easy to tie technical debt payback directly to a metric that business and product can support and understand.

Executive Demands Visibility Into All Development Work

In the last organization, the business unit executive demanded visibility into each development team’s backlog and work in progress. This demand stemmed from a desire to know how the engineering work was impacting the business. The question asked was:

“How do we know that engineering is investing its time and efforts appropriately to help our customers?”

Important But Hard-to-Measure Engineering Work

For certain types of work, development teams had a hard time articulating how that work directly added business value:

  • Keeping packages and infrastructure up to date
  • Cleaning up dead code
  • Completing production-readiness checklist items
  • Refactoring
  • Fixing bugs that were annoying to developers (but had not been raised by customers)
  • Spending time learning about new technologies, frameworks, programming languages, and design patterns

Sometimes it is not possible or cost-effective to directly measure the business impact of such work. Yet these items are vital to the health of the development team and the software they maintain. What is the team to do?

Approach A: Include such work as a development tax

Some teams can include this category of engineering work as a kind of ‘tax’ on top of feature and bug-fixing work. That is, each developer automatically adds a buffer to their estimates in order to take on some of the work above.
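For illustration, here is a minimal Python sketch of how such a hidden ‘tax’ might be baked into estimates. The 20% rate, the work items, and the day figures are hypothetical assumptions, not numbers from the organization described here:

```python
# Minimal sketch of Approach A: padding each estimate with a hidden
# engineering 'tax'. The rate and estimates below are hypothetical.

DEV_TAX = 0.20  # fraction of each estimate reserved for engineering work


def padded_estimate(feature_days: float) -> float:
    """Return the outward-facing estimate with the tax baked in."""
    return feature_days * (1 + DEV_TAX)


raw_estimates = {"checkout flow": 5.0, "search filters": 3.0}
for item, days in raw_estimates.items():
    print(f"{item}: quote {padded_estimate(days):.1f} days "
          f"({days:.1f} feature + {days * DEV_TAX:.1f} tax)")
```

Note that the tax never appears in what is communicated outward, which is exactly the transparency problem listed under the cons below.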

Pros:

  • Does not need explicit approval to get the important work done
  • The engineers and team have autonomy to spend this ‘tax’ as they see fit
  • To the outside, it looks like the team is only delivering features and bug fixes

Cons:

  • There is a lack of transparency, which for larger ‘tax’ items can be troublesome
  • Due to the lack of transparency, others in the organization learn less from this work
  • Given that most estimates are optimistic, most engineers will eat into this ‘buffer’ when their feature and bug-fix estimates turn out to be insufficient
  • This approach inherently signals a lack of trust

Approach B: Explicitly setting aside a % of team capacity for tech debt and other engineering work

A more transparent approach is to measure and set aside an agreed-upon % of a team’s engineering capacity for work whose direct business impact is difficult to measure.
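As a rough illustration, here is a minimal sketch of what explicitly reserving a share of measured capacity might look like. The 20% share and the velocity figure are hypothetical assumptions:

```python
# Minimal sketch of Approach B: explicitly reserving an agreed-upon
# share of measured team capacity. All numbers are hypothetical.

TECH_DEBT_SHARE = 0.20  # agreed-upon % of capacity for engineering work


def plan_sprint(measured_velocity: float) -> dict[str, float]:
    """Split measured capacity into feature/bug work and reserved work."""
    reserved = measured_velocity * TECH_DEBT_SHARE
    return {
        "feature_and_bug_points": measured_velocity - reserved,
        "tech_debt_points": reserved,
    }


print(plan_sprint(measured_velocity=40.0))
# {'feature_and_bug_points': 32.0, 'tech_debt_points': 8.0}
```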

Pros:

  • Transparent investment of time into important work
  • Team has autonomy to invest the % as it sees fit

Cons:

  • Needs the team to have some measure of its capacity and workload (otherwise the % set aside for tech debt will give way to features and bug fixes)
  • The dedicated % can be viewed as ‘optional’ and can be reduced by management

Executive Pushes Back

In the last org, we opted for Approach B but ran into pushback from the executive:

“How do we know the team is investing the % appropriately?”

We handled this by agreeing with the executive and stakeholders on the key metrics that they and customers care about. We then proposed that if these metrics improve over time, it stands to reason that the team is spending the % wisely. In other words, if the team is delivering features and bug fixes while also improving its other metrics over time, it must be using the allotted % wisely.

Metrics To Measure Indirect Impact

It can vary from team to team, but we used these metrics in support of the % of engineering capacity spent on non-feature, non-bug-fix work (a small sketch of one way to check such trends follows the list):

  • The four DORA metrics: if the team is improving these over time while delivering features and bug fixes, one can assume that the capacity % dedicated to technical debt is being used for that purpose
  • Service level metrics: if these are improving over time (e.g. cloud costs go down or stay constant as load increases) while the team is delivering features and bug fixes, the team must be appropriately using the % dedicated to technical debt
  • Team performance metrics: if velocity or cycle time is improving over time with the same team and the same sizing approach, then the team is investing its dedicated capacity well
  • New team member onboarding time: if this is improving over time, or staying constant as the team and its services grow, it shows that the team is putting the capacity % dedicated to technical debt to good use
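To make ‘improving over time’ concrete, here is a minimal sketch of one way to check whether a proxy metric trends in the right direction. The metric, its per-sprint values, and the simple least-squares slope test are all illustrative assumptions rather than the method we actually used:

```python
# Minimal sketch: is a proxy metric trending in the right direction?
# The metric values and the slope test are illustrative only.

from statistics import mean


def slope(values: list[float]) -> float:
    """Least-squares slope of values over equally spaced periods."""
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Hypothetical cycle time in days per sprint; lower is better.
cycle_time = [9.0, 8.5, 8.8, 7.9, 7.2, 6.8]
print("cycle time improving:", slope(cycle_time) < 0)  # True
```

The same check works for metrics where higher is better (e.g. deployment frequency) by testing for a positive slope instead.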

Lessons Learned

At the last org we learned that teams can increase their autonomy over what they work on:

  • Not all technical debt payback can have an easy, direct metric or business justification
  • It is good to set aside a % of a team’s capacity for technical debt payback
  • It is then important to measure the team’s capacity and workload (in order to properly set aside a %)
  • It is then possible to use proxy metrics to show that the % the team invests at its own discretion is invested wisely

[ Guest post from Mojtaba Hosseini ]

Written on June 13, 2022