LLMs and tech debt

Will the future of software engineering be dominated by AI-generated technical debt?
2025-07-07 · AI · Software engineering

Some people are concerned that the future of software engineering will be stuck with a vast amount of AI slop tech debt. Just like a poorly managed datacenter, software will become harder to manage. The fear is that software engineers will churn out poorly thought-out code at a rapid pace, with no care for system design.


I disagree. I’m optimistic that code quality will improve over time.

Guard rails

I believe what we consider “acceptable code churn” will increase significantly. Engineering teams have to change software as business needs evolve, and today that is generally done by adapting the existing code. I predict that code will instead be rewritten on a more frequent basis. When the business needs change, the code has to change too. Sometimes the delta between the old and new requirements is so large that it creates huge tension; the answer will increasingly be to partially rewrite the code. The team can then integrate its “learnings”, so the code gets better over time. Contrast that with the status quo of shoehorning new features in. This is a good thing!

High code churn increases the odds of mistakes: it’s easy to introduce a new bug by accident. High-churn projects therefore require good guard rails, i.e. feedback loops (observability) and test coverage. Vibe coding requires good guard rails too, but for a different reason: LLMs tend to make even more mistakes than humans, at least for now. These fundamentally different pressures converge on the same best software engineering practices. This is good!

Good guard rails cost a lot. I built Google Chrome’s precommit testing infrastructure in 2008. Running tests before merging a change wasn’t common back then, but it was absolutely needed. It enabled the Google Chrome project to go from an ad-hoc release cycle in 2008 to a release every 6 weeks in 2010, then every 4 weeks in 2021, and to scale to over 1000 contributors landing thousands of changes per week! Yet, when is the last time you saw a Chrome regression? It happens! It’s also rarely visible, thanks to its guard rails.
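
To give a sense of what the smallest version of such a guard rail looks like, here is a hedged sketch of a precommit gate that refuses a merge when the linter or the test suite fails. This is not Chrome’s actual infrastructure; the tool choices (ruff, pytest) are assumptions for illustration.

```python
#!/usr/bin/env python3
"""Minimal precommit gate: run the linter and the tests before allowing a merge.

A sketch, not Chrome's infrastructure; the tool choices (ruff, pytest) are
assumptions for illustration.
"""
import subprocess
import sys

CHECKS = [
    ["ruff", "check", "."],           # fast static analysis
    ["pytest", "-q", "--maxfail=1"],  # unit tests, stop at the first failure
]

def main() -> int:
    for cmd in CHECKS:
        print(f"precommit: running {' '.join(cmd)}")
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"precommit: {cmd[0]} failed, refusing the merge.")
            return result.returncode
    print("precommit: all checks passed.")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

The script itself is trivial; the real cost is in scaling it to thousands of changes per week, sharding the tests, deflaking them, and keeping the cycle time short.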

I believe the cost of this kind of guard rail infrastructure will drop over time. We’ll take it for granted, just like precommit testing is taken for granted in 2025.

When rewriting a software project from scratch, the nextgen syndrome often kicks in and it takes years for the new version to replace the old one. More often than not, Hyrum’s law causes an incredible amount of pain during the deployment phase.

Conformance

One of the most recognized ways to alleviate the problem is to write a conformance test suite. Nearly every software engineer hates writing that kind of code. It’s so tedious and boring! Android famously contracted poorly paid people to write its Compatibility Test Suite (CTS), and the tests are … not great. The rule of thumb is to run the failing tests repeatedly until they pass. 🫣
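
To make the tedium concrete, here is roughly what a single conformance check looks like, sketched against a hypothetical parse_duration API rather than any real suite. A full suite is hundreds of these, one per documented behavior.

```python
import unittest

# Hypothetical API under test: parse_duration("1h30m") returns seconds.
from mylib import parse_duration  # "mylib" is a placeholder module name

class DurationConformance(unittest.TestCase):
    """One documented behavior per test; a real suite has hundreds of these."""

    def test_hours_and_minutes(self):
        self.assertEqual(parse_duration("1h30m"), 5400)

    def test_zero(self):
        self.assertEqual(parse_duration("0s"), 0)

    def test_rejects_empty_string(self):
        with self.assertRaises(ValueError):
            parse_duration("")

if __name__ == "__main__":
    unittest.main()
```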

You already know what doesn’t mind writing good and extensive conformance test suites: LLMs.

LLMs are also good at non-determinism by design. They can generate far more randomized test cases, and when they have access to the source code, they can find issues by running analyses hundreds of times. I believe we’ll start to see an improvement in software quality, and a better understanding of undocumented and undesired software behavior, through a combination of LLM-based white-box understanding and black-box conformance tests. Leveraging these will reduce the cost and risk of rewriting software, aka the nextgen syndrome. I believe that high-performing teams will shed legacy faster thanks to core improvements to their dev flow and tooling.
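
A hedged sketch of what that can look like in practice: a randomized differential test that feeds the same generated inputs to the legacy implementation and to its rewrite, and records every divergence. The slugify functions and the input generator here are placeholders, not from any real codebase.

```python
import random
import string

def legacy_slugify(title: str) -> str:
    """Placeholder for the old implementation being replaced."""
    return "-".join(title.lower().split())

def new_slugify(title: str) -> str:
    """Placeholder for the rewritten implementation."""
    return "-".join(part for part in title.lower().split(" ") if part)

def random_title(rng: random.Random) -> str:
    """Generate a short random string, deliberately including awkward whitespace."""
    alphabet = string.ascii_letters + "  \t"
    return "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 20)))

def differential_test(runs: int = 10_000, seed: int = 0) -> list[tuple[str, str, str]]:
    """Run both implementations on the same random inputs and collect divergences."""
    rng = random.Random(seed)
    divergences = []
    for _ in range(runs):
        title = random_title(rng)
        old, new = legacy_slugify(title), new_slugify(title)
        if old != new:
            divergences.append((title, old, new))
    return divergences

if __name__ == "__main__":
    for title, old, new in differential_test()[:10]:
        print(f"{title!r}: legacy={old!r} rewrite={new!r}")
```

Each divergence is either a bug in the rewrite or an undocumented behavior the old code had accreted; both are exactly what you want surfaced before deployment rather than after (Hyrum’s law at work).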

Conclusion

Thus I’m super optimistic for the future of software engineering! High-performing engineering teams will lean in and leverage LLMs to do what they historically hated doing: conformance testing and finding undesired behavior. This will result in higher-quality services and a lower perceived sunk cost to rewrite software in place.

Where I’m a bit less optimistic is opaque dependencies. The more a team outsources its stack, the less control it has over its destiny, and the harder it becomes to ensure conformance when transitioning from one provider to another. My hypothesis is that small teams will more often choose single-purpose in-house solutions over the current standard of leveraging generalized outsourced services. Specialized single-purpose software will become more appealing than generalized off-the-shelf services. The trade-off will be clear: the team can better respond to evolving business needs, leveraging LLMs to reduce the risks.

My takeaway is the same as in the past decades: build good guard rails and be mindful of the dependencies you take on.