Bug Extinction
Bugs are bad
There is surely too much hate in the world. But you know what we can all hate together? Bugs. Estimates of the annual worldwide cost of software defects vary, but are generally on the order of trillions of US dollars. The work of finding and fixing bugs is estimated to comprise more than 60% of total software development costs. Formal studies probably underestimate the real cost by a wide margin, because they focus on quantities that can be accounted for directly.
Undergraduate engineering coursework includes study of disasters that have resulted in loss of life, including watching video of the Hindenburg and Challenger explosions. Any pecuniary value assigned to such tragedies would be grotesquely inadequate.
🪤 Computer science and bootcamp students are generally not made to study past engineering disasters; yet we continue hiring these folks into engineering roles, despite (or perhaps explaining) the fact that more and more people die because of bugs. (I’m not blaming the students, or even the schools. I’m blaming industry for assigning engineering work to people who were not formally trained as engineers.)
Ripple effects that are hard to measure may in fact be dominant. For want of a bug fix, the account was lost; for want of revenue, earnings were missed; for want of investor confidence, equity was devalued; for want of incentive stock options, talent couldn’t be hired or retained; and for want of an effective team, the company was uncompetitive. The link between product quality and a company’s ability to survive and thrive should be obvious, but it’s hard to quantify, much less prove.
Let’s end them
Humanity has already eradicated, or nearly eradicated, multiple pathogens, including smallpox and polio, and (notwithstanding the anti-vax movement) we’re getting better and better at it. In fact, we’re so good at driving species to extinction that we frequently do so accidentally, primarily by killing organisms directly (what the American Museum of Natural History euphemistically calls “overharvesting”) or by destroying habitats.
Software bugs are no less susceptible to extinction than living creatures, and in principle have the same vulnerabilities.
Loss of habitat
The easiest place for bugs to hide is anywhere you have multiple sources of truth. Caches are probably the worst offenders. Caching data that may change over time is obviously a deal with the devil, as you choose to accept the wrong answer now rather than the right answer later.
Suppose a video game saves state on a server but also caches it in local storage, so that the user can start playing right away when they open the game rather than waiting for state to be retrieved from the server. If the user ever plays on a second device and then comes back to the first device, the local storage may be out of date. Any gameplay based on that cached state may conflict with changes from the second device. Oops, your player’s health wasn’t supposed to be that high, because they took damage while playing on the second machine. Reconciling differences like these can be tricky, and is a common source of bugs.
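One way to at least notice a stale cache is to stamp every save with a version and compare stamps once the server answers. The sketch below is a minimal illustration, not a full reconciliation scheme: `GameState`, its `version` counter, and `choose_state` are all hypothetical names, and it assumes the server bumps the version on every save.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GameState:
    version: int  # hypothetical counter, assumed to be bumped by the server on every save
    health: int


def choose_state(cached: Optional[GameState], server: GameState) -> GameState:
    """Pick which state to play from once the server has answered."""
    if cached is None or cached.version < server.version:
        # The cache is missing or stale, so the server copy wins.
        return server
    return cached


# Device 1 cached this before the player switched machines.
cached_on_device_1 = GameState(version=7, health=100)
# Device 2 saved later, after the player took damage.
saved_by_device_2 = GameState(version=8, health=40)

current = choose_state(cached_on_device_1, saved_by_device_2)
print(current.health)  # 40
```

This only detects staleness. If the player has already acted on the cached state before the server responds, those actions still have to be merged or discarded, which is exactly the tricky part.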
Conflicting truths can also come from parts of a codebase that nominally implement the same logic. Suppose we use two separate routines to sort people by height, and the first routine orders Alice, Bob, Charlie, and Dana thus:
The second routine produces a different result:
Both results are correct, because Bob and Charlie happen to be the same height. Yet such gratuitously different results can make software misbehave. We discussed a similar issue in Traversal Order.
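Here is a small sketch of how that kind of tie can surface and how a shared routine with an explicit tie-breaker removes it. The heights are invented for illustration (the only thing assumed from the text is that Bob and Charlie are equal), and sorting the same people from two different starting orders stands in for the two separate routines.

```python
# Heights in centimetres are made up for this example; all that matters is
# that Bob and Charlie are the same height.
heights = {"Alice": 160, "Bob": 175, "Charlie": 175, "Dana": 182}

roster = ["Alice", "Bob", "Charlie", "Dana"]
reversed_roster = list(reversed(roster))

# Python's sort is stable, so tied entries keep their input order. Two
# callers that start from different orders get different, equally valid results.
by_height_a = sorted(roster, key=lambda name: heights[name])
by_height_b = sorted(reversed_roster, key=lambda name: heights[name])
print(by_height_a)  # ['Alice', 'Bob', 'Charlie', 'Dana']
print(by_height_b)  # ['Alice', 'Charlie', 'Bob', 'Dana']


def sort_by_height(names):
    """One shared routine: height first, then name as a tie-breaker."""
    return sorted(names, key=lambda name: (heights[name], name))


# With a total ordering, the input order no longer leaks into the output.
assert sort_by_height(roster) == sort_by_height(reversed_roster)
```

Breaking ties by name is an arbitrary choice here. Any rule works as long as it is total and every caller goes through the same routine.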
The ideal solution to bugs like these is to build systems that are correct by construction. Having a Single Source of Truth (SSOT) guarantees that discrepancies cannot arise. This isn’t usually hard to do, but it requires a certain mindset to even recognize the problem: Redundant code and data storage are potential bug nests.
Never go to sea with two chronometers; take one or three.
—via Fred Brooks, The Mythical Man Month
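As a rough illustration of a Single Source of Truth in code, the sketch below keeps one list of prices and derives the total on demand instead of storing a second, separately updated total field. `Cart` and its members are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cart:
    # The only stored data: individual prices, in cents.
    prices_in_cents: List[int] = field(default_factory=list)

    def add(self, price_in_cents: int) -> None:
        self.prices_in_cents.append(price_in_cents)

    @property
    def total_in_cents(self) -> int:
        # Derived on demand, so there is no second copy to fall out of sync.
        return sum(self.prices_in_cents)


cart = Cart()
cart.add(1250)
cart.add(399)
print(cart.total_in_cents)  # 1649

# Even a change that bypasses add() cannot make the total disagree.
cart.prices_in_cents.pop()
print(cart.total_in_cents)  # 1250
```

The design trades a little recomputation for the guarantee that the total and the items cannot drift apart. A stored running total would be faster to read, but it would be a second chronometer.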
“Overharvesting”
This may sound obvious, but another critical step toward eradicating bugs is to find and fix them. This doesn’t always happen, mainly because of perceived trade-offs.
Whether to prioritize bug fixes (or other quality measures) over other work is a business decision: The upside of a new feature may outweigh the perceived potential cost of quality issues. However, decision makers often mistakenly think bugs are stand-alone things. That’s like assuming the cockroach you just saw skitter across the restaurant floor lives alone. Bugs live in colonies, and every one you let live is bound to spawn others.
“The only good bug is a dead bug.”
—Edward Neumeier, Starship Troopers
Question #5 on The Joel Test is: Do you fix bugs before writing new code? That’s a pretty good litmus test for distinguishing great teams from mediocre ones. If we really want to end bugs, it’s not enough to “harvest” the obvious, urgent problems. We have to overharvest if we’re to avoid their insidious second-order effects; essentially, to stop them from multiplying.
Eliminating bugs entirely is an ambitious goal, but may be achievable. The first step is to recognize their real cost to both individual initiatives and the world at large, which is frankly staggering. The second step is to destroy (or at least stop constructing) places for bugs to hide, mainly by identifying and protecting SSOTs, and otherwise prioritizing code that is correct by construction. The third step is to hunt and kill them without ruth or delay. Finally, we must prioritize quality in sales and marketing. Customers demand lots of different things, from product features to favorable contract terms to white glove treatment from account managers; but collectively, as an industry, we have to make quality the focus of our shared conversation.