This week, we’re talking about factoring: an abstract yet critical part of software development. We’ll use imprecise (but accurate) metaphors from the physical world, deferring technical details to later posts.
First, consider a question you may never have asked yourself, as the answer seems so obvious: Why do we organize our physical world into rectilinear boxes? Bind pages into covered books, rather than merely stapling together piles of oddly shaped paper? Throw kitchenware into drawers? Why, when building a wall of clay, do we fashion the clay into bricks?
We do it because the world is complicated, and simple abstractions help us cope. Boxes are easier to understand, reference, and manipulate than sets of irregular objects, or pools of mud. Grouping items into a neatly shaped container lets us perceive them collectively as a surface rather than a volume, and that scales well: As any solid is scaled up uniformly, its surface area grows only with the square of its linear size, while its volume grows with the cube.
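To put rough numbers on the surface-versus-volume claim, here is a minimal Python sketch. The choice of a cube and the helper name `cube_surface_and_volume` are illustrative assumptions, not part of the original metaphor.

```python
def cube_surface_and_volume(side: float) -> tuple[float, float]:
    """Return (surface area, volume) of a cube with the given side length."""
    return 6 * side ** 2, side ** 3


# Compare how quickly each quantity grows as the box gets bigger.
for side in (1, 10, 100):
    surface, volume = cube_surface_and_volume(side)
    print(f"side={side} surface={surface} volume={volume} ratio={surface / volume:.2f}")
```

For a cube, the ratio of surface to volume is 6 divided by the side length, so the "cover" becomes an ever smaller fraction of the "contents" as the box grows.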
Intuitive though this principle seems, it can be difficult to apply in the virtual world of code and ideas, where its value is ironically greatest. The arrangement of code into logically distinct entities is called “factoring.” You’ve probably heard the term “refactoring” more often, because factoring is rarely given due attention during initial design. We too often wait until problems rear their ugly heads, then refactor in a post hoc game of whack-a-mole. By understanding a few basic principles, we can factor well a priori; and thus, like chefs who know the worth of mise en place, work more efficiently and waste less time cleaning up our own mess.
I like refried beans. That’s why I wanna try fried beans, because maybe they’re just as good and we’re just wasting time.
—Mitch Hedberg
Printed books are a comfortable waypoint on the journey to understanding. If brick walls are at one end of the physical/virtual spectrum, and software is at the other, then books are at the midpoint. The difference between surface and volume is the difference between reading a book’s cover and reading its contents: The former remains easy as we consider larger and larger books, while the latter becomes prohibitively difficult. This axiom of different growth rates applies not only in the 3D physical world, but also in an N-dimensional hyperspace whose axes are all the semantically orthogonal properties we actually care about.
In software, we call surfaces “interfaces,” and volumes “implementations.” Keeping interfaces starkly clean frees implementations to be messy, much as closing the door to your slovenly teenager’s bedroom saves you from being assaulted by the sight of a pigsty every time you walk down the hall. Forcing components to communicate only through stable interfaces even lets us replace implementations entirely without disrupting the larger system. Much as a healthy economy allows no single business to become “too big to fail,” a healthy microservice architecture brooks no service that is “too big to rewrite.”
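A small Python sketch may help make the surface/volume split tangible. Everything named here (`KeyValueStore`, `DictStore`, `LogStore`, `check_contract`) is hypothetical and invented for illustration; it is one possible shape, not a prescribed design.

```python
from typing import Optional, Protocol


class KeyValueStore(Protocol):
    """The surface: the only thing callers are allowed to know about."""

    def get(self, key: str) -> Optional[str]: ...

    def put(self, key: str, value: str) -> None: ...


class DictStore:
    """One volume: a plain in-memory dictionary."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value


class LogStore:
    """A messier volume: an append-only log, scanned from newest to oldest."""

    def __init__(self) -> None:
        self._log: list[tuple[str, str]] = []

    def get(self, key: str) -> Optional[str]:
        for logged_key, logged_value in reversed(self._log):
            if logged_key == key:
                return logged_value
        return None

    def put(self, key: str, value: str) -> None:
        self._log.append((key, value))


def check_contract(store: KeyValueStore) -> Optional[str]:
    """Exercise a store using nothing but the interface."""
    store.put("greeting", "hello")
    store.put("greeting", "hello again")
    return store.get("greeting")
```

Both `check_contract(DictStore())` and `check_contract(LogStore())` return the most recently stored value, even though the two classes keep their data in very different ways. Code written against the surface cannot tell which volume sits behind it, which is what makes a rewrite safe.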
Well-meaning engineers sometimes insist that the goal of simple, stable interfaces fronting completely interchangeable implementations is unrealistic, or prohibitively expensive. Not only is it realistic, but it tremendously improves the quality of life, and the speed of ongoing work, for most developers. Engineers may be apprehensive because they’re concerned about implicit dependencies that aren’t captured by clean interfaces: What if supposedly independent components actually communicate through some back door mechanism like shared database access? It’s well worth rooting out such back doors, for reasons touched on by Steve Yegge’s classic Google Platforms Rant. For new systems, defining proper APIs up front lets us avoid back doors entirely.
Another advantage of the brick shape is that its consequences are easily foreseen, insofar as we can reason about them inductively. The human mind cannot simultaneously visualize a million distinct objects, but we can easily imagine a few Platonic tiles arranged into simple configurations, and extrapolate patterns to understand huge systems intuitively. In software, this means keeping interfaces regular and consistent, so we can think not only of specific chunks of code in all their idiosyncrasy, but of abstract, composable “components.”
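As a rough illustration of regular, composable components, consider the following Python sketch. The three text-cleaning functions are made up for this example; the point is only that each has the same shape (a string in, a string out), so a stack of them has that shape too.

```python
def strip_whitespace(text: str) -> str:
    """Remove leading and trailing whitespace."""
    return text.strip()


def collapse_spaces(text: str) -> str:
    """Replace every run of whitespace with a single space."""
    return " ".join(text.split())


def lowercase(text: str) -> str:
    """Convert the text to lowercase."""
    return text.lower()


def normalize(text: str) -> str:
    """A small wall of three bricks. It has the same shape as each brick,
    so it can be stacked into larger walls in exactly the same way."""
    return lowercase(collapse_spaces(strip_whitespace(text)))
```

Because every piece shares one shape, we can reason about `normalize` without holding its parts in our heads, and about anything built on `normalize` in the same way. That is the inductive step the brick metaphor relies on.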
Why bricks, rather than balls? Surely a shape without corners is simplest, no? Corners are a concession we make for compatibility; a sacrifice to the gods of scale. An orb in isolation may be simpler than a brick, but curved surfaces are less composable than flat ones, because they leave gaps when stacked. Constructing a stable wall of spheres would require a lot of interstitial mortar. In software, such mortar is called “glue code,” and it comes in the form of Design Patterns like Adapter, Facade, and Bridge. Glue is a necessary evil, best minimized through careful interface design.
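Here is a minimal sketch of glue code in the Adapter style, written in Python. The sensor classes and their readings are hypothetical stand-ins for two components whose surfaces do not line up.

```python
class CelsiusSensor:
    """The shape the rest of the wall expects: a read_celsius() method."""

    def read_celsius(self) -> float:
        return 21.5  # fixed reading, for illustration only


class FahrenheitGauge:
    """A hypothetical component with a different shape and different units."""

    def temperature_f(self) -> float:
        return 212.0  # fixed reading, for illustration only


class FahrenheitGaugeAdapter:
    """Glue code: makes a FahrenheitGauge fit where a CelsiusSensor is expected."""

    def __init__(self) -> None:
        self._gauge = FahrenheitGauge()

    def read_celsius(self) -> float:
        return (self._gauge.temperature_f() - 32.0) * 5.0 / 9.0


def average_celsius(sensors) -> float:
    """Average the readings of anything that offers read_celsius()."""
    readings = [sensor.read_celsius() for sensor in sensors]
    return sum(readings) / len(readings)
```

The adapter contributes no behavior of its own; it exists only to fill the gap between two mismatched surfaces. Had the gauge offered `read_celsius()` in the first place, the mortar would be unnecessary.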
Many parts of the pyramids are simply well-cut stone blocks, placed closely next to and atop one another. … Buildings made from stone blocks cut to fit closely together are very stable and don’t need mortar. Just stack them up carefully and they stay there.
However, in some parts of the interior, the blocks weren’t so carefully cut. That kind of precision is very expensive. Instead, the builders used more roughly cut stones and filled in gaps with rubble and mortar.
—Matt Riggsby, MA Archaeology, via Quora
Finally, let’s talk about concavity. Cavities are pits in our bricks, like wormholes in an apple. Good, solid components define their own behavior and dependencies explicitly, rather than exposing hooks or extension points for others to fill in (an approach sometimes called inversion of control). Certain cavities are insidiously in vogue, even considered best practice, because their exorbitant cost is not recognized. (Software development and maintenance often have non-obvious cost drivers. Developers seldom get to try competing architectures in production, or conduct a double-blind study.) Regardless of the apparent upside, avoid these cavities except where truly, madly, deeply necessary:
Dependency Injection and Provider models
Flag parameters, callback functions, and policy objects
Configuration files and environment variables
Implementation inheritance
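To make one item from this list concrete, here is a hedged Python sketch of a flag parameter alongside one possible cavity-free alternative. The function names are hypothetical, and this is only one reading of the advice; the technical details are left to later posts.

```python
# With a cavity: the caller reaches inside and flips a switch,
# so one function quietly has two behaviors.
def render_name(first: str, last: str, formal: bool = False) -> str:
    if formal:
        return f"{last}, {first}"
    return f"{first} {last}"


# Without the cavity: two solid bricks, each doing exactly one thing.
def casual_name(first: str, last: str) -> str:
    return f"{first} {last}"


def formal_name(first: str, last: str) -> str:
    return f"{last}, {first}"
```

A call like `render_name("Ada", "Lovelace", True)` forces the reader to look up what `True` means, whereas `formal_name("Ada", "Lovelace")` says so on its surface.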
Keep your interfaces simple, regular, and free of cavities. Isolate your implementations. May your bricks be elegant, and all your walls be strong.
Simpler Bricks and Stronger Walls
I feel like each week I compliment your posts and it seems artificial, except it's not. They're really f*cking good. They're so accessible to non-software engineers and I'm learning more precise language around concepts I instinctively understand, but couldn't properly articulate. I feel like Product Managers should have to read this!!!! My instinct is initially to be like dude too much technical jargon, but when I read the context, I realize it's precision.