Recently, I encountered a conversation about "clean code." This was in the context of an engineering organization trying to improve the quality of its output. Because I am a grumpy old contrarian, my first question was, "Why improve the quality?" It seems axiomatic to me that the software we write should serve its customers—either the users or the businesses who commission it. What end does "quality" software serve? What value does "quality" provide, and why do engineers or businesses care about it at all?
This organization is not alone. There are a lot of companies (and a lot of engineers) trying desperately to quantify "quality" software metrics. A lot of companies (every company?) claim to value "high quality software." They aim to produce this high-quality software with a list of metrics or architectural guidelines that developers can slavishly follow toward "Clean Code." Rules and guidelines, even well-intentioned ones like Sandi Metz's 5 Rules, are doomed to fail because they try to combat company cultures that have already decided on what they value in software, and it ain't "Clean Code" or "Software Quality."
If you are writing down "rules" and insisting that developers abide by them, it's probably because your developers are continuously doing things you wish they wouldn't. Usually, this isn't because your developers don't understand "the rules" and / or don't like you—it's because they know what the organization values, and those values are in conflict with your "rules," and they're trying to deliver that value.
Codifying a bunch of rules (especially if they aren't currently being followed) in order to move your organization toward "quality software" is picking a fight between your company's culture and values and your software development practices—and in a fight between culture and anything else: Culture will win. Every. Single. Time.
"Quality" is a proxy for "Value"
"Quality" is in the eye of the beholder. Successful code, valued code, is code that solves the business problem. I'm not sure that there's a universal definition of "good code"—but I suspect that engineers will whine less about code that is easy for them to work on. (Note: engineering whining is asymptotic to zero.) I'm not sure that's useful as a universal quality metric, since what is "good" is going to vary from team to team. Sure, there are a couple of things that are generally pretty reliable as indicators, but in reality there is no valid business judgement of code other than "adds value," and there's no way to tell if something "adds value" before it's in actual use. Code that is an incomprehensible mess but makes the company $10M a year or fulfills some critical function is obviously better than the most beautiful, artistic architecture that runs only on a developer's laptop.1
The main thrust of "agile development" is to deliver the value provided by software to the user as quickly as possible. The whole idea of "coding standards" uninformed by software value is anti-agile, and there are no external metrics for code that make sense in all circumstances.
In other words: It is impossible to write guidelines for what a piece of quality software looks like before you have deployed that software.
There's a military axiom—"You can dictate results or you can dictate process, but not both." Now, dictating process is absolutely the right thing to do when you have a lot of hands working on a problem and high consequences of failure. In fact, if the consequences of failure are greater than the rewards of success, you should absolutely dictate process and optimize against failure. Good examples of this are airplanes and medicine. A failure in medicine means that you die, while a success means you are cured of a disease that may or may not kill you.2 A failure in an airplane means your airplane crashes and everyone aboard dies, while success means that you get to Sheboygan on time. Clearly the failure is much "more bad" than the success is good!
So, why aren't more companies "agile?" Why are so many businesses interested in making a list of rules to follow so that developers can produce "good code" that "makes money?"—and why does this fail so often? A list of rules that developers must follow is dictating process. It is optimizing against failures that might happen with a checklist of things that will prevent those failures. This is the wrong approach for agile development, which requires that we embrace failure as a learning experience, abandon the dictation of process and rules, and focus on the results—on the value delivered by working software.
The problem with abandoning the fear of failure is that it requires trusting our developers, a lot. It requires there be a generative culture and an organization that is optimized toward success rather than optimized against failure. It requires profound vulnerability on the part of leadership, and an act of faith in our people that can be very difficult to perform, because it requires the ultimate act of confidence in ourselves. "I have done a good enough job hiring and training my engineers that I can trust them to do well without my direction." Agile requires leaders not managers—and leadership is way, way, way harder to perform than management.3
Real software quality is not a set of shared rules, but a set of shared values.
What a pile of unmaintainable code says about your actual values
It's pretty likely that if you've read this far, you're either thinking "Hell yeah!" or "But if I don't tell my developers to <DO TASK>, then they won't do it and our software will be junk / hard to maintain / won't work."
Here's what I propose: When we find ourselves in the second position, it's because we, as software leaders, are confusing our own incentives for the values of the company. Don't panic: it's not only common, it's healthy! We value quality software because quality software is good for the business. We want to deliver the best thing that we can, and if we can control what our developers do at every point in time, then we can ensure that it's done right.
Interrogate the sort of failure that dictating a lot of processual checkpoints like "No nullable Types" would prevent. In fact, software quality checks in most line-of-business software are actually designed to insulate engineering managers from the failure mode of "Executive encounters a bug and yells at everyone," or, "Staff Engineers can't type everything."
Look at all the times developers have broken your well-thought-out "best practices" and carefully constructed rules. Do you think they did it because they are lazy? Because of a lack of knowledge?
Or, do you think that, faced with a conflict between a written rule and the value they are expected to deliver, they chose their values? Would you stop on the interstate to save a stranded puppy? Even though there's a rule against stopping on the interstate?
You can tell your values by the rules that you're willing to break.
Agreeing on Values instead of dictating Rules
Luckily, there's a way to avoid getting yelled at by the CEO. Instead of performing consequence avoidance, let's re-frame the goal around what success looks like: the outcome that would have our executive writing glowing performance reviews and issuing bonus checks. Let's phrase it as, "Users (particularly CEOs) do not encounter bugs that prevent them from accomplishing their tasks on the site."
I mean, "no bugs" is probably both technically impossible and far too expensive to achieve. Even something like an airplane has bugs—what we want is to minimize the impact of those bugs.
If Minimize Impact of Defects on Users is the value that we want to have, then we need to make sure that what we think of as "Software Quality Metrics" actually work toward that value, and we want to make no rules that get between that goal and our developers.
You probably can't imagine ahead of time which rules will get in the way of that goal, so think of it this way:
What rules do you break, and in service of what value?
When you break a rule (and you will), ask yourself "why?" Why was this a good time to break the rules? It might be that "Users don't encounter bugs" is not actually a thing your organization cares about. That's fine. I've definitely worked in situations where the primary goal was to get users to call on the phone. A lack of functionality in the website was actually good if it didn't frighten the user off entirely.
It is both possible and likely that your stated values are not your actual values. I worked for a company that talked a lot about "customer value"—but when it came right down to it, they were willing to cut any corner and bend any previously existing guideline in order to deliver Some Feature to production by tomorrow. Their real value was Speed of Feature Delivery. In fact, that was the right value for them to have, because they didn't really have a clear picture of what value looked like to their customer. Delivery Speed is a perfectly fine value to have, especially if you're trying to rapidly find a path forward or firm up a product plan! The point of this example is not to criticize the company in question, but to demonstrate in stark terms: Your values are the rules you break. Know this and accept this and you'll avoid a huge amount of cognitive dissonance when asking delivery teams to cut quality corners for speed of delivery, or to spend more time on a feature that's releasable, if not quite bug free enough to avoid an executive yelling session.
Rules to find your Values with
All of that said, while I don't believe there are any "rules for quality software" that can be said to be universal, I think there are a couple of "qualities of software" that are widely applicable enough to be useful: places where breaking the rules can be instructive rather than an offense to be punished.
That is: You should break these rules too! And when you do, you should think about the larger value that you are serving.4
You should be able to debug a user-impacting crash in production within 10 minutes. This doesn't necessarily mean you need to have it solved within 10 minutes, but a developer unfamiliar with the specific code (but generally familiar with the code base) should be looking in the right place and be able to have a general idea of what is happening within 10 minutes.
The Rule To Break: Build Observability into your Code
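To make the 10-minute target a little more concrete, here is a minimal sketch of what "built-in observability" might look like, using only Python's standard `logging` and `json` modules. Everything named here (`describe_failure`, `apply_discount`, the `checkout` logger, the field names) is an assumption made up for illustration, not a prescription. The idea is simply that the failure record itself says which operation failed and with what inputs, so a developer who doesn't know this code knows where to start looking.

```python
import json
import logging

logger = logging.getLogger("checkout")  # hypothetical service name


def describe_failure(operation: str, exc: Exception, **context) -> str:
    """Build one JSON line that says what failed, how, and with what inputs."""
    return json.dumps(
        {
            "operation": operation,
            "error_type": type(exc).__name__,
            "error": str(exc),
            "context": context,
        },
        sort_keys=True,
        default=str,  # fall back to str() for values JSON can't encode
    )


def apply_discount(order_id: str, total_cents: int, percent: int) -> int:
    """Hypothetical business function, here only to show the logging."""
    try:
        if not 0 <= percent <= 100:
            raise ValueError(f"percent out of range: {percent}")
        return total_cents - (total_cents * percent) // 100
    except Exception as exc:
        # logger.exception logs at ERROR level and attaches the traceback.
        logger.exception(
            describe_failure(
                "apply_discount",
                exc,
                order_id=order_id,
                total_cents=total_cents,
                percent=percent,
            )
        )
        raise  # the caller still sees the original error
```

Whether this is worth doing everywhere is exactly the kind of thing you'll decide by noticing where you skip it.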
You should be able to write a test against any user-encountered bug within 1 day. You should be able to write this test with a minimal understanding of the method under test. This is why it is so critically important that developers Write Tests First—internalize and believe "If you can't write a test, you don't understand it well enough to write the feature."
The Rule To Break: Write a test for every feature BEFORE you write the feature
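As a rough sketch of what "a test against a user-encountered bug" can look like, here is a regression test written with Python's standard `unittest` module. The bug report, the `normalize_email` function, and the test name are all hypothetical. The point is that the test is phrased in terms of what the user did and saw, so it can be written with very little knowledge of the method's internals, and it exists before the fix does.

```python
import unittest


def normalize_email(raw: str) -> str:
    """Hypothetical function under test: canonical form of a typed-in email."""
    # The strip() is the fix; the test below was written first and failed without it.
    return raw.strip().lower()


class NormalizeEmailRegressionTest(unittest.TestCase):
    def test_pasted_address_with_trailing_space(self):
        # Hypothetical bug report: login failed after pasting "Ada@Example.com ".
        self.assertEqual(normalize_email("Ada@Example.com "), "ada@example.com")
```

Saved in a test module, this runs with `python -m unittest`. If writing a test like this takes longer than a day, that is worth asking "why?" about.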
In metallurgy, malleability is the ability of a metal to bend without breaking. Your code should have this property. A developer familiar with the ecosystem but unfamiliar with the system under test should be able to fix a bug discovered in production code within 2 days. A bug that requires more than 2 days to fix is an indicator of technical debt—a place where you bent the rules for some other value. What was the reason that you did that? Is that value still applicable here, or are our values in conflict?
The Rule To Break: Make your code easy to read and easy to change
4. Slow is smooth and smooth is fast.
In the context of its military origins, this means that you don't sprint places; instead, you do that bad-ass Spec Ops walk like they do in Tom Clancy movies. If you want this in less violent terms: "Slow and steady wins the race." If you encounter bumps, fix the bumps, which will increase your speed. Work smoothly and steadily, removing obstacles that you would otherwise need to vault over as you go. As you progress, you will find your development team purposefully walking toward their objective at a steady, predictable pace rather than trying to run around obstacles and doing heroic-but-hard-to-pull-off parkour development maneuvers. Delay the feature to fix the obstacles to releasing it.
The Rule To Break: Fix obstacles to speed as you encounter them
The last rule is to break all the rules
One of the things about optimizing against failure is that "rules" only look backward. You'll notice that I gave a lot of squishy definitions of code properties, without giving a lot of specific advice like "Make your methods 5 lines long," "Use types at boundaries," "Don't Mock more than 1 thing per test." This is intentional. When you optimize against failure, you can only optimize against known failure modes. You can keep a list of known failure modes and for every failure mode, you can make the fix A Rule You Must Follow. Again, this is a fine solution when you have a very clear picture of the path forward and a good handle on what catastrophic failure modes look like. It's pretty likely that your line-of-business software doesn't have this, so don't pretend you do. Now, at some point you'll find a bunch of repeated failures, and you'll want to optimize against them, and you'll wind up with rules like "Every Exception that isn't caught in the same method it's thrown in should trigger an alert." Or "Use JSON Schema to verify request shape on the first line of a receiving controller." That's fine—but these "rules" are situational to your software, not platonic ideals of what "quality" looks like.5
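For what it's worth, a situational rule like "every Exception that escapes the method it was thrown in should trigger an alert" can be tiny in code. The sketch below is one possible shape for it in Python, not the way to do it: `alert_on_escape`, `send_alert`, the in-memory `ALERTS` list, and `parse_quantity` are all invented for illustration, and a real alert sink would page someone or post to a channel instead of appending to a list.

```python
import functools
from typing import Callable, List

# Hypothetical alert sink, kept in memory so the sketch is self-contained.
ALERTS: List[str] = []


def send_alert(message: str) -> None:
    ALERTS.append(message)


def alert_on_escape(func: Callable) -> Callable:
    """Alert whenever an exception leaves `func` instead of being handled inside it."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            send_alert(f"{func.__qualname__}: {type(exc).__name__}: {exc}")
            raise  # re-raise so callers still see the original failure

    return wrapper


@alert_on_escape
def parse_quantity(raw: str) -> int:
    """Hypothetical function: it handles nothing itself, so bad input escapes."""
    return int(raw)
```

Note that if decorated functions call each other, one exception alerts once per layer it passes through. Whether that is noise or signal depends on your software, which is rather the point.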
Those are the rules. You will break them—and when you do, ask yourself why you're breaking them. Why didn't I write a test for that method? Why can't I observe this bug in production? Why do we only deploy on odd numbered Tuesdays?6
The answers to those questions are the things that you really value. Make new rules that capture your organization's values, and as you break those rules (because you will), you will empower your engineers to deliver software of real value to your organization.
Of course, the pile of computed COMEFROMs you're probably thinking of would be more valuable as a well-factored and maintainable system, but why is that exactly? I don't think it has much to do with COMEFROM being a bag of snakes. Lots of things are bags of snakes. What is the specific disconnect between liberal use of COMEFROM and valuable software? ↩
Note the difference in medical responses for things that are pretty sure to kill you. If you have a serious cardiac issue, medical professionals will throw absolutely deadly amounts of electricity through your body in order to get your heart back into rhythm. This will burn your chest, reboot your brain, and hurt like hell. ↩
If you want to build a ship, don't drum up people to collect wood and assign them tasks and work, but rather teach them to long for the endless immensity of the sea.
– Antoine de Saint-Exupéry
Note: being unable to follow the rules is not the same as willingly breaking them (although it may mean the same thing). The 10 minutes bug rule in particular can be a hard one to hit when you're just starting out. Did you break the rule if it took you 20 minutes to find the bug, or did the person before you? Do not attribute malice to the incompetent fool whose name appears in the change log. Instead, think about the situation that led them to break the rule, and ask them about it. When it is inevitably your own name that appears in git blame, then you can question why you broke the rule. ↩
All of these examples are "rules" that I try to follow. They are rules where I have optimized against my own set of failures. Sometimes I break them. ↩
If a lot of these "rules" look vaguely familiar to you, it's because they're almost identical to the "DevOps" best practices from Accelerate. Accelerate reports actual scientific measurements of how "high performing" software teams can best add value to their organizations. You should read this book. You'll find a notable lack of inside-baseball architectural opining, and a fantastic guide to charting a path to providing software value. ↩