Longer methods are more likely to need to change when the application changes.
The Longer Story
Shorter methods protect my investment in the code, in a number of ways.
First: Testability. I expect there to be a correlation between method length (number of “statements”) and the number of (unit) test cases required to validate the method completely. In fact, as method size increases I expect the number of test cases to increase faster than linearly, in general. Thus, breaking a large method into smaller independent methods means I can test parts of my application’s behaviour independently of each other, and probably with fewer tests. All of which means I will have to invest less in creating and running the tests for a group of small methods, compared to the investment required in a large method.
Second: Cohesion. I expect there to be a strong correlation between method length and the number of unknown future requirements demanding that method to change. Thus, the investment I make in developing and testing a small method will be repaid countless times, because I will have fewer reasons in the future to break it open and undo what I did. Smaller methods are more likely to “stabilize out” and drop out of the application’s overall churn.
Third: Coupling (DRY). I expect that longer methods are more likely to duplicate the knowledge or logic found elsewhere in the same codebase. Which means, again, that the effort I invest in getting this (long) method correct could be lost if ever any of those duplicated pieces of knowledge has to change.
And finally: Understandability. A method that does very little is likely to require fewer mental gymnastics to fully understand, compared to a longer method. I am less likely to need to invest significant amounts of time on each occasion I encounter it.
The Open-Closed Principle says that we should endeavour to design our applications such that new features can be added without the need to change existing code — because that state protects the investment we’ve made in that existing, working code, and removes the risk of breaking it if we were to open it up and change it. Maybe it’s also worth thinking of the OCP as applying equally to methods.
Like most developers I know, I have used code smells to describe problems in code since I first heard about them. The idea was introduced by Kent Beck in Fowler’s Refactoring back in 1999, and has taken root since then. The concept of code smells has several benefits, not least the fact that it gives names to ideas that were previously only vague. Having a list of named code quality anti-patterns helps all of us discuss them on the same terms.
But while I was writing Refactoring in Ruby with @wwake, and writing Reek at the same time, I began to feel a little uneasy about them. I was never able to put my finger on exactly why that was, or what I was uneasy about, but the feeling never went away. This year I think I have finally understood what I think about code smells, and why I think we can do somewhat better. So before reading on, take a moment to list the things you don’t like about them. Then let’s compare lists. Go ahead, I’ll wait.
Ok, done that? Here are my current reasons for wanting a different tool for describing code quality:
- The names aren’t all that helpful for people unfamiliar with the concepts. If you had never heard of them, what would you make of Feature Envy, Shotgun Surgery, Data Clump etc? Sure, the names are memorable, but that only helps with hindsight, for people who have taken the time and trouble to investigate and learn them.
- Some code smells can overlap. For example, I’m often unsure whether I have seen Feature Envy or Inappropriate Intimacy, Divergent Change or Large Class. There’s a sense in which this doesn’t matter, of course; but it undermines their use as a communication tool.
- Some of the smells can be subjective or contextual, leaving the quality of the code open to interpretation. For example, how large is a Large Class or a Long Parameter List?
- Some of the smells apply only under certain circumstances. For example, a Switch Statement is perfectly fine at the system boundary when we are figuring out the type of an incoming message, but often bad news when it represents a type check on code we own. And it can be acceptable at the system boundary to grab the fields of a Data Class in order to display them, while elsewhere that might be seen as Feature Envy.
- The list of code smells is not canonical; different people have added their own smells to Beck & Fowler’s original list. This situation is even worse when it comes to smells in unit tests; try looking for a canonical list of test smells and you’ll find no consensus whatever. In my opinion, this fact alone completely undermines the idea that code smells form a pattern language for describing code quality.
- There are no clear and obvious code smells covering some dynamic problems, such as the coupling between variables whose values depend on each other, or the problems introduced by mutable objects.
Did you have the same list, or something similar?
So, what can we do about it? I think the answer lies with Connascence. This is an idea introduced by Meilir Page-Jones in two books in the 1990s, and later popularised by @jimweirich in a series of conference talks. I’m not going to cover Connascence in detail here — you can find it all for yourself by reading @jimweirich‘s articles or looking at my summary slides. I just wanted to take a moment to write down my current opinions about code smells. I’ll probably write in more detail about Connascence in the coming weeks, but for now what do you think?
In response to my Developing a Sense of Smell sesion at XPDays Benelux this week, Patrick Debois has begun collecting the smells he finds in operations and sysadmin work. The thesis behind the Benelux session was that code smells offer a good vocabulary to help discuss code’s maintainability; and at first glance I think Patrick’s idea could start to do the same for system administration. Nice work Patrick!
During the last few weeks I’ve been participating in an email discussion about the relationship between static analysis tools (such as Reek) and TDD. The discussion was instigated by Pat Eyler, and he has now organised and posted the results on his On-Ruby blog.
To help me get an initial handle on the topic, I found it extremely useful to list the main areas of discomfort I feel when using Reek in my own work. Then for each of these (there were two) I constructed conflict clouds to get a balanced view of the problem. I don’t have the clouds to hand now, but you can read the results in Pat’s article. I’ll definitely be using that technique again, because it very quickly helped me to organise my thoughts. (Note to self: throw nothing away.)
Inspired by Uncle Bob’s use of crap4j, and egged on somewhat by various members of the Twitterverse, Marty Andrews and I have spiked crap4r on Github. This version looks for Rspec examples in all files called
spec/**/*_spec.rb; then it runs them using Rcov and calculates the CRAP metric using code from Roodi. Very neat.
It’s all a bit scrappy and difficult to use right now, but we’ve proved the concept, and over the next few weeks we’ll get it licked into shape and (hopefully) published as a gem.
I’ve just noticed that Reek is now one of the tools whose code quality reports are bundled in metric_fu — cool!
Chad Fowler has just launched Slick or Slack, a website where we can all review and compare code snippets for the Quality that has no name. Great idea — I’m hooked already!
Brent Strange reports that some major companies – notably Microsoft, Mozilla and VeriSign – have begun rewarding their testers with cash for finding serious defects prior to release. It seems to me that this approach is seriously flawed, in at least two respects.
First, it further promotes the traditional antagonism between developers and testers. There’s now a clear reward for testers to find the developers’ work wanting. How does that help to build trust or teamwork?
And second, it rewards the testers for not helping the developers get it right sooner. Sure, the cash will be less than the cost of releasing with a serious defect, but it will also be less than the cost of rework due to finding the defect late in the value stream.
The solution? Both developers and testers should be rewarded when the pre-release testing finds no defects.* Instead of rewarding antagonism, reward collaboration. And reward the reduction in rework. Have the testers engaged at the front of the value stream, creating automated self-checking tests that will help the developers get it right – and complete – first time.
* (Footnote: this needs to be balanced by penalties of some kind if the pre-release tests are skimped in any way!)