Thomas McCabe’s 1976 paper A Complexity Measure [pdf] suggests that we can “measure” the complexity of a program by counting the branch points. Later authors “show” that McCabe’s number correlates strongly with program size, and that it is therefore worthless. Still later authors [pdf] show that McCabe complexity is much lower (typically halved) for programs written in object-oriented languages. And although they don’t describe or assess the “quality” of their code samples, this latest “result” does suggest that complexity might be a useful tool in evaluating the habitability of a program.
(The ironic quote marks are there as a shorthand, to indicate that I question the science in these papers. In particular, no account was taken of design or coding style. In terms of habitability, though, the last study does offer a little chink of light.)
However, something irks me about the complexity measuring tools I’ve tried: they all work at the method level.
McCabe’s work describes program complexity. Not method (or function) complexity. His reasoning, if it is valid at all, is valid only for the entire call graph of a program. Such a thing is difficult to calculate in programs written using modern languages, due to late binding, polymorphism and/or dynamic typing, and may even be impossible without dynamic analysis. Indeed, a running program is likely to have different call graphs in different test scenarios, rendering the cyclomatic complexity of the entire program a moot point. So it is quite natural to re-cast the measure in terms of methods, purely in order to get something potentially useful.
But here’s the rub: how should we then roll those method complexities up so that we have an idea of the complexity of bigger things, such as classes or programs? I contend that we can’t, meaningfully.
Note that each method has at least one path through it, but two methods do not necessarily mean there are two code paths. Suppose I have this Ruby method, which has cyclomatic complexity 1:
base_price * 1.20
I may wish to extract a method to wrap the tax value:
base_price * tax_rate
Now I have two methods, each of which has “cyclomatic” complexity 1. The total complexity appears to have increased, and yet the program’s call graph is no more complex. Thus any tool that sums method complexities will penalise the ExtractMethod refactoring, because the tool is now using invalid graph theory.
True cyclomatic complexity is meaningful only (if at all) in the context of the call graph of the entire program. Anything that counts branch points for individual methods (and then adds 1) is not calculating “cyclomatic” complexity, and runs a high risk of penalising useful refactorings.