I have argued that in order to measure the habitability of a codebase, we should focus on attempting to measure how well the code conforms to the Once And Only Once rule. So what does that rule mean? It means that every important concept in the domain is represented (named) explicitly — not buried inside another concept (Once), and not spread out in the space between several other concepts (Only Once).
How can we write a tool that will detect when a concept is buried inside another? Well, let’s flip that question around: How can we tell when a concept contains another buried inside it? I’m not sure that pure size is a reliable guide, because design concepts have no standard size. Some will require only a few lines of code, while others will need dozens. But it is probably not unreasonable to assume that a file which is significantly larger than the others in a codebase is probably harbouring too much responsibility.
What about complexity? It is usually true that a class with many conditionals or many levels of indentation is asking to be simplified. And very often the best response to over-use of branching is to break out the Strategy pattern, which gives explicit names to two or more of the responsibilities hidden inside the original class.
Lopez et al, in Relevance of the Cyclomatic Complexity Threshold for the Java Programming Language [pdf], propose that the McCabe complexity of object-oriented software is lower (less than half) that of procedural software. While their study does not appear to take the degree of habitability into account, it seems suggestive to me that some measure of branching might be a reasonable reflection of whether a class is pregnant with unnamed concepts.
Efferent coupling [pdf] may also be indicative here. It seems intuitively correct that a class that has to import/require many others is probably doing too many things.
How could we write a tool that will detect when a concept is spread out in the space between two or more others? Well, again the presence of branches may be an indication of Connascence of Meaning (think of the Null Check and Special Case code smells). Primitive Obsession is also a symptom of Connascence of Meaning, but I suspect we could only detect that automatically in statically typed languages (without actually running the code).
Again, efferent coupling may be indicative. When two classes both import/require another, there may be some Connascence of Algorithm going on — or at least a missing abstraction wrapping the imported stuff.
So while I have no evidence, my instincts tell me that we may be able to get a decent hint at habitability by finding some workable combination of coupling and branching measures. Soon I will begin exploring these ideas with code samples in a variety of languages, to see whether I can work up a metric (or a system of metrics) that can predict whether I will like the code…