I have a dilemma. In fact, I frequently have this dilemma, so I thought I’d ask your opinion on which way you would address it.

I just completed a CRC session for the next feature I want to add to my codebase. One of the classes identified for the new scenarios is a particular kind of counting collector, and I realised that the same behaviour already occurs elsewhere in the codebase. So one of the first steps in coding up this new scenario will be to extract a new class from an existing class. The existing class has reasonable micro-tests, and therein lies the dilemma:

  • I could simply use the Extract Class refactoring recipe, and I’d be done quite quickly; the new collector class would be tested indirectly, via the micro-tests for the original class (which would now be the extracted class’s only client).
  • Or I could develop the collecting class test-first, from scratch, so that it had its own micro-tests, and then use Substitute Algorithm, reimplementing the old class to use the new.

The advantage of the second approach is that it gives me precise micro-tests of the new class. But while developing that class, I could drift away from the required usage, or I could make it harder to switch the original class over to using the new one. (I tend to think of Substitute Algorithm as being riskier than Extract Class.)

So which is better — use Extract Class and know that I have a working client-supplier relationship; or TDD from scratch and risk going off track? In practice, I tend to choose Extract Class if I’m feeling lazy or distracted, whereas I go with TDD followed by Substitute Algorithm when I’m feeling organized; in the latter case I usually get stressed by the “substitute” step…

(It seems to me that if we choose the latter approach in general, many refactorings should cease to be used. That can’t be right, can it?)

After a conversation this week with Hugh Sasse about Feature Envy and Utility Functions, I realised I’ve evolved my approach to dealing with them:

When Reek tells me about one of these smells, my first step is to inline the smelly method back into all of its callers. Then I look for ways to fix the resulting duplication without extracting the same method again. Sometimes the re-formed methods break apart in new ways (compared to when they were first written), other times I see opportunities to peel off new classes. (One thing I never do these days is to introduce inheritance relationships in order to remove duplication; I look hard for ways to solve it through delegation, often to a new class, or I leave the duplication in place for the time being.)

This approach seems to be working well for me at this early stage. Have you tried this or something similar? How did it turn out?

Ths ten-minute video chat between Corey Haines and J.B.Rainsberger introduced a nice simplification of eXtremeNormalForm.

In the discussion, JB hardens up the wishy-washy Communicates Intent value by noting that it’s just about “bad names”, and here’s why. In an OO language, “communicating intent” boils down to breaking the system into small pieces and giving them good names, names that resonate with the design and the domain. Small pieces are no good on their own, and good names for bad ideas will soon be weeded out. JB therefore claims that good design (he says good architecture) boils down to

  1. eliminate duplication
  2. eliminate bad names

(He takes passing all tests – ie. correctness – as a given, and he says that eliminating duplication and bad names also renders the software “small”. I disagree on that last point.) So, his assertion is equivalent to saying that every code smell is a symptom of either Duplication or Bad Names, which I find easier to explain than asking people to ensure that the code “communicates intent”.

I predict that 2010 will be the year that eXtreme Programming returns to centre stage.

Why? Because I believe that XP is what happens when you combine a “single-piece flow” kanban-style agile process with a deep attention to quality and TDD. And because I think CxOs are likely to hear “kanban” more favourably than they hear “extreme”.

operations smells

In response to my Developing a Sense of Smell sesion at XPDays Benelux this week, Patrick Debois has begun collecting the smells he finds in operations and sysadmin work. The thesis behind the Benelux session was that code smells offer a good vocabulary to help discuss code’s maintainability; and at first glance I think Patrick’s idea could start to do the same for system administration. Nice work Patrick!

interview with eli goldratt

Clarke Ching has achieved a major coup by posting a 1-hour interview with Eli Goldratt. In the interview Goldratt talks about his new book Isn’t It Obvious, which is another business novel about the Theory of Constraints, this time applied to retail. It’s a great book, and a very good interview — good job Clarke.

Bill Wake and I have been interviewed about Refactoring in Ruby by Russ Olsen for InformIT; you can read the interview here. (I note that my answers have been re-spelt the American way :)

Interesting link:
In Improving Code using Metric Fu the folks at devver.net give a little insight into how they have been using Reek and the other Ruby quality tools to improve their codebase.

why i broke 89 tests

A couple of days ago I made a little change to Reek and to my surprise 89 tests broke, which is bad. I tweeted about it, and soon @logosity had asked me to write about why it had happened. I know the same sort of thing happens to people all the time, and it can be quite traumatic in the otherwise placid flow of test-driven development. So I’m hoping it will be somewhat helpful to uncover the path that led me from blissful ignorance to test calamity with one keystroke. So without further ado, let’s do some root cause analysis…

Undesirable event:
I deleted an attr_reader that was called in only one place and 89 of Reek’s 360-odd rspec examples broke. Of these broken tests, only three were tests of SimulatedPolymorphism#examine_context, which was the deleted attribute’s one caller.

Why did the other 86 tests break unexpectedly?

Because they each indirectly invoked the one method that now could no longer work.

Why did they invoke the broken method?

Because they are integration tests, checking the co-ordinated behaviour of a dozen or so classes.

The broken method sits on a relatively new class, an instance of which is automatically included in the framework that those tests invoke.

The central class in Reek is the Sniffer, which obtains a syntax tree from a Source (eg. a File or a String or an IO stream) and then traverses it, firing events at registered SmellDetector objects as it encounters syntax elements they subscribe to. The broken method is in one of these listeners, and it is fired whenever the parser encounters a new class definition in the syntax tree.

Reek has a couple of hundred tests that each run a Sniffer over a code snippet, checking that the detectors find a specific set of code smells (and don’t find any others). The broken detector (SimulatedPolymorphism) is fired on any code fragment that includes a class definition; it turns out that only 86 of these couple hundred tests do that. The damage could have been much worse.

All of which means that the failing tests are “integration tests”, checking the overall behaviour of over a dozen classes under a variety of specific circumstances. Under some circumstances one of those classes is broken, so some of these integration tests fail.

Why are there so many integration tests in the unit test suite?

Because they are much easier to write than the tests for a single class would be. And they express the desired behaviour in ways that unit tests can’t. So I never took the time to write the unit tests.

Here’s an example of one of these tests:

describe ControlCouple do
  it 'should report a ternary check on a parameter' do
    code = 'def simple(arga) arga ? @ivar : 3 end'
    code.should reek_only_of(:ControlCouple, /arga/)
  end
end

The corresponding unit tests would confirm that:

  • the code parser and method context objects conspire together to record the method’s parameters
  • the code parser sends an event to the control couple detector when, and only when, it sees a conditional expression
  • the control couple detector can obtain the enclosing method’s parameters
  • the control couple detector records a smell warning only when the conditional expression tests the value of a method parameter
  • the smell warning includes the correct information
  • no other detector records any warnings for the given code fragment

Unit tests exist for some of these cases and not for others. The ease with which I can write readable tests has made me lazy: I often don’t write the unit tests, because I get new features into Reek much more quickly this way. The penalty occurs when a change breaks one of the detectors or (worse) something in the Sniffer, because then I get dozens of broken tests and it can be difficult to pinpoint the problem quickly.

There’s also another reason I got lazy: As Reek’s internal architecture stabilized many months ago, most of the new code I write consists of adding new smell detector classes, which usually slot right into the existing stable framework with few surprises or problems.

So the root cause of my 89 test failures is the combination of a well-tested, stable, practically defect-free framework together with a great way to quickly write readable integration test examples. Is that so bad after all, I wonder.

Am I going to fix the problem? Yes and no. Each time this happens in future I will add the missing unit tests that should have failed. But I don’t want to lose the expressive value of these integration tests, so I plan to (gradually) organise them into a different test suite more closely related to Reek’s cucumber tests. What would you do?

today’s kata

In Ruby, one easy way to check for palindromes would be to write a method such as this:


class String
 def palindrome?
   self == self.reverse
 end
end

But the C++ part of my brain screams that this is likely to be very inefficient. So, today’s kata: test-drive a palindrome? method for String, ensuring (with tests) that it is much faster than the above.

Next Page »