why i broke 89 tests

A couple of days ago I made a little change to Reek and to my surprise 89 tests broke, which is bad. I tweeted about it, and soon @logosity had asked me to write about why it had happened. I know the same sort of thing happens to people all the time, and it can be quite traumatic in the otherwise placid flow of test-driven development. So I’m hoping it will be somewhat helpful to uncover the path that led me from blissful ignorance to test calamity with one keystroke. So without further ado, let’s do some root cause analysis…

Undesirable event:
I deleted an attr_reader that was called in only one place and 89 of Reek’s 360-odd rspec examples broke. Of these broken tests, only three were tests of SimulatedPolymorphism#examine_context, which was the deleted attribute’s one caller.

Why did the other 86 tests break unexpectedly?

Because they each indirectly invoked the one method that now could no longer work.

Why did they invoke the broken method?

Because they are integration tests, checking the co-ordinated behaviour of a dozen or so classes.

The broken method sits on a relatively new class, an instance of which is automatically included in the framework that those tests invoke.

The central class in Reek is the Sniffer, which obtains a syntax tree from a Source (eg. a File or a String or an IO stream) and then traverses it, firing events at registered SmellDetector objects as it encounters syntax elements they subscribe to. The broken method is in one of these listeners, and it is fired whenever the parser encounters a new class definition in the syntax tree.

Reek has a couple of hundred tests that each run a Sniffer over a code snippet, checking that the detectors find a specific set of code smells (and don’t find any others). The broken detector (SimulatedPolymorphism) is fired on any code fragment that includes a class definition; it turns out that only 86 of these couple hundred tests do that. The damage could have been much worse.

All of which means that the failing tests are “integration tests”, checking the overall behaviour of over a dozen classes under a variety of specific circumstances. Under some circumstances one of those classes is broken, so some of these integration tests fail.

Why are there so many integration tests in the unit test suite?

Because they are much easier to write than the tests for a single class would be. And they express the desired behaviour in ways that unit tests can’t. So I never took the time to write the unit tests.

Here’s an example of one of these tests:

describe ControlCouple do
  it 'should report a ternary check on a parameter' do
    code = 'def simple(arga) arga ? @ivar : 3 end'
    code.should reek_only_of(:ControlCouple, /arga/)
  end
end

The corresponding unit tests would confirm that:

  • the code parser and method context objects conspire together to record the method’s parameters
  • the code parser sends an event to the control couple detector when, and only when, it sees a conditional expression
  • the control couple detector can obtain the enclosing method’s parameters
  • the control couple detector records a smell warning only when the conditional expression tests the value of a method parameter
  • the smell warning includes the correct information
  • no other detector records any warnings for the given code fragment

Unit tests exist for some of these cases and not for others. The ease with which I can write readable tests has made me lazy: I often don’t write the unit tests, because I get new features into Reek much more quickly this way. The penalty occurs when a change breaks one of the detectors or (worse) something in the Sniffer, because then I get dozens of broken tests and it can be difficult to pinpoint the problem quickly.

There’s also another reason I got lazy: As Reek’s internal architecture stabilized many months ago, most of the new code I write consists of adding new smell detector classes, which usually slot right into the existing stable framework with few surprises or problems.

So the root cause of my 89 test failures is the combination of a well-tested, stable, practically defect-free framework together with a great way to quickly write readable integration test examples. Is that so bad after all, I wonder.

Am I going to fix the problem? Yes and no. Each time this happens in future I will add the missing unit tests that should have failed. But I don’t want to lose the expressive value of these integration tests, so I plan to (gradually) organise them into a different test suite more closely related to Reek’s cucumber tests. What would you do?

today’s kata

In Ruby, one easy way to check for palindromes would be to write a method such as this:


class String
 def palindrome?
   self == self.reverse
 end
end

But the C++ part of my brain screams that this is likely to be very inefficient. So, today’s kata: test-drive a palindrome? method for String, ensuring (with tests) that it is much faster than the above.

cucumber feature organisation

A few months ago I asked around to find out how (or whether) people organised the Cucumber feature files in their project’s features folder. Nothing in particular turned up. But yesterday the good folks at Thoughtbot posted their own very interesting convention: They have a sub-folder of features/ for each Actor served by the application. For example:

features/admins/
features/api_clients/
features/users/
features/visitors/

I like this a lot, and I’ll be re-organising Reek’s features soon along the same lines. In fact, thinking about this has made me realise that there’s a whole class of user (API clients) not represented by Reek‘s current set of Cucumber features.

And here’s an interesting thought experiment I’ll also be trying out: How about adding another folder level for the Actor’s goal? So the folders would be organised according to the commonalities among the feature descriptions: features/<as_a>/<in_order_to>/feature. For example:

features/
  administrator/
    manage_products/
      create_catalogue.feature
      retire_product_line.feature
      ...
    manage_users/
      grant_access.feature
      ...

Will that be useful, or just create more clutter? In general I dislike hierarchical classification, so I was keen to keep the sub-folder names as verbs. I’ll try it for Reek and report my feelings back here. Please let us know if you try it first…

factory method in ruby

During my refactoring homework last evening I noticed a little tug-of-war between two different coding styles, and after a restless night I’ll try to analyse here what was going on…

Deep inside Reek is a Source class, whose instances are responsible for converting Ruby code into abstract syntax trees for later examination by the various code smell detectors. Source code can come from many different places, and so Source has a number of Factory Methods like this:

class Source
  def self.from_io(ios)
    new(ios.readlines.join)
  end

  def self.from_s(code)
    new(code)
  end

  def self.from_f(file)
    from_path(file.path)
  end

  def self.from_path(filename)
    new(IO.readlines(filename).join)
  end

  # ...
end

(The factory methods also do some other stuff, which I’ve elided for clarity.)

As the refactoring exercise proceeded, I found that one of these methods came to be called from only one place. So I considered inlining it, and that’s when the tug-of-war began. Part of me resisted inlining the method, even though that’s what the code seemed to want. I argued that the Factory Methods were “potentially useful”, that they represented the Source abstraction and helped to document it; the code pulled the opposite way. (My wife was sitting across the room, so I kept this argument quiet inside my head; maybe that’s part of my problem.) Anyway, to cut a long story short, at the end of my refactoring session the code had changed to (something very similar to) this:

class File
  def to_source
    Reek::Source.new(self.path)
  end
end

class IO
  def to_source
    Reek::Source.new(self.readlines.join)
  end
end

class String
  def to_source
    Reek::Source.new(self)
  end
end

The Gang of Four didn’t name this pattern, so I’ll call these methods “Converters”. (Please tell me if this pattern has a more well-known name.) They’re just like Factory Methods, but they sit on the class being converted from instead of on the class being converted to.

The argument between me and the code was between the forces in favour of each pattern. In favour of the original Factory Methods:

  • In C++/Java — the world in which the GoF did their work, and where my roots lie — this is the only place to put the methods;
  • the dependencies point from Source to its origins (File, String, IO et al), which seems quite natural;
  • the compiler checks that the Factory Methods are called with the right types of input object;
  • all the ways to create a Source object are together, so they are easy to find and compare.

In favour of Converters:

  • We could later introduce new ways to create a Source without opening up that class;
  • they are instance methods, and hence just a little more natural, testable and “object-oriented”;
  • methods such as to_s, to_a, to_i are idiomatic in Ruby, which enhances the Communication value of this approach;
  • you need a File (for example) in order to call the Converter. (Actually, I’m not sure this counts as an advantage, because it prevents duck typing.)

On balance, I feel the C++/Java form of Factory Method has many advantages. But I switched the code to use the Converter form, mainly to see what would happen. I could easily switch back again now I’ve seen the forces laid out for comparison in this blog post…

reek mailing list

Discussion about Reek seems to be popping up all over the place at the moment, so I’ve created a Google group where problems, ideas and suggestions can be shared in one place. I’ll be making release announcements there and seeking feedback from you on my ideas for Reek’s future.

So if you’re using Reek, hop over to http://groups.google.com/group/ruby-reek and let’s get the ball rolling.

discussion on TDD and code analysis tools

During the last few weeks I’ve been participating in an email discussion about the relationship between static analysis tools (such as Reek) and TDD. The discussion was instigated by Pat Eyler, and he has now organised and posted the results on his On-Ruby blog.

To help me get an initial handle on the topic, I found it extremely useful to list the main areas of discomfort I feel when using Reek in my own work. Then for each of these (there were two) I constructed conflict clouds to get a balanced view of the problem. I don’t have the clouds to hand now, but you can read the results in Pat’s article. I’ll definitely be using that technique again, because it very quickly helped me to organise my thoughts. (Note to self: throw nothing away.)

crap4r

Inspired by Uncle Bob’s use of crap4j, and egged on somewhat by various members of the Twitterverse, Marty Andrews and I have spiked crap4r on Github. This version looks for Rspec examples in all files called spec/**/*_spec.rb; then it runs them using Rcov and calculates the CRAP metric using code from Roodi. Very neat.

It’s all a bit scrappy and difficult to use right now, but we’ve proved the concept, and over the next few weeks we’ll get it licked into shape and (hopefully) published as a gem.

i’ve moved my code

My website, kevinrutherford.co.uk used to be implemented using the Drupal CMS. I did that because I wanted to host various code samples and suchlike, but in the end it was rubbish. So this week I killed it and replaced it with the much simpler 1-page site you can see there now. As a consequence I’ve moved all of my old code samples to repositories on github. All much easier for me to maintain.

The one downside to all this is that I no longer have somewhere to host my simulation of coin tossing. Well, that’s a shame, but not the end of the world. If you want to try it you can clone the github repository; or you can simply look at the three sample outputs I’ve posted here, here and here. Not a live simulation, I know, but possibly the next best thing for now.