TDD: three easy mistakes

I run introductory training in test-driven development quite frequently these days. And each time I do, I find the same basic mistakes cropping up, even among teams who already claim to practice TDD. Here are the three mistakes I see most often:

1. Starting with edge cases (empty string, null parameter) or error cases:

Imagine you have to test-drive the design of an object that can count the number of occurrences of every word in a string. I will often see some or all of these tests written before any others:

public class WordCounterTest {

  @Test
  public void wordCounterCanBeCreated() {
    assertNotNull(new WordCounter());
  }

  @Test(expected=IllegalArgumentException.class)
  public void nullInputCausesException() {
    new WordCounter().count(null);
  }

  @Test
  public void emptyInputGivesEmptyOutput() {
    Map<String, Integer> actual = new WordCounter().count("");
    assertEquals(new HashMap<String, Integer>(), actual);
  }

}

These feel easy to write, and give a definite feeling of progress. But that is all they give: a feeling of progress. These tests really only prove to ourselves that we can write a test.

When written first like this, they don’t deliver any business value, nor do they get us closer to validating the Product Owner’s assumptions. When we finally get around to showing him our work and asking “Is this what you wanted?”, these tests turn out to be waste if he says “Actually, now I see it, I think I want something different”.

And if the Product Owner decides to continue, that is the time for us to advise him that we have some edge cases to consider. Very often it will turn out to be much easier to cope with those edge cases now, after the happy path is done. Some of them may now already be dealt with “for free”, as it were, simply by the natural shape of the algorithm we test drove. Others may be easy to implement by adding a Decorator or modifying the code. Conversely, if we had started with the edge cases, chances are we would have had to work around them while we built the actual business value, and that would have slowed us down even more.

So start with tests that represent business value:

@Test
public void singleWordIsCounted() {
  Map<String, Integer> expected = new HashMap<String, Integer>();
  expected.put("happy", 2);
  assertEquals(expected, new WordCounter().count("happy happy"));
}

This way you will get to ask the Product Owner that vital question sooner, and he will invest less before he knows whether he wants to proceed. And you will have a simpler job to do, both while developing the happy path, and afterwards when you come to add the edge cases.
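As a minimal sketch of where that first test might lead (this implementation is an assumption for illustration; the code behind `count` isn't shown here), consider a `WordCounter` driven only by the happy-path test. Because `StringTokenizer` yields no tokens for an empty string, the empty-input case from the earlier list happens to come “for free”:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical implementation, test-driven from the happy path only.
class WordCounter {
  public Map<String, Integer> count(String text) {
    Map<String, Integer> counts = new HashMap<String, Integer>();
    // StringTokenizer splits on whitespace and never returns empty tokens.
    StringTokenizer words = new StringTokenizer(text);
    while (words.hasMoreTokens()) {
      // Add 1 to the existing count, or start at 1 for a new word.
      counts.merge(words.nextToken(), 1, Integer::sum);
    }
    return counts;
  }
}
```

The null case is not settled by this sketch: it would throw a NullPointerException rather than an IllegalArgumentException, so that is still a conversation to have with the Product Owner.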

2. Writing tests for invented requirements:

You may think that your solution will decompose into certain pieces that do certain things, and so you begin by testing one of those and building upwards from there.

For example, in the case of the word counter we may reason along the following lines: “We know we’ll need to split the string into words, so let’s write a test to prove we can do that, before we continue to solve the more difficult problem”. And so we write this as our first test:

@Test
public void countWords() {
  assertEquals(2, new WordCounter().countWords("happy monday"));
}

No-one asked us to write a method that counts the words, so yet again we’re wasting the Product Owner’s time. Equally bad, we’ve invented a new requirement on our object’s API, and locked it in place with a regression test. If this test breaks some time in the future, how will someone looking at this code in a few months’ time cope? A test is failing, but how does he know that it’s only a scaffolding test, and should have been deleted long ago?

So start at the outside, by writing tests for things that your client or user actually asked for.

3. Writing a dozen lines of code in order to get the next test to pass:

When the bar is red and the path to green is long, TDD beginners often soldier on, writing an entire algorithm just to get one test to pass. This is highly risky, and also highly stressful. It is also not TDD.

Suppose you have these tests:

@Test
public void singleWordIsCounted() {
  assertEquals("happy=1", new WordCounter().counts("happy"));
}

@Test
public void repeatedWordIsCounted() {
  assertEquals("happy=2", new WordCounter().counts("happy happy"));
}

And suppose you have been writing the simplest possible thing that works, so your code looks like this:

public class WordCounter {
  public String counts(String text) {
    if (text.contains(" "))
      return "happy=2";
    return "happy=1";
  }
}

Now imagine you picked this as the next test:

@Test
public void differentWords() {
  assertEquals("happy=1 monday=1", new WordCounter().counts("happy monday"));
}

This is a huge leap from the current algorithm, as any attempt to code it up will demonstrate. Why? Well, the code duplicates the tests at this point (“happy” occurs as a fixed string in several places), so we probably forgot the REFACTOR step! It is time to remove the duplication before proceeding; if you can’t see it, try writing a new test that is “closer” to the current code:

@Test
public void differentSingleWordIsCounted() {
  assertEquals("monday=1", new WordCounter().counts("monday"));
}

We can now make this simpler set of tests pass easily, effectively by removing the duplication between the code and the tests:

public class WordCounter {
  public String counts(String text) {
    String[] words = text.split(" ");
    return words[0] + "=" + words.length;
  }
}

After making this relatively simple change, we have now test-driven part of the algorithm with which we struggled earlier. At this point we can try the previous test again; and this time if it is still too hard, we may wish to ask whether our chosen result format is helping or hindering…
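For illustration only, here is one way the retried `differentWords` test might eventually go green while keeping the string result format. The insertion-ordered `LinkedHashMap` and the space-separated output are assumptions read off the expected values in the tests above, not something prescribed here:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical later step, reached by small refactorings rather than in one leap.
class WordCounter {
  public String counts(String text) {
    // LinkedHashMap keeps words in first-seen order, matching "happy=1 monday=1".
    Map<String, Integer> tally = new LinkedHashMap<String, Integer>();
    for (String word : text.split(" ")) {
      tally.merge(word, 1, Integer::sum);
    }
    // Render each entry as word=count, separated by single spaces.
    StringJoiner result = new StringJoiner(" ");
    for (Map.Entry<String, Integer> entry : tally.entrySet()) {
      result.add(entry.getKey() + "=" + entry.getValue());
    }
    return result.toString();
  }
}
```

Roughly half of this code exists only to build the string, which is one reason to question whether that result format is earning its keep.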

So if you notice that you need to write or change more than 3-4 lines of code in order to get to green, STOP! Revert back to green. Now either refactor your code in the light of what just happened, so as to make that test easier to pass, or pick a test closer to your current behaviour and use the new test to force you to do that refactoring.

The step from red bar to green bar should be fast. If it isn’t, you’re writing code that is unlikely to be 100% tested, and which is prone to errors. Choose tests so that the steps are small, and make sure to refactor ALL of the duplication away before writing the next test, so that you don’t have to code around it whilst at the same time trying to get to green.

Estimating user stories: the 5 day challenge

This is a quick note about an idea I’ve been using with a few software teams during the last couple of years. I also spoke about it briefly at the Scottish Ruby Conference this week. If you try it, please publish your experiences and link to them via the comments here.

Tl;dr — don’t guess the size of a story; fit the story to the size you want.

So there’s this big discussion going on about #NoEstimates and how estimating is wasteful, misleading etc. But there are very few published practical alternatives. So what to do if you want to do less estimating? Honestly, I think that’s the wrong question to be asking. A more important question is, at least for every team I’ve encountered, “how can we become more predictable in what we will deliver?” Estimates only go so far in answering this question because, well, they’re just guesses. It doesn’t matter whether they are expressed as hours, pair-hours, story points, complexity points, Gummi bears or whatever — someone somewhere will attempt to do arithmetic with your estimates and thus turn them into “facts”. So estimating is hard, and it’s guesswork; it is not my intention here to grumble on about all the side-effects of that – you can read all of that stuff elsewhere. Instead, let’s cut to the chase and do something.

Estimates are risky and difficult. So let’s try the opposite. Instead of estimating the next story, let’s play to our strengths as developers and give ourselves a technical and analytical puzzle: Let’s fit the story to the estimate. Here’s how it works:

When the team picks up the next story, apply the “five day challenge”. First ask, “can we deliver this in 5 days?” If the answer is “yes”, just do it and then pick up the next story. But if the answer is “no”, have the whole team find some core nugget of useful value within the story such that everyone agrees it could be delivered in five days or less, and such that it will be a useful and valuable product increment. Then deliver that core nugget to your users, and go and pick the next story.

This challenge process is fun, and it is exactly the kind of problem many developers and product owners are good at solving. Furthermore, there is good literature out there to help you do it and get good at doing it. That’s a win.

Shortly after you have split the story to fit into five days, the Product Owner should take each of the edge cases you peeled off, turn them into stories and push them back down the queue. One or two of them may be the next stories to be scheduled, while others may wind up never being picked. All that matters right now is that they aren’t essential to delivering the current story, and thus need occupy no more developer time this week.

Of course, if your stories are already small, instead of 5 days pick three, or two, or one. I like 5 days because it almost always gives time to deliver an interesting chunk of value, and for many teams it’s an improvement over their current practice. The important thing is to pick a number of days and stick to it until you are confident that you have that size of story nailed. Then try reducing it by one day and learn how to slice your stories even more thinly.

When you have delivered the story, record the actual number of days it took. If that differs from five days, take 5 minutes as a team and list the reasons for that variance. Use this to help you do a better job of fitting the next story to 5 days.

Note that this is NOT a challenge to the team, it’s a challenge to the story. You all sat down together and dug out a core nugget of useful value, so now go ahead and deliver that without taking any short cuts. If you find yourselves running late, don’t try to squeeze the story into your estimate. It doesn’t matter if the story ends up taking 11 days or just two. The important thing is to do the story well, and then learn from the actual time it took.

In practice, this technique dovetails with the whole ecosystem of the other XP practices. And it fits best of all with a few other team micro-habits that I have evolved during the last couple of years. Hopefully I will be writing about some of these here soon. Or you could hire me to help your team implement XP, and find the best fit for the XP principles and practices in your organisation :)

Maths challenge

My son was given this challenge for his school homework tonight:

Pick a number in the range 2-100. Next, pick a number that is a factor or a multiple of the first, again using only the numbers 2-100. Continue like this, building a chain of numbers in the range 2-100 in which each number is either a factor or a multiple of its predecessor in the chain. What is the longest non-repeating chain you can build?

I feel some gentle programming about to occur…
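A gentle first attempt might be a brute-force depth-first search with backtracking, sketched below. The class and method names are made up for illustration. Exhaustive search like this is only practical for small ranges; for the full 2-100 range the number of possible chains is likely to be far too large for it to finish in reasonable time, so treat it as a starting point rather than a solution.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical brute-force searcher for factor/multiple chains.
class FactorChain {
  private final int low;
  private final int high;
  private List<Integer> best = new ArrayList<Integer>();

  FactorChain(int low, int high) {
    this.low = low;
    this.high = high;
  }

  // Returns one longest non-repeating chain, trying every starting number.
  List<Integer> longest() {
    best = new ArrayList<Integer>();
    boolean[] used = new boolean[high + 1];
    List<Integer> chain = new ArrayList<Integer>();
    for (int start = low; start <= high; start++) {
      extend(start, used, chain);
    }
    return best;
  }

  private void extend(int n, boolean[] used, List<Integer> chain) {
    used[n] = true;
    chain.add(n);
    if (chain.size() > best.size()) {
      best = new ArrayList<Integer>(chain);
    }
    for (int next = low; next <= high; next++) {
      if (!used[next] && linked(n, next)) {
        extend(next, used, chain);
      }
    }
    // Backtrack so that n can be used in other chains.
    chain.remove(chain.size() - 1);
    used[n] = false;
  }

  // True when one number is a factor or a multiple of the other.
  static boolean linked(int a, int b) {
    return a != b && (a % b == 0 || b % a == 0);
  }
}
```

For the small range 2-10, `new FactorChain(2, 10).longest()` finds a chain of six numbers.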

Reflections on a day of mob programming

Last week one of the teams I coach was given a day to build a proof of concept for a new business idea. I thought that #MobProgramming might be a good fit for the day’s activities, so here’s what happened.

There were five of us in the team for the day: three developers, one product owner and one coach/developer (me). The product owner had done some research into the topic we were to explore, but the rest of us were relative novices in the subject matter.

We grabbed a room with a large central table and two whiteboards for the day. We set up two laptops with a projector each, projecting onto the whiteboards. On one laptop we showed the software in the text editor, and on the other we showed the product we were building. To avoid spending too much time setting up fancy sharing schemes, we simply used a GitHub repository to share code between the two laptops: every few minutes we would commit and push from the development laptop, and the product owner would pull and refresh on the product laptop to demonstrate and explore the current version of the product. This worked reasonably well, although occasionally we did launch the product on the development laptop too, for example when we wanted to check something in the JavaScript console more quickly than the GitHub cycle would allow.

We began by discussing the product owner’s explorations, and soon we agreed a goal for the day. Our intention was to develop a very simple but useful walking skeleton that would demonstrate a single use case in the problem domain. Throughout the day we regularly revisited our understanding of the goal and compared it to our progress thus far. By 3pm we had achieved our objective, so we spent a further hour or so refactoring and stabilizing our solution, with a view to ensuring that the code would be understandable if or when we came back to it in a week or so.

We developed outside in, in very thin increments. The product owner understood this approach quickly and intuitively, and was very good at guiding us to the next thin slice. The first slice was simply a static “Hello, world!” web page, but it proved our “delivery” pipeline between the two laptops. With each further slice all of us, including the product owner, learnt things that changed our minds about what to develop next.

Occasionally the team got a little carried away and began to write code that the product owner hadn’t asked for, or that was more general than he wanted for the current slice. But as the day wore on the team became better and better at turning his requirements around quickly by doing the (often surprisingly minimal) minimum. I heard myself saying “I don’t think we need to do that” and “Commit!” at frequent intervals throughout the day.

All five of us stayed extremely focussed for the whole exercise, which lasted 7 hours with a half-hour break for lunch. At the end, all of us agreed that we were spent, and that we had greatly enjoyed both mob programming and solving the problem we had set ourselves.

I do, however, have a couple of open questions:

  • Would this have worked with a larger group? Five people felt just about right, and we all remained engaged throughout. Would that have been true for a group of six or seven, or would some team members have found their attention drifting?
  • How did the subject matter contribute? We all remained engaged for the whole day, and I wonder how much that was a consequence of working on a difficult new greenfield problem. Would the same effect have occurred had we worked on the team’s usual legacy codebase?
  • One member of the team was off sick that day. I wonder how she will feel now that everyone else has a deep and shared understanding of the prototype’s design and implementation?

Microhabits

I’ve recently been doing some coaching at the BBC in Media City, Salford. During the first couple of weeks my commute home always took 90 minutes; then on maybe one day per week I got home in 65 minutes; now it usually takes 65 minutes, and only occasionally (maybe one day per week) takes 90 minutes. I have saved 25 minutes on my usual journey time, and I have done it without resorting to extreme measures such as (heaven forfend!) spending money or running. How? By developing a few “micro-habits” — little routines that each save a tiny amount of time, but which together add up to a huge saving of almost 30%. I believe this approach can be applied to everything we do, so I think it may be instructive to dig into that 25-minute saving to see how it is achieved…

First, let’s take a look at the typical 90-minute commute home:

  1. At 1600, pack away my things, leave the office and walk to the tram stop.
  2. Wait for the next tram, board it, and wait for it to move off.
  3. The tram journey takes 20-25 minutes to reach Piccadilly station, depending on traffic conditions in Manchester.
  4. Walk up to the main concourse in the station and to the departures board.
  5. Wait 20 seconds for the board to show trains to Macclesfield. The next train is the 1648, departing from platform 3.
  6. Walk to platform 3, stopping to find and show my ticket to the inspector at the gate.
  7. Wait for the train. The typical wait at this point is 10-12 minutes.
  8. Sit reading on the train until it arrives in Macclesfield at approximately 1718 (actually the usual arrival time is 1723).
  9. Walk 0.5km to my car, then drive 3km home. The drive takes 5-10 minutes, depending on traffic conditions.

I noticed that there is an express train leaving Manchester Piccadilly at 1635 and arriving in Macclesfield at 1655; I also noticed that I usually arrived in Piccadilly station at around 1638. So it would only require me to save 3-4 minutes in the first half of my journey in order to realise a 25-minute saving overall. Note that there are plenty of delays in this process. I have no control over some of these, such as random traffic hold-ups, or the actual arrival time of the next tram. But I do have control over some, and during the course of a couple of weeks I have developed the following micro-habits:

  1. During the afternoon, as I use things for the last time I put them away in my locker. This is a gamble, but most of the time it turns out to be okay. And it saves me up to 2 minutes of packing-up time at 1600, which means I have a slightly greater chance of catching an earlier tram.
  2. Trams are scheduled every 12 minutes, so the expected mean wait time at any stop is 6 minutes. If there is no tram waiting (or imminently arriving) at Media City, I walk to the next stop down the line, Harbour City. That’s because Harbour City is on TWO tram lines, and so more trams pass through that stop. The walk takes 3 minutes, and always pays off: either I get the next tram out of Media City, thus saving nothing and losing nothing, or I get an earlier tram on the Eccles line, thus removing up to 3 minutes of waiting from my journey.
  3. The rearmost door of the tram is always closest to the stairs when it pulls in to Piccadilly; so I always position myself close to that door, to minimise the time it takes to get up to the main concourse.
  4. I now ASSUME that I will be in time to catch the 1635, and I ASSUME that it will depart from platform 6 as usual. So I don’t go to the departure display and wait for it to show me current status; instead, I go directly to platform 6, a saving of 20-30 seconds.
  5. While walking to the tram I find my ticket and make sure it is in an easily accessible pocket. When I arrive on platform 6 I can then retrieve the ticket instantly and show it to the inspector without breaking step, thus saving 20 seconds.

As the departure of the 1635 is usually imminent at this point, these last two savings of up to 40 seconds can mean the difference between catching it and having to wait 13 minutes for the 1648. If there were no major delays on the tram I will now get on the 1635, which arrives in Macclesfield at 1655 (always). The remainder of my commute is unchanged, and so I arrive home at around 1705: a total commute of 65 minutes instead of 90.

Most people will, I think, do similar things to these when faced with a regular commute such as mine. As I mentioned above, I believe this approach can be applied to everything we do, and in particular I think there are huge savings to be made in software development. I believe that many, if not most, software teams could significantly reduce cycle times (the time taken to pull a single product improvement through from “idea” to “in use”) by developing micro-habits to eliminate waste. I believe that the most productive teams, and the most productive developers, already do this; and I am sure that these micro-habits make the best up to ten times as productive as the rest.

So what are the micro-habits that will make this huge difference for software development? Some of them are universal and coarse-grained:

  • use a refactoring tool,
  • use a version-control tool,
  • become test-infected,
  • write black-box behavioural tests at all levels,
  • pair program,
  • refactor mercilessly,
  • work outside-in, etc.

But I suspect that there are more that are specific to the developer, or to the team, or to the codebase, or to the technology being used. Some of mine are:

  • learn the keyboard shortcuts for all of your tools,
  • drink a lot of water,
  • pick pieces of work that can be committed to trunk at least every hour,
  • re-run the tests after every distraction,
  • commit all work before going home or throw it away,
  • automate tmux set-up,
  • automate towards a single-click build,
  • automate towards a single-click deploy, etc etc.

Do you have any stories showing how a few micro-habits have transformed a team’s productivity? There seems to be no written repository collecting such developer micro-habits together, and nor is there any obvious forum in which they are explored, shared and discussed. That seems a shame, and a huge waste. What should we as a community do about it?

[Credit to Sean Blezard for coining the term “micro-habit” when I explained to him the idea for this blog post.]

On conference formats

Matt Wynne just blogged about conferences that have a high proportion of curated content, and how that can seem to create an undesirable rock-star culture in which very few of the attendees participate actively. I didn’t attend the conference that sparked Matt’s thoughts, but I do tend to agree with his sentiments. Here are a few random responses of my own:

  • Stage-managed is much less engaging and enjoyable than spontaneous. This is one of the reasons I’m no longer involved with AgileNorth.
  • I like BoFs (Birds of a Feather sessions), which are more like round table discussions than presenter-led talks.
  • I like SPA’s randomly-selected buddy groups, in which a group of strangers meet occasionally to share their experiences of the conference so far.
  • I loved the hacking on raffle in response to Matt’s call-to-arms at the Scottish Ruby Conference in 2012.
  • The lightning talks were one of the highlights of this year’s #scotruby for me.
  • I was uncomfortable at SPA the year Kent Beck came to talk about XP, because the whole 3 days became the Kent circus; he was accompanied everywhere by 50 sycophants.
  • After Jim Weirich’s rspec-given talk this year I wanted to stand up and say “This excites me so much that later I will be re-coding the event_bus specs in this style. Who wants to pair?”, but there wasn’t the time. I wish every conf session would end with pairing challenges such as that.

Hearing new ideas from speakers is great; but discussing ideas and learning from friends and strangers is much greater. In my opinion.

The trouble with passwords

Recently I’ve been using other people’s computers a lot more often, because I’m working away from my home office most of the time. And because of this I noticed a serious flaw in the strategy I had used for choosing passwords: they were all the same! I had chosen a very strong password that was unguessable and yet memorable to me, but I had used it everywhere. So if anyone did ever guess it I would be screwed.

Time to change to a new strategy, I thought. So I looked at lastpass, 1password et al. But again, it seemed to me that I would ultimately be putting my entire faith in a single password (and a single supplier), which felt almost as unsafe as having the same password everywhere. So what to do? I need passwords that are strong, easy for me to remember, and different on every service I use. Oh, and they have to meet all the different and daft criteria that the various websites impose; some insist on at least 8 characters, some on at least 9; some insist on at least one digit; others require a capital letter; etc etc.

The only scheme I know of that meets all of these criteria is to use a memorable 2-3 word phrase, together with a keyword indicating the site holding the account. So I wrote a random phrase generator, and discovered that every ten refreshes or so I was pretty much assured of generating a phrase that stuck in my brain. (Unfortunately you then also need to sprinkle the password with numbers and punctuation, not to make it stronger but to meet those daft website rules.)

For example, I just ran the generator and got bloody tomato. So I might then use the different passwords Blo0dy-tomato+mates for facebook, Blo0dy-tomato+pic for instagram, etc. As long as the scheme for adding the site-specific key is personal and memorable this should be a better password strategy than I had before.
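The idea can be sketched in a few lines. This is not the generator mentioned above: the word lists here are tiny placeholders, the class name is made up, and the `forSite` joining rule is just one possible personal scheme. Any digits, capitals or punctuation needed to satisfy a site’s rules would still be sprinkled in by hand.

```java
import java.security.SecureRandom;
import java.util.Random;

// Hypothetical sketch of a two-word phrase generator plus a site-key scheme.
class PhraseGenerator {
  // Placeholder word lists; a real list would be far longer.
  private static final String[] ADJECTIVES = {"bloody", "quiet", "purple", "brave"};
  private static final String[] NOUNS = {"tomato", "river", "ladder", "pencil"};

  private final Random random;

  // The source of randomness is injectable so the class can be tested.
  PhraseGenerator(Random random) {
    this.random = random;
  }

  PhraseGenerator() {
    this(new SecureRandom());
  }

  // Picks one word from each list, e.g. "bloody tomato".
  String nextPhrase() {
    return pick(ADJECTIVES) + " " + pick(NOUNS);
  }

  // Joins a memorised phrase with a site keyword, e.g. "bloody-tomato+mates".
  static String forSite(String phrase, String siteKey) {
    return phrase.replace(' ', '-') + "+" + siteKey;
  }

  private String pick(String[] words) {
    return words[random.nextInt(words.length)];
  }
}
```

Keep refreshing `nextPhrase()` until a phrase sticks, then derive each site’s password from it with whatever personal rule replaces `forSite`.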

Feel free to use the generator yourself (but obviously I can’t be held responsible if your passwords are hacked). The current word lists can generate over 750 trillion different passphrases, and I’m adding more words all the time.

Query actions in Rails controllers

Recently some of my controller actions have taken on a definite new shape. Particularly when the action is a read-only query of the app’s state. Such actions tend to make up the bulk of my apps, and they can be simple because they are unlikely to “fail” in any predictable way. Here’s an example from my wiki app:

This has a couple of significant plus points: First, no instance variables, so both the controller action and the view are more readable and easier to refactor. Second, no instance variables! So there’s a clear, documented, named interface between the controller and the view. And third, this is so transparently readable that I never bother to test it.

The wiki used in the above action is a repository, built in a memoizing helper method that most of the controllers use:

In this case the correct kind of repository is created for the current user, and all of the other code in this request sits on top of that. So the controller action, helped by the memoized repository builder, effectively constructs an entire hexagonal architecture for each request, and the domain logic is thus blissfully unaware of its context.

Here’s a slightly bigger example. This is for a page that shows a variety of informative graphs about the wiki; and because I may want to re-organise my admin pages in the future, each graph’s data is built independently of the others:

That’s the most complex query controller action I have, and I maintain that it’s so simple I don’t need to test it. Would you?

Why shorter methods are better

TL;DR
Longer methods are more likely to need to change when the application changes.

The Longer Story
Shorter methods protect my investment in the code, in a number of ways.

First: Testability. I expect there to be a correlation between method length (number of “statements”) and the number of (unit) test cases required to validate the method completely. In fact, as method size increases I expect the number of test cases to increase faster than linearly, in general. Thus, breaking a large method into smaller independent methods means I can test parts of my application’s behaviour independently of each other, and probably with fewer tests. All of which means I will have to invest less in creating and running the tests for a group of small methods, compared to the investment required in a large method.
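A small made-up example may make the testability argument concrete. The `Pricing` class and its rules are invented for illustration. The long method makes two independent two-way decisions, so covering every combination takes 2 × 2 = 4 test cases. Split into short methods, each decision can be checked on its own with 2 cases, which is also 4 in total here; but with a third two-way decision the figures would be 8 against 6, and the gap keeps widening from there.

```java
// Hypothetical pricing rules, used only to compare one long method with short ones.
class Pricing {
  // Long version: two decisions tangled together in one method.
  static int totalLong(int subtotal, boolean member, boolean express) {
    int total = subtotal;
    if (member) {
      total -= subtotal / 10; // 10% member discount
    }
    if (express) {
      total += 5; // flat express surcharge
    }
    return total;
  }

  // Short versions: each decision can be tested independently of the other.
  static int discount(int subtotal, boolean member) {
    return member ? subtotal / 10 : 0;
  }

  static int shipping(boolean express) {
    return express ? 5 : 0;
  }

  // The combining method has no branches of its own.
  static int total(int subtotal, boolean member, boolean express) {
    return subtotal - discount(subtotal, member) + shipping(express);
  }
}
```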

Second: Cohesion. I expect there to be a strong correlation between method length and the number of unknown future requirements that will demand changes to that method. Thus, the investment I make in developing and testing a small method will be repaid countless times, because I will have fewer reasons in the future to break it open and undo what I did. Smaller methods are more likely to “stabilize out” and drop out of the application’s overall churn.

Third: Coupling (DRY). I expect that longer methods are more likely to duplicate the knowledge or logic found elsewhere in the same codebase. Which means, again, that the effort I invest in getting this (long) method correct could be lost if ever any of those duplicated pieces of knowledge has to change.

And finally: Understandability. A method that does very little is likely to require fewer mental gymnastics to fully understand, compared to a longer method. I am less likely to need to invest significant amounts of time on each occasion I encounter it.

The Open-Closed Principle says that we should endeavour to design our applications such that new features can be added without the need to change existing code — because that state protects the investment we’ve made in that existing, working code, and removes the risk of breaking it if we were to open it up and change it. Maybe it’s also worth thinking of the OCP as applying equally to methods.

A testing strategy

The blog post Cucumber and Full Stack Testing by @tooky sparked a very interesting Twitter conversation, during the course of which I realised I had fairly clear views on what tests to write for a web application. Assuming an intention to create (or at least work towards creating) a hexagonal architecture, here are the tests I would ideally aim to have at some point:

  • A couple of end-to-end tests that hit the UI and the database, to prove that we have at least one configuration in which those parts join up. These only need to be run rarely, say on CI and maybe after each deployment.
  • An integration test for each adapter, proving that the adapter meets its contract with the domain objects AND that it works correctly with whatever external service it is adapting. This applies to the views-and-controllers pairings too, with the service objects in the middle hexagon stubbed or mocked as appropriate. These will each need to run when the adapters or the external services change, which should be infrequent once initial development of an adapter has settled out.
  • Unit tests for each object in the middle hexagon, in which commands issued to other objects are mocked (see @sandimetz’s testing rules, for which I have no public link). And for every mocked interaction, a contract test proving that the mocked object really would respond as per the mocked interaction (see @jbrains’s Integrated tests are a scam articles). These will be extremely fast, and will be run every few seconds during the TDD cycle.
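To make the pairing in the last bullet concrete, here is a minimal sketch in plain Java. All of the names (`PageStore`, `PageViewer`, `InMemoryPageStore`) are hypothetical, and to avoid depending on a mocking library it uses a hand-written stub of a query rather than a mocked command; the principle is the same. The unit test assumes how the collaborator answers, and the contract test checks that the real collaborator really does answer that way.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Port that the middle hexagon depends on (hypothetical).
interface PageStore {
  Optional<String> bodyOf(String title);
}

// Domain object under unit test.
class PageViewer {
  private final PageStore store;

  PageViewer(PageStore store) {
    this.store = store;
  }

  String show(String title) {
    return store.bodyOf(title).orElse("No such page: " + title);
  }
}

// Real collaborator, which must honour what the unit test assumed.
class InMemoryPageStore implements PageStore {
  private final Map<String, String> pages = new HashMap<String, String>();

  void save(String title, String body) {
    pages.put(title, body);
  }

  public Optional<String> bodyOf(String title) {
    return Optional.ofNullable(pages.get(title));
  }
}

class PageViewerTests {
  // Unit test: the store is stubbed to report a missing page.
  static void unknownPageShowsMessage() {
    PageStore stub = title -> Optional.empty();
    check("No such page: home".equals(new PageViewer(stub).show("home")));
  }

  // Contract test: the real store also answers "empty" for a missing page.
  static void storeReturnsEmptyForUnknownTitle() {
    check(!new InMemoryPageStore().bodyOf("home").isPresent());
  }

  private static void check(boolean condition) {
    if (!condition) throw new AssertionError();
  }
}
```

If the real store ever changed to throw an exception for an unknown title, the contract test would fail and flag that the stubbed unit test no longer describes reality.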

I’ve never yet reached this goal, but that’s what I’m striving for when I create tests. It seems perfectly adequate to me, given sufficient discipline around the creation of the contract tests. Have I missed anything? Would it give you confidence in your app?