geekup nottingham

Tomorrow evening (Nov 1st, 2010) I’ll be speaking at GeekUp Nottingham, reprising my flow session from this year’s AgileNorth conference. And at some point in the evening I’ll be giving away a signed copy of my book Refactoring in Ruby.

If I show slides tomorrow, it’ll be the set I showed at AgileNorth, plus a couple I’ve added just in case there’s no flipchart.

Looking forward to a great night, and to fascinating discussion as always.

rocks into gold

My very talented friend Clarke Ching has self-published his second novel Rocks into Gold (his first, Rolling Rocks Downhill, is due out later this year).

Rocks into Gold is a “parable for software developers who want to survive — and then thrive — through the Credit Crunch”. If you’re a subscriber to this blog you’ll probably know the book’s main message already; but read it anyway, because you probably also know someone whose project, or job, or organisation might just benefit from Clarke’s excellent re-telling — buy them a copy.

inherent simplicity

This week I’ve been doing a lot of reading around Goldratt’s latest big idea: inherent simplicity. The premise is that any complex system (Goldratt typically considers companies here) always has an “inherent simplicity”. Once found, this simple model of the complex organism can be used to reason about it and create breakthrough analyses of, say, the route to exponential growth in profits.

The more I read, the more I realised that this idea has formed the basis of much of my career. Having a PhD in mathematics, I have always looked for — and enjoyed looking for — the elegant and simple solution at the heart of any problem. And in my work life I’ve applied the same aesthetic to solving business problems. Here’s a neat example from a consulting gig in the 1990s…

A group of us (designers, developers, business analysts) had been tasked with re-engineering an old suite of applications used by a parcel delivery firm. The idea was to replace a bunch of disparate applications with a single, distributed enterprise solution, and at the same time evolve from “parcel tracking” to “parcel management” (whatever that means). We had a 200-page spec describing the business rules for parcel management, and it was unbelievably complex. A parcel might have a barcode, or a hand-written label, or no legible markings at all. It might be waiting to be loaded on a van to its final destination, or waiting to be sorted at the hub, or in the wrong place entirely. If it was in the wrong place, it may have been labelled correctly or incorrectly, legibly or illegibly. And so on. We were becoming bogged down in the detail, and everywhere we looked there was more. The object model for the new application was looking like spaghetti, and the mess got larger each day.

So one day, frustrated by the complexity, I took a break from facilitating the analysis sessions and took the problem home, for some peace and quiet. And I found the system’s inherent simplicity: look at everything from the parcel’s point of view. Where I’ve been doesn’t matter; whether I’m in the right or wrong place now doesn’t matter. All that matters is where I should go next. And for each parcel, that’s unique. Given any parcel located anywhere in the system, there’s a “best” route to its destination, and therefore a single next place to which it should be moved. The means of transport (hand, van, plane, forklift, …) is also unimportant at this level. All of the business rules lived inside Parcel.get_next_location(), and everything else was implementation detail.
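If it helps to see the shape of that in code, here’s a minimal Ruby sketch. The names and the routing table are invented for illustration (the real model is long gone); only get_next_location comes from the story above.

    # A hypothetical sketch, not the original model. The network knows the
    # "best" next hop between any two locations; barcodes, labels and
    # misrouted parcels all become implementation detail behind it.
    class Network
      def initialize(next_hops)
        @next_hops = next_hops # { [from, to] => next location }
      end

      def next_hop(from, to)
        @next_hops.fetch([from, to])
      end
    end

    class Parcel
      def initialize(network, location, destination)
        @network = network
        @location = location
        @destination = destination
      end

      # The inherent simplicity: wherever I am, rightly or wrongly,
      # all that matters is where I should go next.
      def get_next_location
        return @destination if @location == @destination
        @network.next_hop(@location, @destination)
      end
    end

    network = Network.new({
      ["hub", "12 Acacia Ave"]     => "depot_3",
      ["depot_3", "12 Acacia Ave"] => "12 Acacia Ave"
    })
    Parcel.new(network, "hub", "12 Acacia Ave").get_next_location # => "depot_3"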

It took the other members of the analysis team a couple of days to grasp this simplicity and peel away the layers of complexity. And then the project was canned, so this elegant solution was never implemented.

Anyroadup, you get the idea…

downstream testing implies a policy constraint

As usual, it takes me multiple attempts to figure out what I really want to say, and how to express myself. Here’s a bit more discussion of what I believe is implied by downstream testing:

The very fact that downstream testing occurs, and is heavily consuming resources, means that management haven’t understood that such activity is waste. (If management had understood that, then they would re-organise the process and put the testing up front — prevention of defects, instead of detection.) No amount of tinkering with analysis or development will alter that management perception, and therefore the process will always be wasteful and low in quality. So the constraint to progress is management’s belief that downstream testing has value.

can downstream testing ever be the bottleneck?

Everywhere I go I find managers complaining that some team or other is short of staff; and (so far) that has always turned out to be a mirage.

TOC’s 5 Focusing Steps say that adding resources to the bottleneck is the last thing one should do. Before that, a much more cost-effective step is to “exploit” the bottleneck — i.e. to try to ensure that bottleneck resources are only employed in adding value. So in the case where testing is the bottleneck, perhaps one should begin by ensuring that testers only work on high-quality software; because testing something that will be rejected is waste.

And from the Lean Manufacturing camp, Shigeo Shingo (I think) said something along the lines of “testing to find defects is waste; testing to prevent defects is value”. Which seems to imply that waterfall-style testing after development is (almost) always waste.

Which in turn implies (to me at least) that testing in a waterfall process can never be the bottleneck. The bottleneck must be the policy that put those testers at that point in the flow. Does that sound reasonable to those of you who know a lot more about this kind of stuff than I do?
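For what it’s worth, the distinction is easiest for me to see in code. Here’s a minimal sketch using Ruby’s Test::Unit (my illustration, not Shingo’s): written before the code, the test prevents the defect from ever existing; run downstream, weeks after development, the same test can only detect it.

    require "test/unit"

    # Test first: prevention. The defect never enters the flow.
    # The same assertion run long after development is mere detection.
    class VatTest < Test::Unit::TestCase
      def test_adds_standard_rate_vat
        assert_in_delta 117.50, Vat.new(0.175).add_to(100.00), 0.001
      end
    end

    class Vat
      def initialize(rate)
        @rate = rate
      end

      def add_to(net)
        net * (1 + @rate)
      end
    end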

why YAGNI acts to EXPLOIT the bottleneck

Clarke asked me to explain my earlier throw-away remark that YAGNI forms part of the EXPLOIT step in moving the bottleneck away from development, so here goes…

YAGNI (You Aren’t Gonna Need It) is an exhortation from the early days of XP. It has been discussed and misunderstood a great deal, so I’m not going to get into the nuances of meaning here. For our purposes, it reminds the developer not to work on features or generalisations that may be needed some day, telling him instead to focus his present efforts on delivering only what he knows is the current requirement. (In the interests of brevity, I’ll refer below to YAGNI only in terms of added behaviour, and I’ll use the word “feature” for any fragment of any kind of behaviour; all other forms of YAGNI are assumed.)
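To make that concrete, here’s an invented example in Ruby. Suppose today’s requirement is simply to display prices in pounds sterling:

    # Crystal ball version: speculative support for currencies that
    # nobody has asked for. The symbol table is pure inventory.
    class SpeculativePrice
      SYMBOLS = { gbp: "£", usd: "$", eur: "€" }

      def initialize(amount, currency = :gbp)
        @amount = amount
        @currency = currency
      end

      def to_s
        format("%s%.2f", SYMBOLS.fetch(@currency), @amount)
      end
    end

    # YAGNI version: exactly today's requirement, and nothing more.
    class Price
      def initialize(amount)
        @amount = amount
      end

      def to_s
        format("£%.2f", @amount)
      end
    end

    Price.new(4.5).to_s # => "£4.50"

The difference looks trivial at this scale; multiplied across a whole codebase and months of guessing, it produces the three effects described below.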

(In my practice I use a similarly attention-grabbing soundbite. Whenever I see a developer do something “because it may be needed in the future” I accuse him of crystal ball gazing. I remind the whole team that it can be risky and dangerous to get your balls out, and that seems to help the message stick. Other times there’s an embarrassed silence.)

Writing crystal ball code has three effects: in the present moment, it means that the developer is spending current time investing in one of many possible futures; in the period from now until that possible future, it means that there is code in the system that doesn’t need to be there; and when the future arrives, it may look different from the one the developer predicted.

First, then, crystal ball code uses up current development time. This is bad when development is the bottleneck, when batch sizes are relatively small, when development order has been defined in terms of business value, and when feature cycle time is a KPI. The time spent developing a crystal ball feature will delay the current batch and all batches up to the imagined future. There is a tiny chance that development of that future batch will be faster (see below), but all interim ROI (for example) will be reduced by the delay introduced right now.

Second, the crystal ball code represents inventory, and it has a carrying cost. This code, which may never be required by the end user, must always build, integrate and pass all tests; if ever it doesn’t, time must be spent fixing it. Furthermore, a larger codebase will always require more time and effort to understand and navigate (think of having to drive around piles of inventory in order to fetch anything, or of the lean practice of 5S). Even if the guess turns out to be correct, the additional carrying cost of this inventory will slow down the development of all batches of features between now and the imagined future.

Third, the developer’s guess may be just plain wrong. Either the imagined “requirement” is never requested, or it is requested but by that time the codebase is radically different from what it is now. The developer may have to spend time removing the feature (for instance if it would confuse or endanger the user) or completely re-designing it to match how reality turned out. It is assumed that the “wow, that’s exactly what we needed” outcome is sufficiently unlikely that the costs of the other outcomes dominate.

So YAGNI is based on a few core assumptions:

  • The product is to be built incrementally in batches of features
  • Each increment should be potentially shippable in terms of quality and cohesiveness
  • It is hard to predict what features will be requested in later batches
  • It is hard to predict what future code may look like
  • Development is the bottleneck
  • Speed of development is crucial
  • The present value of current features is higher than the future value of future features

Under these conditions, YAGNI is part of the EXPLOIT step because it helps to maximise the amount of current development effort going into delivering current value.

TOC and YAGNI

My apologies if this has been said or written a thousand times before: YAGNI is XP’s way of exploiting the constraint.

Which means that XP, and hence most agile methods, are set up on the assumption that the development team is — or soon will be — the bottleneck. And having identified that bottleneck, our first task is to EXPLOIT it — that is, we make sure that the bottleneck resource only works on stuff that contributes to overall throughput. YAGNI and test-driven development do that. Oh, and a relentless pursuit of quality, so that the bottleneck doesn’t have to spend time on rework. And effective communication, so we get to spend more time developing and less time writing documents and attending meetings. And tight feedback loops, so that we can identify which stuff is valuable and which isn’t.

Next we must SUBORDINATE the whole flow to the pace of the bottleneck. Fixed-length iterations help us to measure that pace, and the various forms of planning game and iteration commitment help to prevent work arriving too fast.

And only when all of that is working well is it safe to ELEVATE the constraint, perhaps by expanding the size of the team. I’m fairly sure I’ve never seen a real-life case in which this step was required. For most of my engagements, successfully exploiting the bottleneck caused the constraint to move elsewhere; and in the rest, the SUBORDINATE step revealed a deeper constraint elsewhere in the organisation.

insurance for software defects

The more I think about it, the more astonished I become. Maintenance contracts for (bespoke) software: buying insurance against the possibility that the software doesn’t work.

I know the consumer electronics industry does the same, and I always baulk at the idea of spending an extra fifty quid in case the manufacturer cocked up. I wonder what percentage of purchasers buy the insurance? And I wonder what percentage of goods are sent back for repairs? Perhaps the price could be increased by 10% and all defects fixed for free. Or perhaps the manufacturer could invest a little in defect prevention.
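To put some illustrative numbers on that (mine, entirely invented): suppose a £500 appliance carries a £50 insurance premium, and one unit in twenty comes back for a £100 repair. The expected repair cost is then £5 per unit sold, so a price rise of just 1% would fund fixing every defect for free; the other £45 of the premium is margin on the customer’s fear.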

It seems to me that software maintenance contracts are an addiction. Software houses undercut each other to win bids, and then rely on the insurance deal to claw back some profits. So no-one is incentivised to improve their performance, and in fact the set-up works directly against software quality. Perhaps it’s time now to break that addiction…

If a software house were able to offer to fix all defects for free, would that give them enough of a market advantage to pay for the investment needed to prevent those defects? Is “zero defects or we fix it for free” a viable vision? (Does any software house offer that already?) And how many software companies would have to do it before the buyer’s expectations evolved to match?

As an industry, do we now know enough to enable a few software houses to compete on the basis of quality?