A couple of days ago I asked the Twitterverse to pick its favourite CI tool from four I had selected. Here are the results:
No real surprises there, I think. There was one additional vote for CircleCI, and a bit of a side discussion about make, but otherwise completely predictable.
Does this tell us anything, other than that I had 5 minutes with nothing to do on Friday evening? Maybe it suggests that very few people have used more than one of these tools? And maybe that’s due to very little cross-over between the .Net and Java universes? Who knows. Pointless.
A few days ago I noted that Shigeo Shingo, one of the founders of lean manufacturing, once said that testing to find problems is waste, whereas testing to prevent problems is not. Today I’ve been helping out with configuring an instance of CruiseControl, one that runs three or four projects and checks that no-one has broken any of them. This is testing after the fact, after check-in, after the developer has written his unit tests and his code. So is it waste?
I think it’s quite a close call, but on balance I’ll differ with James Shore and answer “no”. There’s no getting away from the fact that CruiseControl is reviewing code after the fact. After commit. We have to break our project before CruiseControl helps us out. And I wonder whether the presence of a good, automated tester downstream can sometimes engender a little too much comfort. I know I’ve been guilty of saying “Oh let’s just check this in — CruiseControl will soon tell us if we’ve made a mistake.”
But in CruiseControl’s favour, it certainly does catch the most frequent problem I’ve seen with (developers’ use of) version control: forgetting to commit a new source file. And by running a complete, clean build, it can throw up environmental or portability problems not seen when developers work for long periods in stable, cosy workstation setups. So it’s a fresh pair of eyes, doing a quick review of the project’s latest state. Possibly akin to automated checks of a sub-assembly in a lean manufacturing plant, it catches some problems early in the flow.
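The "fresh pair of eyes" effect comes from building in a clean environment that contains only what's actually in version control. A minimal sketch of that idea (the function and data shapes here are my own invention, not CruiseControl's actual mechanism): stage the committed files into a fresh directory, and anything the build needs that was never committed shows up immediately, even though the build works fine on the developer's own workstation.

```python
import tempfile
from pathlib import Path

def clean_build(committed_files, required_files):
    """Simulate a clean checkout and report what's missing.

    committed_files: dict mapping relative path -> file contents,
                     i.e. what's actually in the repository.
    required_files:  paths the build expects to find.
    Returns a sorted list of paths the build needs but version
    control doesn't have.
    """
    with tempfile.TemporaryDirectory() as workspace:
        root = Path(workspace)
        # "Check out" only what was committed -- nothing from any
        # developer's cosy local working copy reaches this build.
        for rel_path, contents in committed_files.items():
            target = root / rel_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(contents)
        return sorted(p for p in required_files
                      if not (root / p).exists())

repo = {"src/app.py": "print('hello')"}
needed = ["src/app.py", "src/helpers.py"]  # helpers.py never committed
print(clean_build(repo, needed))  # -> ['src/helpers.py']
```

The forgotten `helpers.py` is exactly the "frequent problem" above: it exists on the developer's machine, so his build passes; it doesn't exist in the repository, so the clean build fails.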
Maybe in the end it’s about the team. Installing CruiseControl is a great first step on the road to being “always release ready”, and definitely helps teams transition to agile development. And after a while, I wonder whether there’s a level of team maturity at which it may no longer be needed.
“Testing early for the first time” is a real-life story from Mike Kelly, telling what happened in his group when they introduced testing during the normal development cycle.
“On schedule, several of the developers indicated their code was complete at a status meeting. I configured my local server to run the code and began my testing. Surprisingly (or perhaps not) this caused some problems.”
It showed up problems in the code, half-truths in project tracking and problems with configuration management. But as the initial shock wore off, and the developers became used to the idea, some powerful transformations occurred.
“We had automated hundreds of tests at the component level. We had validated many of the mapping documents. We had become active participants in design meetings (not all of them, but enough that we felt like we had a small victory). And by the end of our testing, we had developers (again not all of them) coming over to us and asking us to review their work. After the initial growing pains of the added visibility to the status of the code, most of the emotions died down, and we were able to release some great software.”
It’s a great story, and well worth a read. And please tell me about other stories like this, in which development groups learn to be agile by dipping a toe in the water…
I don’t think I had heard the phrase “going ugly” until I read Go Ugly Early by Dwayne Melancon today. To quote Dwayne:
“This concept involves releasing early iterations of your products so you can allow your customers to interact with them and provide feedback. I’m not talking about releasing unstable or buggy products – I’m talking about releasing stable products that have limited functionality, but which telegraph the shapes of things to come.”
Which, of course, is exactly the approach promoted by all of the agile software development methods. Sort of.
In fact, the agile community has amazingly little to say about beta programmes and the like. We talk a great deal about the OnsiteCustomer (aka ProductOwner or ChiefEngineer or ProductDirector) who gives rapid feedback to the developers on a daily basis. But that’s not the same as giving a try-out version to a few key users. If you’re developing something you would use yourself, then trying out your own dog food early is very easy and relatively risk-free. But when the target user is potentially a paying customer, it can be all too easy to perceive the risks as outweighing the advantages. What if they laugh at our half-baked ideas? What if they steal the idea and take it to our leading competitor? Frankly, I don’t believe any of these are real issues. Not compared to the benefits of using early-access free stuff to forge a long-term relationship.
The ProductDirector’s rapid feedback and direction is essential in the microcosm of the development shop itself. But in addition, giving genuine users the chance to genuinely use parts of your new product under genuine conditions can be … genuinely useful – both to you and to them. Your user (or potential user) is being given the opportunity to ensure that your next product fits them like a glove. Most will be prepared to invest a little time to get that. And if they laugh, you’ve learned something about your understanding of the market – and you’ve learned it without investing the whole kaboodle.
Of course, the partial system must be production quality. If it isn’t bug-free and trivial to install you will probably be dismissed out of hand. (Again you’ve learned something – but it was something you already knew and could have avoided.) So this isn’t prototyping, it is real product development, in small chunks. The opportunity only arises because of honest and thorough use of agile practices such as TDD, ContinuousIntegration, DailyBuild etc. (Maybe that’s where the perceived risks come from: this is all new. Waterfalling never gave us the chance to create this kind of relationship with our customers.)
Do you have a story where going ugly early saved your bacon? I’d love to hear it.
This week I was co-opted to act as a temporary project manager for three weeks while someone’s on holiday. I’ve spent much of the week shadowing the PM I’m replacing, and it’s now 4:15 on Friday afternoon. I’m packing up my laptop and beginning to think of fish and chips, when I see the said PM and the technical lead in conversation a few desks away. I have much to learn, so I can’t pass up an opportunity to listen in and learn some more.
It turns out that they’re discussing Monday’s milestone, in which some parts of the project will be handed over to the live deployment team. As they check off the modules in this mini-release, it becomes apparent to them that there’s one missing! The technical lead knew, at the start of the design phase (yes, I know), that it was needed. But for some reason it had never made its way into the project manager’s plan. And the developers only worked on things that were in the plan. So it was never developed! Very honourably, the manager took the blame and set about finding a solution…
I’ve been wondering for a few hours now how such a blatant error could happen (I know it isn’t unique, and I surmise that problems of this nature probably cost this organisation millions annually). I’ve come to know these two people a little in recent days: the manager is proudly non-technical – it’s his job to schedule other people’s work, not to understand it; and the designer is proudly technical – it’s his job to design solutions, not to run projects. This corporation sees nothing wrong with that, and in fact selects people for precisely these characteristics. So the manager’s role was to create a plan from technical input he couldn’t understand; and the technician was not required to review that plan for correctness. And that’s part of the problem: the culture here is one of specialists in silos. In this case something fell through the cracks (in many ways it’s remarkable that they actually spotted it before live deployment).
But why did neither the manager nor the designer develop any interest in what the other was doing? Because another part of the reason for this failure is that each of them, indeed everyone here, is always concurrently working on two, three, even twenty projects! No-one has enough time to care about what they are producing.
Eli Goldratt would have a field day here…
Today I discovered that the corporate software development process here is iterative! Which means that there’s an “iterative” design/code phase, followed by an “iterative” integration/test phase… (This is where a sense of humour becomes a survival mechanism.)
My department is involved in the integration/test phase for every project. And in my quest to find ways to begin measuring throughput, naturally I asked how long a project will typically spend in that phase. It turns out that a rare few can get done in a couple of days, most require 2-6 weeks, and at least one project took over four months to successfully integrate.
I wonder if I’ve bitten off more than I can chew…
About four years ago, I was leading one of the teams in a 7-team software project. This time it was my turn to be the ‘integration manager’ for the next release. So very early in the release cycle I created a batch job that performed a nightly build of everything in the current codebase. And if anything had automated tests the batch job ran them too. I wanted to de-risk the release by finding most integration problems early, before ‘codefreeze’ at the end of the development phase.
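The batch job described above can be sketched roughly as follows. This is a simplification in Python (the original was a batch script against a real codebase; the project/status shapes here are purely illustrative): build everything in the current codebase, and run the automated tests for any project that has them.

```python
def nightly_build(projects):
    """Build every project; run its automated tests if it has any.

    projects: dict mapping project name -> {"build": callable,
              "tests": callable or None}, where each callable
              returns True on success.
    Returns a {project name: status} report for the morning.
    """
    report = {}
    for name, project in projects.items():
        if not project["build"]():
            report[name] = "BUILD FAILED"
        elif project.get("tests") is None:
            # Not everything had automated tests back then.
            report[name] = "BUILD OK (no automated tests)"
        elif project["tests"]():
            report[name] = "OK"
        else:
            report[name] = "TESTS FAILED"
    return report
```

The point of running it nightly was that any status other than "OK" would surface the morning after the offending check-in, rather than at codefreeze.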
The nightly build was a great success. Most nights the tests all passed, and only rarely did anyone check in code that broke the build itself. But the integration and system testing for this release was still a nightmare of panic fixes and hurried tests.
And a couple of releases later I worked out why: the existence of the nightly build had not changed anyone’s habits for the better. The other teams only ever checked in code on the last day of each release’s development phase, just in time for system testing! In fact I had made things worse than they were before: the fear of breaking the overnight build had actually caused many check-ins to be delayed until everything was “finished”. (And so our ClearCase repository crashed on the last day of every release’s development phase, due to overload.) Instead of providing feedback on development progress, the nightly build had actually reduced feedback and created more integration problems than we had before!
I’ve always advocated the daily build as one of the things that should be implemented first when moving a project towards agility. But I now realise that its success depends on a precondition: the development team (and its managers) must understand that it is possible to develop a system feature by feature. In the agile community we know all about the benefits of doing this, but I suggest that most software managers out there don’t even believe it’s possible. Most of my peers on the project above thought it was impractical, or even laughable, to organise their work so that we had a working system at the end of each day. Why go to all the extra effort of finding pieces that could be added without breaking anything, and that were not dependent on something that was scheduled to be done sometime later…?
Update, 9 Sept 04
Mike Clark points out an interesting article in this month’s Better Software magazine on this very topic. There is a real difference between Continuous Integration and Daily Build, and the psychological impact of that difference will drive teams in quite different directions.