testing early for the first time
May 27, 2006
“Testing early for the first time” is a real-life story from Mike Kelly, telling what happened in his group when they introduced testing during the normal development cycle.
“On schedule, several of the developers indicated their code was complete at a status meeting. I configured my local server to run the code and began my testing. Surprisingly (or perhaps not) this caused some problems.”
It showed up problems in the code, half-truths in project tracking and problems with configuration management. But as the initial shock wore off, and the developers became used to the idea, some powerful transformations occurred.
“We had automated hundreds of tests at the component level. We had validated many of the mapping documents. We had become active participants in design meetings (not all of them, but enough that we felt like we had a small victory). And by the end of our testing, we had developers (again not all of them) coming over to us and asking us to review their work. After the initial growing pains of the added visibility to the status of the code, most of the emotions died down, and we were able to release some great software.”
It’s a great story, and well worth a read. And please tell me about other stories like this, in which development groups learn to be agile by dipping a toe in the water…
bug bounties
May 19, 2006
Brent Strange reports that some major companies - notably Microsoft, Mozilla and VeriSign - have begun rewarding their testers with cash for finding serious defects prior to release. It seems to me that this approach is seriously flawed, in at least two respects.
First, it further promotes the traditional antagonism between developers and testers. There’s now a clear reward for testers to find the developers’ work wanting. How does that help to build trust or teamwork?
And second, it rewards the testers for not helping the developers get it right sooner. Sure, the cash will be less than the cost of releasing with a serious defect, but it will also be less than the cost of rework due to finding the defect late in the value stream.
The solution? Both developers and testers should be rewarded when the pre-release testing finds no defects.* Instead of rewarding antagonism, reward collaboration. And reward the reduction in rework. Have the testers engaged at the front of the value stream, creating automated self-checking tests that will help the developers get it right - and complete - first time.
* (Footnote: this needs to be balanced by penalties of some kind if the pre-release tests are skimped in any way!)
use tests as a failsafe
January 25, 2006
Yesterday I wrote about tolerating the red bar, when a few scary tests fail only occasionally. And today it strikes me that one of the contributors to the persistence of this situation is our tool-set.
As I said, we run the tests, get a red bar (sometimes), and then open up the test list to check that only the expected tests have red dots against them. If we see nothing untoward we just get on with the next task. But of course we aren’t really looking at the reason for the failure. Perhaps the tool itself is making it too easy for us. Or perhaps we interpret those dots too liberally?
So here’s the thought: What would life be like if our tools actually prevented check-ins while there are any failing tests? This would effectively “stop the line” until the problem was sorted out. And it would force us to address each problem while that part of the codebase was fresh in our minds.
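To make the thought concrete, here is a minimal sketch of such a gate in Python. It is an illustration under stated assumptions, not a description of our actual tool-set: the names `checkin_allowed` and `main` are hypothetical, and the default test command (standard-library `unittest` discovery) is only a stand-in for whatever runs your suite.

```python
import subprocess
import sys


def checkin_allowed(test_command):
    """Run the test suite; allow the check-in only if it exits with status 0."""
    result = subprocess.run(test_command)
    return result.returncode == 0


def main(argv):
    """Return 0 to let the check-in proceed, 1 to block it."""
    # Assumption: with no arguments, fall back to unittest discovery.
    test_command = argv[1:] or [sys.executable, "-m", "unittest", "discover"]
    if checkin_allowed(test_command):
        return 0
    # Any failing test "stops the line" until the problem is sorted out.
    print("Check-in blocked: fix the failing tests first.", file=sys.stderr)
    return 1
```

One way to wire this up, assuming a version control tool that aborts a check-in when a hook script exits with a non-zero status (Git's `pre-commit` hook behaves this way), is to end the hook script with `sys.exit(main(sys.argv))`.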
I also suspect that peer pressure (“hey, I can’t check anything in now!”) might quickly cause us to develop a culture in which we tried to eradicate the root causes of test failures. Instead of relying on CruiseControl to “deodorise” our stinky practices…
(If you’ve tried this I’d love to hear your experiences. Drop me a line.)
open quality - revisited
November 10, 2005
In “Open Quality Recognized”, Mark DeVisser adds to my discussion of Agitar’s Open Quality initiative. Has anyone else out there taken up the challenge?
open quality
September 8, 2005
Today on the Yahoo XP list, Kent Beck posted this link to Agitar’s open quality initiative. I applaud their openness, and would definitely encourage all other development groups to follow suit. (There’s a small danger, of course, that publication of such “dashboards” can be manipulated for the purposes of chest-thumping. I’m sure that isn’t the case with Agitar.)
It seems to me that the mere act of putting together the dashboard publication scheme would provide a group with important insights and impetus. And being able to “compare” numbers across the community offers both security (“Phew! most teams are as bad at UI testing as us”) and challenges (“Blimey, most folks test over 95% of their classes”). Perhaps every group that publishes a dashboard page should make it easy to Google - maybe we could agree on standard phrases to include on the page…? (with a link from the C2 wiki to the Google search, so that the standard is enshrined in a working implementation)
Update, 10 nov 05
Agitar’s Mark DeVisser has commented on this post.
a second pair of eyes
August 11, 2005
I’ve just been working with a team which has a pairing policy: every item of code must have been seen by two pairs of eyes before it can be checked in. It doesn’t work.
The effect of the policy is to replace pair programming - instead developers do a “pair check-in” at the end of each development episode. So a developer will beaver away working on a feature for a day or so, getting it right, making it work, passing all the tests. And then he’ll call over to another team member to request a “pair check-in”. The other team member comes to the developer’s station and is walked through the changes in the version control tool. And then the code is checked in and the two team members part company again.
The problem here is that the process sets the two people up to be in opposition: the developer is effectively asking for approval, instead of asking for help. It’s natural for the developer to feel a sense of ownership, because he’s worked hard to get that code complete and correct. Not many people can graciously accept negative feedback after all that hard work.
It can also be hard for the reviewer - the “second pair of eyes” - to come up to speed quickly enough. The developer knows these changes intimately, but the reviewer is being asked to understand them cold. He has little chance of being effective in that situation.
So this process has all of the demerits of Inspections, with none of the advantages. The team would be more effective adopting true pair programming, I feel.
better tester, worse code
July 26, 2005
I’ve recently been observing a couple of very similar development teams who had one major difference: The tester in Team 1 was very good at his job, whereas the tester in Team 2 wasn’t. And as a result, the developers in Team 1 produced significantly poorer code than those in Team 2! It turns out that the very good tester was highly trusted by the rest of his team - so much so that they were happy to delegate complete responsibility for product quality to him. In turn, this freed them to churn out code at an alarming rate, without regard to whether it worked particularly well.
Team 2 ended up hardly using their tester, preferring to rely on TDD to catch most defects before they made them. But they were able to release product at the drop of a hat, because they knew and trusted the quality of their code at all times. On the other hand, Team 1 required over a week of full-team manual testing and defect fixing before they were prepared to believe they were ready to release.
Team 1 were applauded for their speed of coding, and for the obviously great work of their tester. Defects? Rework? Manual tests? They’re a fact of life in software development aren’t they? Just an overhead we have to live with. But look how fast we go!
By comparison, Team 2 were castigated for their slowness. They did very little fire-fighting. Releases were a non-event, an anti-climax almost. Unnoticed, unheralded, they produced working product on a weekly basis.
I’m sure none of this is a revelation to you, but to see it in action is quite impressive.
what is “quality”?
February 3, 2005
Just what is software quality? I hear talk of software being of “high” or “low” quality, as if there is one Quality that something can possess to varying degrees. Then I see Specifications that break that Quality down into Maintainability, Supportability etc. Each of these is a boolean attribute that is either present or not - high Quality is the possession of all of these “ilities”.
Back in the days of TQM I recall the mantra “quality is fitness for purpose.” At the time this struck us fresh graduates as a radical and deep thought, but it now seems merely a more bullish version of Gerry Weinberg’s “quality is value to some person.” Of course, every person - every stakeholder - will assign different values to a software system. A developer will want different things than a user, or the user’s manager, or the support engineer. I guess we’re back to the “ilities” - an expression of how the design must take account of each stakeholder’s needs.
fix everything except scope
December 9, 2004
I’ve just read a process guideline for managers who are unfamiliar with iterative development. It states that each iteration must have a detailed plan, and that in creating that plan the manager will be making “constant trade-offs between scope, quality and schedule”. (The whole document seems to be written with the intention of making iterative development seem more difficult than waterfall! Is there a subtle agenda, attempting to steer managers down familiar paths?)
The advice seems reasonable at first glance. After all, if we’re behind in adding the key feature for this iteration, we’ve always had the options of cutting it out, or not testing it, or slipping the date - right? I don’t think so. We should always reduce the scope of the iteration.
discontinuous integration
October 8, 2004
Today I discovered that the corporate software development process here is iterative! Which means that there’s an “iterative” design/code phase, followed by an “iterative” integration/test phase… (This is where a sense of humour becomes a survival mechanism.)
My department is involved in the integration/test phase for every project. And in my quest to find ways to begin measuring throughput, naturally I asked how long a project will typically spend in that phase. It turns out that a rare few can get done in a couple of days, most require 2-6 weeks, and at least one project took over four months to successfully integrate.
I wonder if I’ve bitten off more than I can chew…






