This is not my idea but since Joakim hasn’t blogged about it, I almost feel obliged to.
Some time ago Joakim, Tobias and I were discussing and giving each other feedback on three upcoming lightning talks we wanted to present at the yearly Agile Sweden conference. Joakim had come up with a topic that had a lot to do with the bugs, errors and mis-functionality that the users of a computer system have to deal with.
In his excellent talk Joakim ended up introducing a word, ‘Felyta’, which roughly translates to ‘Error surface’ or ‘Error area’. The word doesn’t translate very well in my opinion, but the concept is universal. I will try to explain it and then add to it.
The severity of errors
All old, new or changed functions that reach end users have the potential to either wreak havoc or add value, and we often hope it’s the latter. If the software doesn’t act the way we want and causes us trouble, we want that remedied quickly and the effects of the errors reverted equally fast.
Most users of software have at least one experience they remember where the software has gone crazy and caused a lot more grief and damage than the same manual process ever could have. We take working software for granted up until the point where the execution speed of software reminds us that it is really a double-edged sword. The effects of errors are something we want to minimize, and we try very hard to reduce ‘being wrong’.
Take a look at the graph in fig. 1. It represents the unwanted effect of a new software release where an error was introduced and then remedied.
Different errors, different effects?
Very severe errors are usually taken care of fast because we hate being wrong and in a state of uncertainty. But what about those nagging mis-features, how are they handled? A common pattern is the service release: fixes that are lumped together in a batch and delivered when they carry ‘enough value’.
The next picture, fig. 2, illustrates how a less severe but still unwanted effect adds up over time. From the illustration it is easy to see that the effect in the long run is pretty much the same as in the former case. The blue areas are roughly the same size in the two graphs and the only striking difference is in their orientation.
It’s the product that counts
Instead of looking at unwanted effects in one dimension, as more or less severe, I suggest we start describing an effect as an area. The effect is a function of severity over time, not just a single value. This model of how unwanted functionality affects our users and business is more powerful and closer to reality than just categorizing unwanted effects as more or less severe to have in production. It is the total area that counts: the error area, or maybe we can call that product the ‘Risk Area’.
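To make the idea of an area concrete, here is a minimal sketch in Python. It assumes that severity can be put on some numeric scale and sampled at points in time, and it approximates the area under that curve with the trapezoidal rule. The function name `risk_area`, the severity scale and all the numbers are my own illustrative assumptions; they are not taken from Joakim’s talk or from the figures.

```python
from typing import Sequence, Tuple


def risk_area(samples: Sequence[Tuple[float, float]]) -> float:
    """Approximate the area under a severity-over-time curve.

    samples: (time, severity) pairs sorted by time. The units
    (days, a 0-10 severity scale) are hypothetical.
    """
    area = 0.0
    # Trapezoidal rule: average severity of each interval times its length.
    for (t0, s0), (t1, s1) in zip(samples, samples[1:]):
        area += 0.5 * (s0 + s1) * (t1 - t0)
    return area


# Hypothetical numbers: a severe error fixed after 2 days ...
severe_quick_fix = [(0, 10), (2, 10)]
# ... and a minor mis-feature left in production for 20 days.
minor_slow_fix = [(0, 1), (20, 1)]

print(risk_area(severe_quick_fix))
print(risk_area(minor_slow_fix))
```

With these made-up numbers both cases give the same area, which is the point of the two graphs: a tall, narrow error and a low, wide one can cost about the same in total.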
If we agree that the area is more important than severity alone, we are also in a better position to generate options that can reduce the effect of our future mistakes.
More on generating those options in the next post, ‘Patterns for reducing Risk Area’.