Friday, August 15, 2014

Sporadics don't matter

In my daily work I frequently run into a phenomenon that we call Sporadics. Sporadics are tests that fail every now and then, for a variety of reasons. Other great blog posts call them intermittent test failures (http://spin.atomicobject.com/2012/04/27/intermittent-test-failures/) or non-deterministically failing tests (http://martinfowler.com/articles/nonDeterminism.html).

This phenomenon bothers me a lot these days. It is not so much the Sporadics themselves that bother me as the attitude towards them. Every time a Sporadic occurs, I hear sentences like:

“This is just a Sporadic, you can merge into mainline anyway.”
“You can’t get rid of Sporadics, you have to deal with them statistically.”
“It’s too expensive to fix them.”
“It’s infrastructure anyway.”
“There are too many of them to tackle. Just ignore them.”

In general, people (aka developers) tend to consider them a minor issue, or a matter of fate that you can’t do anything about anyway. That leads to the impression that Sporadics simply exist but, unlike “real” test failures, tell you nothing about the quality of your product. Real test failures signal a broken feature that has to be fixed, or a changed feature whose tests fail because the expectations have changed. One can fix that. One can even write a ticket for it and resolve it in due time. Everything is nice and comparatively easy when it comes to “real” test failures.

However, over the last years I have come to the impression that the “real” test failures, which always get plenty of attention, are not the ones I should be afraid of, and that the Sporadics are the real ticking bomb. How come?

As said before, for some reason developers tend to avoid analyzing Sporadics. This has probably not always been the case but comes from experience. Often enough, the root cause of the failure turned out to be some problem with the test infrastructure or a bug in the test itself, so no real problem with the application was found. Somehow this notion has settled in: a sporadic failure means an infrastructure or test issue, which means no bug, which means ignore the red test and merge into mainline.

But what happens underneath? Is this all there is? What if the application under test itself behaves non-deterministically?

These things happen (I speak from experience). And they are what gives me the creeps. Non-deterministic behavior of the application leads to strange behavior at the customer’s site, which is very hard to analyze and thus very expensive to fix. And who’s to blame? If Sporadics are considered unimportant or even irrelevant, no one will fix them. They keep piling up. And somewhere in this pile, these little bombs will hide. No one will stumble over these issues. They just keep creeping into the mainline, little by little.

That’s why I consider Sporadics the most important test failures out there. If a Sporadic occurs, it has to be analyzed very fast. If you are lucky, it really is an infrastructure or test design issue. But go and fix it nevertheless. Keep the Sporadics out of your test suites. Otherwise the information the tests can provide you with will decrease and at some point vanish altogether.
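To get a rough feeling for how quickly that information vanishes, here is a minimal back-of-the-envelope sketch. It assumes that every test fails sporadically with the same small probability and independently of all other tests. Both the assumption and the numbers used (a rate of 0.1% per test, suites of 10, 100 and 1000 tests) are made up for illustration and are not measurements from any real project; the function name is hypothetical as well.

```python
def red_run_probability(num_tests: int, sporadic_rate: float) -> float:
    """Probability that a run is red although nothing is actually broken.

    Illustrative assumption: each of the num_tests tests fails sporadically
    with probability sporadic_rate, independently of the others.
    """
    # The run stays green only if every single test passes.
    all_green = (1.0 - sporadic_rate) ** num_tests
    return 1.0 - all_green


# Made-up example figures: a 0.1% sporadic rate per test.
for num_tests in (10, 100, 1000):
    probability = red_run_probability(num_tests, 0.001)
    print(f"{num_tests:5d} tests -> red without a real bug in {probability:.1%} of runs")
```

Under these assumptions, the share of red runs that do not point to a broken feature grows quickly with the size of the suite. The more often a red run means nothing, the less anyone will look at a red run that means something.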

In this series of blog posts I want to explore the phenomenon of Sporadics. I want to find out what they really are and what to do about them. Each Sporadic tells a story about your test environment, your test design and your application design. Each Sporadic failure serves as a signal to take action. Let’s see what one could do about them. I hope you will join me on this journey. 


The opinions expressed in this blog are my own views and not those of SAP
