[Gluster-devel] How to cope with spurious regression failures
Atin Mukherjee
amukherj at redhat.com
Tue Jan 19 14:32:42 UTC 2016
On 01/19/2016 07:08 PM, Raghavendra Talur wrote:
>
>
> On Tue, Jan 19, 2016 at 5:21 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
>
>
>
> On 01/19/2016 10:45 AM, Emmanuel Dreyfus wrote:
> > Hi
> >
> > Spurious regression failures frustrate developers. One submits a
> > change and gets completely unrelated failures. The only way out is to
> > retrigger regression until it passes, a boring and time-wasting task.
> > Sometimes, after 4 or 5 failed runs, the submitter realizes there is a
> > real issue and looks at it, which is a waste of time and resources.
> >
> > The fact that we run regression on multiple platforms makes the
> > situation worse. If you have a 10% chance of hitting a spurious failure
> > on Linux and a 20% chance of hitting a spurious failure on NetBSD
> > (random numbers chosen), you get roughly one failure in four
> > submissions (a random prediction, as I used random input numbers,
> > but you get the idea).
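
The arithmetic behind that estimate can be sketched as follows. This assumes the two platforms fail independently, which the mail does not state, and the function name is made up for illustration.

```python
def combined_failure_rate(rates):
    """Probability that at least one platform hits a spurious failure.

    Assumes the per-platform failures are independent of each other.
    """
    all_pass = 1.0
    for rate in rates:
        all_pass *= 1.0 - rate  # chance that this platform passes
    return 1.0 - all_pass


# Example figures from the mail: 10% on Linux, 20% on NetBSD.
rate = combined_failure_rate([0.10, 0.20])
print(f"{rate:.2f}")  # 0.28
```

With these inputs the combined rate is 1 - 0.9 * 0.8 = 0.28, a little more than one submission in four.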
> >
> > Two solutions are proposed:
> >
> > 1) do not run unreliable tests, as proposed by Raghavendra Talur:
> > http://review.gluster.org/13173
> >
> > I have nothing against the idea, but I voted down the change because
> > it fails to address the need for different test blacklists on
> > different platforms: we do not have the same unreliable tests on Linux
> > and NetBSD.
>
>
> Why I prefer having this solution:
> a. Allowing tests to be re-run until they pass leads to complacency
> about how tests are written.
> b. A test is bad if it is not deterministic, and running a bad test has
> *no* value. We are wasting time even if the test runs for only a few seconds.
IMHO, most of our tests are non-deterministic and that's why my vote
would be for option 2 over 1 as that reduces the probability of retriggers.
> c. I propose another method to overcome the technical difficulty of
> having blacklists for different platforms. We could have "[K[a-z]*-]*"
> as a prefix on test names, where [a-z]* could be L or N, signifying that
> the test is bad on Linux or NetBSD respectively. The run-tests.sh script
> can be made intelligent enough to determine the host OS and skip them.
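
One possible reading of the prefix idea in (c), as a rough sketch only. The exact naming (a "K", then one letter per affected platform, then "-") is an assumption based on the description above, and should_skip and the example file names are hypothetical. The real run-tests.sh is a shell script, so this shows only the decision logic.

```python
import os
import platform
import re

# Assumed convention: "KN-foo.t" is known bad on NetBSD, "KL-foo.t" on Linux,
# "KLN-foo.t" on both. This is an interpretation, not an agreed format.
KNOWN_BAD_PREFIX = re.compile(r"^K([A-Z]+)-")
PLATFORM_LETTERS = {"Linux": "L", "NetBSD": "N"}


def should_skip(test_path, system=None):
    """Return True if the test is marked as known bad on the given OS."""
    if system is None:
        system = platform.system()  # same name that `uname -s` reports
    letter = PLATFORM_LETTERS.get(system)
    match = KNOWN_BAD_PREFIX.match(os.path.basename(test_path))
    return bool(match and letter and letter in match.group(1))


print(should_skip("tests/basic/KN-example.t", "NetBSD"))  # True
print(should_skip("tests/basic/KN-example.t", "Linux"))   # False
```

A test without the prefix always runs, so the blacklist lives in the file names and no separate per-platform list is needed.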
>
>
>
> >
> > 2) add a regression option to retry a failed test once, and to
> > validate the regression if the second attempt passes, as I proposed:
> > http://review.gluster.org/13245
> >
> > The idea is basically to automatically do what every submitter has been
> > doing: retry without a thought when regression fails. The benefit of
> > this approach is also that it gives us a better view of which tests
> > failed because of the change, and which failed because they were
> > unreliable.
> >
> > The retry feature is optional and triggered by using the -r flag to
> > run-tests.sh. I intend to use it on NetBSD regression to reduce the
> > number of failures that annoy people. It could be used on Linux
> > regression too, though I do not plan to touch that on my own.
> +1 to option 2
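
The retry-once behaviour of option 2 can be sketched as below. This is not the code from the review above; run_with_retry and its runner parameter are hypothetical names, and the only detail taken from the mail is that a failed test is retried once and accepted if the second attempt passes.

```python
import subprocess


def run_with_retry(command, retry=False, runner=subprocess.call):
    """Run a test command, retrying once on failure if retry is set.

    Returns (passed, attempts). A pass on attempt 2 suggests the test is
    unreliable rather than broken by the change under review.
    """
    max_attempts = 2 if retry else 1
    for attempt in range(1, max_attempts + 1):
        if runner(command) == 0:  # exit status 0 means the test passed
            return True, attempt
    return False, max_attempts


# Simulated flaky test: fails on the first run, passes on the second.
outcomes = iter([1, 0])
print(run_with_retry(["prove", "example.t"], retry=True,
                     runner=lambda cmd: next(outcomes)))  # (True, 2)
```

Recording the attempt count is what gives the better view mentioned above: a test that fails on both attempts points at the change, while one that passes on the retry points at the test.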
> >
> > Please tell us which approach you prefer.
> >