[Gluster-devel] good job on fixing heavy hitters in spurious regressions

Pranith Kumar Karampuri pkarampu at redhat.com
Fri May 8 03:15:32 UTC 2015


hi,
        I think we fixed quite a few heavy hitters in the past week and 
a reasonable number of regression runs are passing, which is a good 
sign. Most of the new heavy hitters in regression failures seem to be 
code problems in quota/afr/ec; I am not sure about tier.t (we need to 
get more info about arbiter.t, read-subvol.t, etc.). Do you guys have 
any ideas for keeping the regression failures under control?

Here are some of the things that I can think of:
0) Maintainers should also maintain tests that are in their component.
1) If you guys see a spurious failure that has not been seen before, 
please add it to https://public.pad.fsfe.org/p/gluster-spurious-failures 
and send a mail to gluster-devel with the relevant info. CC the 
component owner.
2) If the same test fails on different patches more than 'x' times, we 
should do something drastic. Let us decide on 'x' and what the drastic 
measure is.
3) Tests that fail with too little information should at least be 
fixed by adding more info to the test or improving logs in the code, so 
that when it happens next time we have more information. The other 
option is to enable DEBUG logs. I am not a big fan of this because when 
users report problems we should also have just enough information to 
debug the problem, and users are not going to enable DEBUG logs.


Some good things I found this time around compared to the 3.6.0 release:
1) Failing the regression run on the first failure helps locate the 
failure logs really fast.
2) More people chipped in to fix tests that are not at all their 
responsibility, which is always great to see.

I think we should remove the "if it is a known bad test treat it as 
success" code before long and never add it again in the future.

Pranith

