[Gluster-infra] Progress report for regression tests in Rackspace

Vijay Bellur vbellur at redhat.com
Thu May 15 18:26:46 UTC 2014


On 05/15/2014 07:57 PM, Niels de Vos wrote:
> On Thu, May 15, 2014 at 06:05:00PM +0530, Vijay Bellur wrote:
>> On 04/30/2014 07:03 PM, Justin Clift wrote:
>>> Hi all,
>>>
>>> Was trying out the GlusterFS regression tests in Rackspace VMs last
>>> night for each of the release-3.4, release-3.5, and master branches.
>>>
>>> The regression test is just a run of "run-tests.sh" from a git
>>> checkout of the appropriate branch.
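>>>
>>> For example, something like this (just a sketch; it assumes the GitHub
>>> mirror and that the checked-out branch has already been built and
>>> installed on the VM):
>>>
>>>   $ git clone https://github.com/gluster/glusterfs.git
>>>   $ cd glusterfs
>>>   $ git checkout release-3.5    # or release-3.4 / master
>>>   $ sudo ./run-tests.sh         # needs root; runs every .t under tests/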
>>>
>>> The good news is we're adding a lot of testing code with each release:
>>>
>>>   * release-3.4 -  6303 lines  (~30 mins to run test)
>>>   * release-3.5 -  9776 lines  (~85 mins to run test)
>>>   * master      - 11660 lines  (~90 mins to run test)
>>>
>>> (lines counted using:
>>>   $ find tests -type f -iname "*.t" -exec cat {} + | wc -l)
>>>
>>> The bad news is the tests only "kind of" pass now.  I say "kind of"
>>> because although the regression run *can* pass for each of these
>>> branches, it's inconsistent. :(
>>>
>>> Results from testing overnight:
>>>
>>>   * release-3.4 - 20 runs, 17 PASS, 3 FAIL. 85% success.
>>>     * bug-857330/normal.t failed in two runs
>>>     * bug-887098-gmount-crash.t failed in one run
>>>
>>>   * release-3.5 - 20 runs, 18 PASS, 2 FAIL. 90% success.
>>>     * bug-857330/xml.t failed in one run
>>>     * bug-1004744.t failed in another run (same vm for both failures)
>>>
>>>   * master - 20 runs, 6 PASS, 14 FAIL. 30% success.
>>>     * bug-1070734.t failed in one run
>>>     * bug-1087198.t & bug-860663.t failed in one run (same vm as bug-1070734.t failure above)
>>>     * bug-1087198.t & bug-857330/normal.t failed in one run (new vm, a subsequent run on same vm passed)
>>>     * bug-1087198.t & bug-948686.t failed in one run (new vm)
>>>     * bug-1070734.t & bug-1087198.t failed in one run (new vm)
>>>     * bug-860663.t failed in one run
>>>     * bug-1023974.t & bug-1087198.t & bug-948686.t failed in one run (new vm)
>>>     * bug-1004744.t & bug-1023974.t & bug-1087198.t & bug-948686.t failed in one run (new vm)
>>>     * bug-948686.t failed in one run (new vm)
>>>     * bug-1070734.t failed in one run (new vm)
>>>     * bug-1023974.t failed in one run (new vm)
>>>     * bug-1087198.t & bug-948686.t failed in one run (new vm)
>>>     * bug-1070734.t failed in one run (new vm)
>>>     * bug-1087198.t failed in one run (new vm)
>>>
>>> The occasional failures aren't completely random; the same tests keep
>>> turning up, which suggests something systematic is going on.  Possible
>>> race conditions maybe? (no idea; see the sketch after the tally below)
>>>
>>>   * 8 failures - bug-1087198.t
>>>   * 5 failures - bug-948686.t
>>>   * 4 failures - bug-1070734.t
>>>   * 3 failures - bug-1023974.t
>>>   * 3 failures - bug-857330/normal.t
>>>   * 2 failures - bug-860663.t
>>>   * 2 failures - bug-1004744.t
>>>   * 1 failure  - bug-857330/xml.t
>>>   * 1 failure  - bug-887098-gmount-crash.t
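>>>
>>> If these are timing races, the usual fix in the .t framework is to
>>> poll instead of asserting immediately.  A minimal sketch, assuming the
>>> EXPECT / EXPECT_WITHIN helpers from tests/include.rc; "heal_count" is
>>> a hypothetical stand-in for whatever value a given test checks:
>>>
>>>   # Racy: asserts the moment the previous step returns
>>>   EXPECT "0" heal_count $V0
>>>
>>>   # Robust: retries for up to 20 seconds before declaring failure
>>>   EXPECT_WITHIN 20 "0" heal_count $V0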
>>>
>>> Anyone have suggestions on how to make this work reliably?
>>
>>
>>
>> I think it would be a good idea to arrive at a list of the test cases
>> that fail intermittently and to assign owners to address them (the
>> default owner being the submitter of the test case). In addition to
>> those, I have also seen tests like bd.t and xml.t fail pretty regularly.
>>
>> Justin - can we publish a consolidated list of regression tests that
>> fail and owners for them on an etherpad or similar?
>>
>> Fixing these test cases will enable us to bring in more Jenkins
>> instances for parallel regression runs, and will also make our
>> regression runs more deterministic. Your help in addressing these
>> regression test suite problems will be greatly appreciated!
>
> Indeed, getting the regression tests stable seems like a blocker before
> we can move to a scalable Jenkins solution. Unfortunately, it may not be
> trivial to debug these test cases... Any suggestions on capturing useful
> data that would help figure out why the test cases don't pass?
>

To start with, obtaining the logs and cores from a failed regression run 
(/d/logs/...) on build.gluster.org would be useful. Once we start 
debugging a few problems and find that more information is needed, we can 
start collecting it for every failed regression run.
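
Something along these lines could be wrapped around the test run on the 
Jenkins slaves. This is only a sketch; the /d/logs layout, the core file 
location and the glusterfs log directory are assumptions on my part:

  #!/bin/bash
  # Run the regression suite; on failure, archive logs and any cores.
  RESULT_DIR=/d/logs/$(hostname)-$(date +%Y%m%d-%H%M%S)   # assumed layout
  if ! ./run-tests.sh; then
      mkdir -p "$RESULT_DIR"
      cp -a /var/log/glusterfs "$RESULT_DIR/logs"      # glusterfs logs
      cp -a /core* "$RESULT_DIR/" 2>/dev/null          # cores, if any
      tar -C /d/logs -czf "$RESULT_DIR.tar.gz" "$(basename "$RESULT_DIR")"
  fi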

-Vijay

