[Gluster-devel] [Gluster-infra] Progress report for regression tests in Rackspace

Vijay Bellur vbellur at redhat.com
Thu May 15 18:30:54 UTC 2014


On 05/15/2014 09:08 PM, Luis Pabon wrote:
> Should we create bugs for each of these, and divide-and-conquer?

That could help. The first level of consolidation that Justin has 
already done (with the frequency of each test failure) would be a good 
list to start from. If we observe more failures in ongoing regression 
runs, let us open new bugs for those and get them cleaned up too.

-Vijay

>
> - Luis
>
> On 05/15/2014 10:27 AM, Niels de Vos wrote:
>> On Thu, May 15, 2014 at 06:05:00PM +0530, Vijay Bellur wrote:
>>> On 04/30/2014 07:03 PM, Justin Clift wrote:
>>>> Hi all,
>>>>
>>>> Was trying out the GlusterFS regression tests in Rackspace VMs last
>>>> night for each of the release-3.4, release-3.5, and master branches.
>>>>
>>>> The regression test is just a run of "run-tests.sh", from a git
>>>> checkout of the appropriate branch.
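>>>>
>>>> (A minimal sketch of what each run amounts to - assuming a local
>>>> clone of the glusterfs repo, and noting the tests generally need
>>>> root:
>>>>
>>>>   $ cd /path/to/glusterfs      # checkout of the branch under test
>>>>   $ git checkout release-3.5   # or release-3.4 / master
>>>>   $ sudo ./run-tests.sh        # runs every .t file under tests/
>>>>
>>>> nothing fancier than that.)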
>>>>
>>>> The good news is we're adding a lot of testing code with each release:
>>>>
>>>>   * release-3.4 -  6303 lines  (~30 mins to run test)
>>>>   * release-3.5 -  9776 lines  (~85 mins to run test)
>>>>   * master      - 11660 lines  (~90 mins to run test)
>>>>
>>>> (lines counted using:
>>>>   $ find tests -type f -iname "*.t" -exec cat {} >> a \; ; wc -l a; rm -f a)
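>>>>
>>>> (Same count without the temporary file, if anyone prefers:
>>>>
>>>>   $ find tests -type f -iname "*.t" -exec cat {} + | wc -l
>>>>
>>>> "-exec ... {} +" batches all the files into one cat invocation, so
>>>> no intermediate file is needed.)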
>>>>
>>>> The bad news is the tests only "kind of" pass now.  I say "kind of"
>>>> because although the regression run *can* pass for each of these
>>>> branches, it's inconsistent. :(
>>>>
>>>> Results from testing overnight:
>>>>
>>>>   * release-3.4 - 20 runs, 17 PASS, 3 FAIL. 85% success.
>>>>     * bug-857330/normal.t failed in two runs
>>>>     * bug-887098-gmount-crash.t failed in one run
>>>>
>>>>   * release-3.5 - 20 runs, 18 PASS, 2 FAIL. 90% success.
>>>>     * bug-857330/xml.t failed in one run
>>>>     * bug-1004744.t failed in another run (same vm for both failures)
>>>>
>>>>   * master - 20 runs, 6 PASS, 14 FAIL. 30% success.
>>>>     * bug-1070734.t failed in one run
>>>>     * bug-1087198.t & bug-860663.t failed in one run (same vm as
>>>> bug-1070734.t failure above)
>>>>     * bug-1087198.t & bug-857330/normal.t failed in one run (new vm,
>>>> a subsequent run on same vm passed)
>>>>     * bug-1087198.t & bug-948686.t failed in one run (new vm)
>>>>     * bug-1070734.t & bug-1087198.t failed in one run (new vm)
>>>>     * bug-860663.t failed in one run
>>>>     * bug-1023974.t & bug-1087198.t & bug-948686.t failed in one run
>>>> (new vm)
>>>>     * bug-1004744.t & bug-1023974.t & bug-1087198.t & bug-948686.t
>>>> failed in one run (new vm)
>>>>     * bug-948686.t failed in one run (new vm)
>>>>     * bug-1070734.t failed in one run (new vm)
>>>>     * bug-1023974.t failed in one run (new vm)
>>>>     * bug-1087198.t & bug-948686.t failed in one run (new vm)
>>>>     * bug-1070734.t failed in one run (new vm)
>>>>     * bug-1087198.t failed in one run (new vm)
>>>>
>>>> The occasional test failures aren't spread completely at random,
>>>> which suggests something systematic is going on.  Possible race
>>>> conditions maybe? (no idea).  Failure counts across all runs:
>>>>
>>>>   * 8 failures - bug-1087198.t
>>>>   * 5 failures - bug-948686.t
>>>>   * 4 failures - bug-1070734.t
>>>>   * 3 failures - bug-1023974.t
>>>>   * 3 failures - bug-857330/normal.t
>>>>   * 2 failures - bug-860663.t
>>>>   * 2 failures - bug-1004744.t
>>>>   * 1 failure  - bug-857330/xml.t
>>>>   * 1 failure  - bug-887098-gmount-crash.t
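>>>>
>>>> (One way to chase these might be to hammer a single test in a loop
>>>> on an otherwise idle VM and see how often it trips - a rough sketch,
>>>> test path assumed:
>>>>
>>>>   $ for i in $(seq 1 50); do
>>>>       sudo prove -f tests/bugs/bug-1087198.t || echo "run $i FAILED"
>>>>     done
>>>>
>>>> run-tests.sh drives the .t files through prove, so this should
>>>> exercise the same path, just repeatedly.)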
>>>>
>>>> Anyone have suggestions on how to make this work reliably?
>>>
>>>
>>> I think it would be a good idea to arrive at a list of the test
>>> cases that fail intermittently and assign owners to address them
>>> (default owner being the submitter of the test case). In addition to
>>> these, I have also seen tests like bd.t and xml.t fail pretty
>>> regularly.
>>>
>>> Justin - can we publish a consolidated list of regression tests that
>>> fail and owners for them on an etherpad or similar?
>>>
>>> Fixing these test cases will enable us to bring in more Jenkins
>>> instances for parallel regression runs, and will also make our
>>> regression results more deterministic. Your help in addressing the
>>> regression test suite problems will be greatly appreciated!
>> Indeed, getting the regression tests stable seems like a blocker before
>> we can move to a scalable Jenkins solution. Unfortunately, it may not be
>> trivial to debug these test cases... Any suggestions on capturing
>> useful data that would help figure out why the test cases don't pass?
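>>
>> Perhaps as a starting point we could archive the logs (and any cores)
>> whenever a run fails - a rough sketch, with the core location assumed:
>>
>>   $ sudo ./run-tests.sh || \
>>       sudo tar czf /tmp/regression-$(date +%Y%m%d-%H%M%S).tar.gz \
>>           /var/log/glusterfs /core* 2>/dev/null
>>
>> /var/log/glusterfs is the default log directory; where cores land
>> depends on how the VMs are set up.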
>>
>> Thanks,
>> Niels



