[Gluster-devel] Problems with ec/nfs.t in regression tests

Xavier Hernandez xhernandez at datalab.es
Wed Feb 11 15:30:01 UTC 2015


Thanks for the information. I'll do some tests changing the number of 
threads for epoll.
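
For reference, the check suggested below (set client.event-threads to 1,
restart the volume, then rerun the test) could be scripted roughly like
this. It is only a sketch under assumptions: the volume name "patchy" and
the helper name event_thread_commands are placeholders chosen for
illustration, and the function only builds the gluster CLI command lines
without running anything.

```python
def event_thread_commands(volume, threads=1):
    """Build the gluster CLI calls that set the client epoll thread
    count and then restart the volume. Nothing is executed here."""
    # --mode=script avoids the interactive confirmation on "volume stop".
    cli = ["gluster", "--mode=script"]
    return [
        cli + ["volume", "set", volume, "client.event-threads", str(threads)],
        cli + ["volume", "stop", volume],
        cli + ["volume", "start", volume],
    ]


# "patchy" is a placeholder volume name (assumption, not from this thread).
commands = event_thread_commands("patchy", threads=1)
for cmd in commands:
    print(" ".join(cmd))
```

To actually apply the commands, each list could be passed to
subprocess.run(cmd, check=True) on a node of the test cluster before
running ec/nfs.t.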

Xavi

On 02/11/2015 04:20 PM, Shyam wrote:
> On 02/11/2015 09:40 AM, Xavier Hernandez wrote:
>> Hi,
>>
>> it seems that there are some failures of the ec/nfs.t test in the
>> regression tests. Doing some investigation, I've found that before
>> applying the multi-threaded patch (commit 5e25569e) the problem does
>> not seem to happen.
>
> This has an interesting history of failures: on the regression runs for
> the MT epoll patch, this test (i.e. ec/nfs.t) did not fail (there were
> others, but not nfs.t).
>
> The patch that allows configuration of MT epoll is where this started
> failing, around Feb 5th (but it later passed). See the patchset 7
> failures at http://review.gluster.org/#/c/9488/
>
> I state the above as it may help narrow down the changes (maybe in EC)
> that could have caused it.
>
> Also, in the latter commit there was an error configuring the number of
> threads, so all regression runs would have run with a single epoll
> thread. The MT epoll patch (http://review.gluster.org/#/c/3842/) had
> this hard coded, so it would have run with 2 threads, yet it did not
> show the issue.
>
> Again, I state the above because it suggests this is not a
> race/bug/problem exposed by the multi-threaded nature of epoll, but of
> course it needs investigation.
>
>>
>> I'm not sure if this patch is the cause or it has revealed some bug in
>> ec or any other xlator.
>
> I guess we can reproduce this issue? If so I would try setting
> client.event-threads on master branch to 1, restarting the volume and
> then running the test (as a part of the test itself maybe) to eliminate
> the possibility that MT epoll is causing it.
>
> I doubt that MT epoll is causing it, as the runs failed on
> http://review.gluster.org/#/c/9488/ (the configuration patch), which had
> the thread count as 1 due to a bug in that code.
>
>>
>> I can try to identify it (any help will be appreciated), but it may take
>> some time. Would it be better to remove the test in the meantime ?
>
> I am checking if this is reproducible on my machine, so that I can
> possibly see what is going wrong.
>
> Shyam
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel

