[Gluster-devel] test failure reports for last 15 days

Xavi Hernandez jahernan at redhat.com
Wed Apr 10 17:25:06 UTC 2019


On Wed, Apr 10, 2019 at 4:01 PM Atin Mukherjee <amukherj at redhat.com> wrote:

> And now for last 15 days:
>
> https://fstat.gluster.org/summary?start_date=2019-03-25&end_date=2019-04-10
>
> ./tests/bitrot/bug-1373520.t     18  ==> Fixed through
> https://review.gluster.org/#/c/glusterfs/+/22481/; I don't see this
> failing in brick mux after 5th April.
> ./tests/bugs/ec/bug-1236065.t     17  ==> happens only in brick mux; needs
> analysis.
>

I've identified the problem here, but not the cause yet. There's a stale
inodelk acquired by a process that is already dead, which causes inodelk
requests from self-heal and other processes to block.

The reason it seemed to block in random places is that all commands are
executed with the working directory pointing to a Gluster directory that
needs healing after the initial tests. Because of the stale inodelk, any
application that tries to open a file in the working directory gets blocked.
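
For reference, this kind of stale lock is visible in a brick statedump:
granted inodelk entries remain even though the client that acquired them is
gone. A rough sketch of how one could spot it (the volume name "patchy" and
the default statedump location are my assumptions, not taken from the test):

  # Ask glusterd to dump the state of all bricks of the volume; the dump
  # files land in the statedump directory (typically /var/run/gluster).
  gluster volume statedump patchy

  # Granted (ACTIVE) inodelk entries in the locks xlator section show the
  # lock owner and the client connection that acquired the lock.
  grep -B 3 -A 1 'inodelk.*ACTIVE' /var/run/gluster/*.dump.*

  # Cross-check the owner of a granted entry against the clients currently
  # connected to the volume.
  gluster volume status patchy clients

If the connection that owns a granted inodelk no longer shows up in the
client list, the lock is stale.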

I'll investigate what causes this.

Xavi

> ./tests/basic/uss.t             15  ==> happens in both brick mux and non
> brick mux runs; the test simply times out. Needs urgent analysis.
> ./tests/basic/ec/ec-fix-openfd.t 13  ==> Fixed through
> https://review.gluster.org/#/c/22508/ , patch merged today.
> ./tests/basic/volfile-sanity.t      8  ==> Some race, though it succeeds
> on the second attempt every time.
>
> There are plenty more tests with 5 instances of failure each. We need
> all maintainers/owners to look through these failures and fix them; we
> certainly don't want to reach a stage where master is unstable and we
> have to lock down merges until all these failures are resolved. So
> please help.
>
> (Please note that the fstat stats count retries as failures too, which in
> a way is right.)
>
>
> On Tue, Feb 26, 2019 at 5:27 PM Atin Mukherjee <amukherj at redhat.com>
> wrote:
>
>> [1] captures the test failure report for the last 30 days, and we need
>> volunteers/component owners to see why the number of failures is so high
>> for a few tests.
>>
>> [1]
>> https://fstat.gluster.org/summary?start_date=2019-01-26&end_date=2019-02-25&job=all
>>