[Gluster-devel] [Gluster-Maintainers] Release 5: Master branch health report (Week of 30th July)
Kotresh Hiremath Ravishankar
khiremat at redhat.com
Thu Aug 2 06:27:37 UTC 2018
On Thu, Aug 2, 2018 at 11:43 AM, Xavi Hernandez <xhernandez at redhat.com>
wrote:
> On Thu, Aug 2, 2018 at 6:14 AM Atin Mukherjee <amukherj at redhat.com> wrote:
>
>>
>>
>> On Tue, Jul 31, 2018 at 10:11 PM Atin Mukherjee <amukherj at redhat.com>
>> wrote:
>>
>>> I just went through the nightly regression report of brick mux runs and
>>> here's what I can summarize.
>>>
>>> ============================================================
>>> Fails only with brick-mux
>>> ============================================================
>>> tests/bugs/core/bug-1432542-mpx-restart-crash.t - Times out even after
>>> 400 secs. Refer to
>>> https://fstat.gluster.org/failure/209?state=2&start_date=2018-06-30&end_date=2018-07-31&branch=all ,
>>> specifically the latest report
>>> https://build.gluster.org/job/regression-test-burn-in/4051/consoleText .
>>> After 12 July it was timing out less often than before, but since
>>> 27 July it has timed out twice. Beginning to believe commit
>>> 9400b6f2c8aa219a493961e0ab9770b7f12e80d2 has added the delay and now
>>> 400 secs is no longer sufficient (Mohit?)
>>>
>>> tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
>>> (Ref - https://build.gluster.org/job/regression-test-with-multiplex/814/console)
>>> - Test fails only in brick-mux mode. AI on Atin to look at it and get back.
>>>
>>> tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t (
>>> https://build.gluster.org/job/regression-test-with-multiplex/813/console)
>>> - Seems to have failed just twice in the last 30 days as per
>>> https://fstat.gluster.org/failure/251?state=2&start_date=2018-06-30&end_date=2018-07-31&branch=all .
>>> Need help from AFR team.
>>>
>>> tests/bugs/quota/bug-1293601.t (
>>> https://build.gluster.org/job/regression-test-with-multiplex/812/console)
>>> - Hasn't failed after 26 July, whereas earlier it was failing regularly.
>>> Did we fix this test through any patch (Mohit?)
>>>
>>> tests/bitrot/bug-1373520.t - (
>>> https://build.gluster.org/job/regression-test-with-multiplex/811/console)
>>> - Hasn't failed after 27 July, whereas earlier it was failing regularly.
>>> Did we fix this test through any patch (Mohit?)
>>>
>>
>> I see this has failed in the regression run from the day before yesterday
>> as well (and I could reproduce it locally with brick mux enabled). The test
>> fails because a file is not healed within the expected time period.
>>
>> *15:55:19* not ok 25 Got "0" instead of "512", LINENUM:55
>> *15:55:19* FAILED COMMAND: 512 path_size /d/backends/patchy5/FILE1
>>
>> Need EC dev's help here.
>>
>
> I'll investigate this.
>
>
>>
>>
>>> tests/bugs/glusterd/remove-brick-testcases.t - Failed once with a core;
>>> not sure whether brick mux is the culprit here. Ref -
>>> https://build.gluster.org/job/regression-test-with-multiplex/806/console .
>>> Seems to be a glustershd crash. Need help from AFR folks.
>>>
>>> ============================================================
>>> Fails for non-brick mux case too
>>> ============================================================
>>> tests/bugs/distribute/bug-1122443.t - Seems to be failing on my setup
>>> very often, without brick mux as well. Refer to
>>> https://build.gluster.org/job/regression-test-burn-in/4050/consoleText
>>> . There's an email in gluster-devel and a BZ 1610240 for the same.
>>>
>>> tests/bugs/bug-1368312.t - Seems to be a new failure (
>>> https://build.gluster.org/job/regression-test-with-multiplex/815/console);
>>> however, I have seen this in a non-brick-mux run too -
>>> https://build.gluster.org/job/regression-test-burn-in/4039/consoleText
>>> . Need some eyes from AFR folks.
>>>
>>> tests/00-geo-rep/georep-basic-dr-tarssh.t - this isn't specific to
>>> brick mux; I have seen this failing in multiple default regression runs.
>>> Refer to
>>> https://fstat.gluster.org/failure/392?state=2&start_date=2018-06-30&end_date=2018-07-31&branch=all .
>>> We need help from geo-rep dev to root cause this sooner rather than later.
>>>
>>> tests/00-geo-rep/georep-basic-dr-rsync.t - this isn't specific to brick
>>> mux; I have seen this failing in multiple default regression runs. Refer to
>>> https://fstat.gluster.org/failure/393?state=2&start_date=2018-06-30&end_date=2018-07-31&branch=all .
>>> We need help from geo-rep dev to root cause this sooner rather than later.
>>>
>>
I have posted patch [1] for the above two tests. This should handle the
connection timeouts without any logs. But I still see a strange behaviour now
and then where one of the workers doesn't get started at all. I am debugging
that with instrumentation patch [2], though I am not hitting it frequently.
I will continue to work on that, but patch [1] should reduce the failures
considerably.
[1] https://review.gluster.org/#/c/20601/
[2] https://review.gluster.org/#/c/20477/
>>> tests/bugs/glusterd/validating-server-quorum.t (
>>> https://build.gluster.org/job/regression-test-with-multiplex/810/console)
>>> - Fails for non-brick-mux cases too,
>>> https://fstat.gluster.org/failure/580?state=2&start_date=2018-06-30&end_date=2018-07-31&branch=all
>>> . Atin has a patch https://review.gluster.org/20584 which resolves it,
>>> but the patch is failing regression on a different, unrelated test.
>>>
>>> tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
>>> (Ref - https://build.gluster.org/job/regression-test-with-multiplex/809/console)
>>> - fails for non-brick-mux case too -
>>> https://build.gluster.org/job/regression-test-burn-in/4049/consoleText
>>> - Need some eyes from AFR folks.
>>>
>>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
--
Thanks and Regards,
Kotresh H R