[Gluster-devel] [Gluster-Maintainers] Release 5: Master branch health report (Week of 30th July)

Nithya Balachandran nbalacha at redhat.com
Mon Aug 6 12:51:17 UTC 2018


On 6 August 2018 at 18:03, Nithya Balachandran <nbalacha at redhat.com> wrote:

>
>
> On 2 August 2018 at 05:46, Shyam Ranganathan <srangana at redhat.com> wrote:
>
>> Below is a summary of failures over the last 7 days on the nightly
>> health check jobs. This is one test per line, sorted in descending order
>> of occurrence (IOW, most frequent failure is on top).
>>
>> The list includes spurious failures as well, IOW tests that passed on a
>> retry. This is because if we do not weed out the spurious errors,
>> failures may persist and make it difficult to gauge the health of the
>> branch.
>>
>> The numbers at the end of each test line are the Jenkins job numbers where
>> the test failed. The job numbers run as follows:
>> - https://build.gluster.org/job/regression-test-burn-in/ ID: 4048 - 4053
>> - https://build.gluster.org/job/line-coverage/ ID: 392 - 407
>> - https://build.gluster.org/job/regression-test-with-multiplex/ ID: 811 - 817
>>
>> So to get to job 4051 (say), use the link
>> https://build.gluster.org/job/regression-test-burn-in/4051
>>
>> Atin has called out some folks for attention to some tests. Consider
>> this a call out to everyone else: if you see a test against your
>> component, help with root-causing and fixing it is needed.
>>
>> tests/bugs/core/bug-1432542-mpx-restart-crash.t, 4049, 4051, 4052, 405,
>> 404, 403, 396, 392
>>
>> tests/00-geo-rep/georep-basic-dr-tarssh.t, 811, 814, 817, 4050, 4053
>>
>> tests/bugs/bug-1368312.t, 815, 816, 811, 813, 403
>>
>> tests/bugs/distribute/bug-1122443.t, 4050, 407, 403, 815, 816
>>
>> tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t,
>> 814, 816, 817, 812, 815
>>
>> tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t,
>> 4049, 812, 814, 405, 392
>>
>> tests/bitrot/bug-1373520.t, 811, 816, 817, 813
>>
>> tests/bugs/ec/bug-1236065.t, 812, 813, 815
>>
>> tests/00-geo-rep/georep-basic-dr-rsync.t, 813, 4046
>>
>> tests/basic/ec/ec-1468261.t, 817, 812
>>
>> tests/bugs/glusterd/quorum-validation.t, 4049, 407
>>
>> tests/bugs/quota/bug-1293601.t, 811, 812
>>
>> tests/basic/afr/add-brick-self-heal.t, 407
>>
>> tests/basic/afr/granular-esh/replace-brick.t, 392
>>
>> tests/bugs/core/multiplex-limit-issue-151.t, 405
>>
>> tests/bugs/distribute/bug-1042725.t, 405
>>
>> tests/bugs/distribute/bug-1117851.t, 405
>>
>
> From the non-lcov vs lcov runs:
>
> Non-lcov:
>
> [nbalacha at myserver glusterfs]$ grep TEST mnt-glusterfs-0.log
> [2018-07-31 16:30:36.930726]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 72 create_files /mnt/glusterfs/0 ++++++++++
> [2018-07-31 16:31:47.649022]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 75 glusterfs --entry-timeout=0 --attribute-timeout=0 -s
> builder104.cloud.gluster.org --volfile-id patchy /mnt/glusterfs/1
> ++++++++++
> [2018-07-31 16:31:47.746734]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 77 move_files /mnt/glusterfs/0 ++++++++++
> [2018-07-31 16:31:47.783606]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 78 move_files /mnt/glusterfs/1 ++++++++++
> [2018-07-31 16:31:47.842878]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 85 done cat /mnt/glusterfs/0/status_0 ++++++++++
> [2018-07-31 16:33:14.849807]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 86 done cat /mnt/glusterfs/1/status_1 ++++++++++
> [2018-07-31 16:33:14.872184]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 88 Y force_umount /mnt/glusterfs/0 ++++++++++
> [2018-07-31 16:33:14.900334]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 89 Y force_umount /mnt/glusterfs/1 ++++++++++
> [2018-07-31 16:33:14.929238]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 90 glusterfs --entry-timeout=0 --attribute-timeout=0 -s
> builder104.cloud.gluster.org --volfile-id patchy /mnt/glusterfs/0
> ++++++++++
> [2018-07-31 16:33:15.027094]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 91 check_files /mnt/glusterfs/0 ++++++++++
> [2018-07-31 16:33:20.268030]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 93 gluster --mode=script --wignore volume stop patchy ++++++++++
> [2018-07-31 16:33:22.392247]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 94 Stopped volinfo_field patchy Status ++++++++++
> [2018-07-31 16:33:22.492175]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 96 gluster --mode=script --wignore volume delete patchy ++++++++++
> [2018-07-31 16:33:25.475566]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 97 ! gluster --mode=script --wignore volume info patchy ++++++++++
>
>
> Total time for the tests: *169* seconds
>
>
> Lcov:
>
> [nbalacha at myserver glusterfs]$ grep TEST mnt-glusterfs-0.log
> [2018-08-06 08:33:05.737012]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 72 create_files /mnt/glusterfs/0 ++++++++++
> [2018-08-06 08:34:29.133045]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 75 glusterfs --entry-timeout=0 --attribute-timeout=0 -s
> builder100.cloud.gluster.org --volfile-id patchy /mnt/glusterfs/1
> ++++++++++
> [2018-08-06 08:34:29.257888]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 77 move_files /mnt/glusterfs/0 ++++++++++
> [2018-08-06 08:34:29.306725]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 78 move_files /mnt/glusterfs/1 ++++++++++
> [2018-08-06 08:34:29.372790]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 85 done cat /mnt/glusterfs/0/status_0 ++++++++++
> [2018-08-06 08:36:12.934406]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 86 done cat /mnt/glusterfs/1/status_1 ++++++++++
> [2018-08-06 08:36:12.970541]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 88 Y force_umount /mnt/glusterfs/0 ++++++++++
> [2018-08-06 08:36:13.009152]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 89 Y force_umount /mnt/glusterfs/1 ++++++++++
> [2018-08-06 08:36:13.046645]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 90 glusterfs --entry-timeout=0 --attribute-timeout=0 -s
> builder100.cloud.gluster.org --volfile-id patchy /mnt/glusterfs/0
> ++++++++++
> [2018-08-06 08:36:13.163337]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 91 check_files /mnt/glusterfs/0 ++++++++++
> [2018-08-06 08:36:19.175564]:++++++++++ G_LOG:./tests/bugs/distribute/bug-1117851.t:
> TEST: 93 gluster --mode=script --wignore volume stop patchy ++++++++++
>
> Total time: *194* seconds
>
>
> We need to increase the timeouts for the line cov tests.
>


Checked a few more regression runs for this test:

./tests/bugs/distribute/bug-1117851.t  -  179 seconds
./tests/bugs/distribute/bug-1117851.t  -  182 seconds
./tests/bugs/distribute/bug-1117851.t  -  178 seconds


Looks like the lcov overhead pushed it over the 200s limit.
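
For anyone who wants to repeat this comparison on other runs, here is a rough
Python sketch that takes the G_LOG marker lines from a mount log and reports
the time between the first and last marker, plus the headroom left against
the 200s limit. The names elapsed_seconds, headroom and TIMEOUT_SECONDS are
made up for this illustration, the 200 is simply the limit mentioned in this
thread (not read from the test framework), and the sample lines are the first
and last markers from the two logs quoted above, re-joined onto one line each.

```python
import re
from datetime import datetime

# Timestamp at the start of a G_LOG marker line, for example:
# [2018-07-31 16:30:36.930726]:++++++++++ G_LOG:./tests/...: TEST: 72 ...
STAMP = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{1,6})\]:\++ G_LOG:"
)

# Assumption: the 200s limit mentioned in this thread.
TIMEOUT_SECONDS = 200


def elapsed_seconds(log_lines):
    """Seconds between the first and last G_LOG marker in log_lines."""
    stamps = [
        datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
        for m in map(STAMP.search, log_lines)
        if m
    ]
    if len(stamps) < 2:
        raise ValueError("need at least two G_LOG markers")
    return (stamps[-1] - stamps[0]).total_seconds()


def headroom(log_lines, timeout=TIMEOUT_SECONDS):
    """Seconds left before the timeout, measured at the last marker."""
    return timeout - elapsed_seconds(log_lines)


# First and last markers from the non-lcov log quoted above.
NON_LCOV = [
    "[2018-07-31 16:30:36.930726]:++++++++++ "
    "G_LOG:./tests/bugs/distribute/bug-1117851.t: "
    "TEST: 72 create_files /mnt/glusterfs/0 ++++++++++",
    "[2018-07-31 16:33:25.475566]:++++++++++ "
    "G_LOG:./tests/bugs/distribute/bug-1117851.t: "
    "TEST: 97 ! gluster --mode=script --wignore volume info patchy ++++++++++",
]

# First and last markers from the lcov log quoted above (it stops at TEST 93).
LCOV = [
    "[2018-08-06 08:33:05.737012]:++++++++++ "
    "G_LOG:./tests/bugs/distribute/bug-1117851.t: "
    "TEST: 72 create_files /mnt/glusterfs/0 ++++++++++",
    "[2018-08-06 08:36:19.175564]:++++++++++ "
    "G_LOG:./tests/bugs/distribute/bug-1117851.t: "
    "TEST: 93 gluster --mode=script --wignore volume stop patchy ++++++++++",
]

if __name__ == "__main__":
    for name, lines in (("non-lcov", NON_LCOV), ("lcov", LCOV)):
        print(
            f"{name}: {elapsed_seconds(lines):.1f}s elapsed, "
            f"{headroom(lines):.1f}s of headroom"
        )
```

To check a whole log instead of the two sample pairs, pass the output of
"grep TEST mnt-glusterfs-0.log" (one marker per line) to elapsed_seconds().
Note that this measures from the first marker only, so any setup time before
TEST 72 is not counted and the real headroom is smaller than reported.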



>
> regards,
> Nithya
>
>>
>> tests/bugs/glusterd/rebalance-operations-in-single-node.t, 405
>>
>> tests/bugs/index/bug-1559004-EMLINK-handling.t, 405
>>
>> tests/bugs/replicate/bug-1386188-sbrain-fav-child.t, 4048
>>
>> tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t, 813
>>
>>
>>
>> Thanks,
>> Shyam
>>
>>
>> On 07/30/2018 03:21 PM, Shyam Ranganathan wrote:
>> > On 07/24/2018 03:12 PM, Shyam Ranganathan wrote:
>> >> 1) master branch health checks (weekly, till branching)
>> >>   - Expect a status update every Monday on the various test runs
>> >
>> > See https://build.gluster.org/job/nightly-master/ for a report on
>> > various nightly and periodic jobs on master.
>> >
>> > RED:
>> > 1. Nightly regression (3/6 failed)
>> > - Tests that reported failure:
>> > ./tests/00-geo-rep/georep-basic-dr-rsync.t
>> > ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
>> > ./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
>> > ./tests/bugs/distribute/bug-1122443.t
>> >
>> > - Tests that needed a retry:
>> > ./tests/00-geo-rep/georep-basic-dr-tarssh.t
>> > ./tests/bugs/glusterd/quorum-validation.t
>> >
>> > 2. Regression with multiplex (cores and test failures)
>> >
>> > 3. line-coverage (cores and test failures)
>> > - Tests that failed:
>> > ./tests/bugs/core/bug-1432542-mpx-restart-crash.t (patch
>> > https://review.gluster.org/20568 does not fix the timeout entirely, as
>> > can be seen in this run,
>> > https://build.gluster.org/job/line-coverage/401/consoleFull )
>> >
>> > Calling out to contributors to take a look at various failures, and post
>> > the same as bugs AND to the lists (so that duplication is avoided) to
>> > get this to a GREEN status.
>> >
>> > GREEN:
>> > 1. cpp-check
>> > 2. RPM builds
>> >
>> > IGNORE (for now):
>> > 1. clang scan (@nigel, this job requires clang warnings to be fixed to
>> > go green, right?)
>> >
>> > Shyam
>> > _______________________________________________
>> > Gluster-devel mailing list
>> > Gluster-devel at gluster.org
>> > https://lists.gluster.org/mailman/listinfo/gluster-devel
>> >
>> _______________________________________________
>> maintainers mailing list
>> maintainers at gluster.org
>> https://lists.gluster.org/mailman/listinfo/maintainers
>>
>
>