<div dir="ltr">I think we should look for the root cause of these failures. If we mark the tests as Bad, they might get left behind. If someone is ready to own the tests and keep track of the ongoing efforts to root-cause them, it makes sense to mark them as Bad.<div><br></div><div>One more thought: let's discuss and fix a deadline in an upcoming community meeting. In that meeting, let's take ownership of the failures and fix them by the deadline. (If everyone agrees!)</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Aug 17, 2020 at 12:58 PM Deepshikha Khandelwal <<a href="mailto:dkhandel@redhat.com">dkhandel@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><br></div><div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Aug 15, 2020 at 7:33 PM Amar Tumballi <<a href="mailto:amar@kadalu.io" target="_blank">amar@kadalu.io</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">If I look at the recent regression runs (<a href="https://build.gluster.org/job/centos7-regression/" target="_blank">https://build.gluster.org/job/centos7-regression/</a>), more than 50% of the test runs are failing.<div><br></div><div>At least 90% of the failures are not due to the patch itself. Considering that regression tests are critical for our patches to get merged, and that a run now takes almost 6-7 hours to complete, how can we make sure we pass regression with 100% certainty?</div><div><br></div><div>Again, out of these, only a few tests keep failing. Should we revisit those tests and see why they are failing?
Or should we mark them with a 'Good if it passes, but don't fail regression if the tests fail' condition?</div><div><br></div></div></blockquote><div>I think we should revisit these tests for the root cause.</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div></div><div>Some tests I have listed here from recent failures:</div><div><br></div><div><span style="white-space:pre-wrap">tests/bugs/core/multiplex-limit-issue-151.t</span><br></div><div><span style="white-space:pre-wrap">tests/bugs/distribute/bug-1122443.t +++</span><br></div><div><span style="white-space:pre-wrap">tests/bugs/distribute/bug-1117851.t</span><br></div><div><span style="white-space:pre-wrap">tests/bugs/glusterd/bug-857330/normal.t +</span><br></div><div><span style="white-space:pre-wrap">tests/basic/mount-nfs-auth.t +++++</span></div></div></blockquote><div>It failed mainly on builder202. I have disconnected the builder and will check what is going wrong.
Though I don't have a foolproof analysis for this one, as it has always been flaky (failing quite randomly).</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><br></div><div><span style="white-space:pre-wrap">tests/basic/changelog/changelog-snapshot.t</span><br></div><div><span style="white-space:pre-wrap">tests/basic/afr/split-brain-favorite-child-policy.t</span><br></div><div><div><span style="white-space:pre-wrap">tests/basic/distribute/rebal-all-nodes-migrate.t</span><br></div><div></div></div><div><span style="white-space:pre-wrap">tests/bugs/glusterd/quorum-value-check.t</span><br></div><div><span style="white-space:pre-wrap">tests/features/lock-migration/lkmigration-set-option.t</span><br></div><div><span style="white-space:pre-wrap">tests/bugs/nfs/bug-1116503.t</span><br></div><div><span style="white-space:pre-wrap">tests/basic/ec/ec-quorum-count-partial-failure.t</span><br></div><div><br></div><div>Considering these are just 12 of the 750+ tests we run, should we even consider marking them bad till they are fixed to be 100% consistent?</div></div></blockquote><div>Makes sense.
</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><br></div><div>Any thoughts on how we should go ahead?</div><div><br></div><div>Regards,</div><div>Amar</div><div><br></div><div>(+) indicates a count, so the more '+' you see against a file, the more times it failed.</div><div><br></div></div>
_______________________________________________<br>
maintainers mailing list<br>
<a href="mailto:maintainers@gluster.org" target="_blank">maintainers@gluster.org</a><br>
<a href="https://lists.gluster.org/mailman/listinfo/maintainers" rel="noreferrer" target="_blank">https://lists.gluster.org/mailman/listinfo/maintainers</a><br>
</blockquote></div></div>
</div>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div>Thanks,<br></div>Sanju<br></div></div>