<div dir="ltr"><br><br><div class="gmail_quote"><div dir="ltr">On Wed, Oct 3, 2018 at 7:02 PM Shyam Ranganathan &lt;<a href="mailto:srangana@redhat.com">srangana@redhat.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 10/03/2018 05:36 AM, Pranith Kumar Karampuri wrote:<br>

&gt; <br>

&gt; <br>

&gt; On Thu, Sep 27, 2018 at 8:18 PM Shyam Ranganathan &lt;<a href="mailto:srangana@redhat.com" target="_blank">srangana@redhat.com</a><br>

&gt; &lt;mailto:<a href="mailto:srangana@redhat.com" target="_blank">srangana@redhat.com</a>&gt;&gt; wrote:<br>

&gt; <br>

&gt;     On 09/27/2018 10:05 AM, Atin Mukherjee wrote:<br>

&gt;     &gt;         Now does this mean we block commit rights for component Y till<br>

&gt;     &gt;         we have the root cause?<br>

&gt;     &gt;<br>

&gt;     &gt;<br>

&gt;     &gt;     It was a way of making it someone&#39;s priority. If you have another<br>

&gt;     &gt;     way to make it someone&#39;s priority that is better than this, please<br>

&gt;     &gt;     suggest and we can have a discussion around it and agree on it<br>

&gt;     :-).<br>

&gt;     &gt;<br>

&gt;     &gt;<br>

&gt;     &gt; This is what I can think of:<br>

&gt;     &gt;<br>

&gt;     &gt; 1. Component peers/maintainers take a first triage of the test<br>

&gt;     failure.<br>

&gt;     &gt; Do the initial debugging and (a) point to the component which needs<br>

&gt;     &gt; further debugging or (b) seek help on the gluster-devel ML for<br>

&gt;     &gt; additional insight for identifying the problem and narrowing down to a<br>

&gt;     &gt; component. <br>

&gt;     &gt; 2. If it’s (1 a) then we already know the component and the owner. If<br>

&gt;     &gt; it’s (1 b) at this juncture, it’s all maintainers’ responsibility to<br>

&gt;     &gt; ensure the email is well understood and, based on the available details,<br>

&gt;     &gt; the ownership is picked up by the respective maintainers. Multiple<br>

&gt;     &gt; maintainers might also need to be involved, which is why I see this<br>

&gt;     &gt; as a group effort rather than an individual one.<br>

&gt; <br>

&gt;     In my thinking, acting as a group here is better than making it a<br>

&gt;     sub-group&#39;s or individual&#39;s responsibility, which Atin has put forth<br>

&gt;     well (IMO). Thus, withholding merge rights from all (of course some<br>

&gt;     still need to have them) and getting the situation addressed is better.<br>

&gt; <br>

&gt; <br>

&gt; In my experience, it has been rather difficult for developers without<br>

&gt; domain expertise to solve the problem (at least on the components I am<br>

&gt; maintaining), so the reality is that not everyone may be able to solve<br>

&gt; the issues on all the components where the problem is observed. Maybe<br>

&gt; you mean we need more participation when you say we need to act as a<br>

&gt; group. With that assumption, one way to make that happen is to change<br>

&gt; the workflow around &#39;recheck centos&#39;. In my thinking, following the tools<br>

&gt; shouldn&#39;t lead to less participation on gluster-devel, where developers<br>

&gt; can just do &#39;recheck centos&#39; until the test passes and be done. So maybe<br>

&gt; tooling should encourage participation. Maybe something like &#39;recheck<br>

&gt; centos &lt;link-to-mail-where-they-reported-it-on-gluster-devel&gt;&#39;. This is<br>

&gt; just an idea, thoughts are welcome.<br>

<br>

I agree: any recheck should come with a stated reason, saying why the<br>

recheck is being attempted and which failures were seen, whether they are<br>

deemed spurious or otherwise require a recheck.<br>

<br>

A way to enforce this is not in place yet, and is possibly a discussion<br>

orthogonal to the one here.<br>

<br>

Recheck stringency (and I would add that even the retry of a test when it<br>

fails once should be removed) will help make breakage in nightly less<br>

frequent, as more effort is put into correcting the tests or<br>

fixing the code around them.<br>

<br>

Once we have distributed tests running, such that overall regression<br>

time is reduced, we can possibly tackle removing retries for tests, and<br>

then getting to a more stringent recheck process/tooling. The reason<br>

being, we now run to completion and that takes quite a bit of time, so<br>

at this juncture removing retry is not practical, but we should get<br>

there (soon?).<br></blockquote><div><br></div>I agree with you about removing retry. I didn&#39;t understand why nudging developers on recheck has to be postponed until distributed regression tests come into the picture. My thinking is that it is more important to have it when tests take longer.<br></div><div class="gmail_quote"> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

&gt;  <br>

&gt; <br>

&gt;     _______________________________________________<br>

&gt;     maintainers mailing list<br>

&gt;     <a href="mailto:maintainers@gluster.org" target="_blank">maintainers@gluster.org</a> &lt;mailto:<a href="mailto:maintainers@gluster.org" target="_blank">maintainers@gluster.org</a>&gt;<br>

&gt;     <a href="https://lists.gluster.org/mailman/listinfo/maintainers" rel="noreferrer" target="_blank">https://lists.gluster.org/mailman/listinfo/maintainers</a><br>

&gt; <br>

&gt; <br>

&gt; <br>

&gt; -- <br>

&gt; Pranith<br>

</blockquote></div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">Pranith<br></div></div></div>