[Gluster-Maintainers] Lock down period merge process

Wed Aug 22 12:24:03 UTC 2018

On 08/18/2018 12:45 AM, Pranith Kumar Karampuri wrote:
> 
> 
> On Tue, Aug 14, 2018 at 5:29 PM Shyam Ranganathan <srangana at redhat.com
> <mailto:srangana at redhat.com>> wrote:
> 
>     On 08/09/2018 01:24 AM, Pranith Kumar Karampuri wrote:
>     >
>     >
>     > On Thu, Aug 9, 2018 at 1:25 AM Shyam Ranganathan
>     <srangana at redhat.com <mailto:srangana at redhat.com>
>     > <mailto:srangana at redhat.com <mailto:srangana at redhat.com>>> wrote:
>     >
>     >     Maintainers,
>     >
>     >     The following thread talks about a merge during a merge
>     lockdown, with
>     >     differing view points (this mail is not to discuss the view
>     points).
>     >
>     >     The root of the problem is that we leave the current process
>     to good
>     >     faith. If we have a simple rule that we will not merge
>     anything during a
>     >     lock down period, this confusion and any future repetitions of
>     the same
>     >     would not occur.
>     >
>     >     I propose that we follow the simpler rule, and would like to hear
>     >     thoughts around this.
>     >
>     >     This also means that in the future, we may not need to remove
>     commit
>     >     access for other maintainers, as we do *not* follow a good
>     faith policy,
>     >     and instead depend on being able to revert and announce on the
>     threads
>     >     why we do so.
>     >
>     >
>     > I think it is a good opportunity to establish guidelines and
>     process so
>     > that we don't end up in this state in future where one needs to lock
>     > down the branch to make it stable. From that p.o.v. discussion on this
>     > thread about establishing a process for lock down probably doesn't add
>     > much value. My personal opinion for this instance at least is that
>     it is
>     > good that it was locked down. I tend to forget things and not
>     having the
>     > access to commit helped follow the process automatically :-).
> 
>     The intention is that master and release branches are always maintained
>     in good working order. This involves, tests and related checks passing
>     *always*.
> 
>     When this situation is breached, correcting it immediately is better
>     than letting it build up, as that would entail longer times and more
>     people to fix things up.
> 
>     In an ideal world, if nightly runs fail, the next thing done would be to
>     examine patches that were added between the 2 runs, and see if they are
>     the cause for failure, and back them out.
> 
>     Hence calling to backout patches is something that would happen more
>     regularly in the future if things are breaking.
> 
> 
> I'm with you till here.
>  
> 
> 
>     Lock down may happen if 2 consecutive nightly builds fail, so as to
>     rectify the situation ASAP, and then move onto other work.
> 
>     In short, what I wanted to say is that preventing lock downs in the
>     future, is not a state we aspire for. 
> 
> 
> What are the problems you foresee in aspiring for preventing lock downs
> for everyone?

Any project will have test failures, when things are put together and
exercised in different environments.

For us, these environments, at present, are nightly regression on
centOS7, Brick-mux enabled regression, lcov, clang and in the future we
hope to increase this to ASAN, performance baselines, memory leak
checks, etc.

The whole intent of adding such tests and checks to the test pipeline,
is to ensure that regressions to the good state, are caught early and
regularly.

When these tests and checks actually show up errors/issues, is when we
need to pay attention to the same and focus first on getting us back on
track.

At the above juncture, is when we need the lockdown or commit blackout
(or whatever we want to call it). The intent is to ensure that we do not
add more patches and further destabilize the branch, but stabilize it
first and then get other changes in later.

There is a certain probability with which the above event will happen,
and that can be reduced, if we are more stringent in our development
practices, and ensuring code health is maintained (both by more checks
in smoke and more tests per patch or even otherwise, and also by keener
reviews and other such means).

We also need to be proactive to, monitoring and fixing, failures in the
tests, so that we can address them quickly, rather than someone calling
out a lockdown.

Now, is your question that we should avoid the above state altogether?
Or, how to retain commit access for all, but still have such states as
above, where only patches that help stabilization are merged?

For the former, I do not have an answer, we can reduce the event
probability as stated above, but it will never disappear (which means
all those tests are pretty much of no value and needs to improve).

For the latter, it circles back to what I stated in earlier responses,
on good faith versus process, and we would like to stick to a process
and keep the commit list short, rather than use good faith to avoid any
inadvertent "I missed the mail" cases.

>  
> 
>     Lock downs may/will happen, it is
>     done to get the branches stable quicker, than spend long times trying to
>     find what caused the instability in the first place.
> 
> 
> 
>  
> 
> 
>     >  
>     >
>     >
>     >     Please note, if there are extraneous situations (say reported
>     security
>     >     vulnerabilities that need fixes ASAP) we may need to loosen up the
>     >     stringency, as that would take precedence over the lock down.
>     These
>     >     exceptions though, can be called out and hence treated as such.
>     >
>     >     Thoughts?
>     >
>     >
>     > This is again my personal opinion. We don't need to merge patches
>     in any
>     > branch unless we need to make an immediate release with that
>     patch. For
>     > example if there is a security issue reported we *must* make a release
>     > with the fix immediately so it makes sense to merge it and do the
>     release.
> 
>     Agree, keeps the rule simple during lock down and not open to
>     interpretations.
> 
>     >  
>     >
>     >
>     >     Shyam
>     >
>     >     PS: Added Yaniv to the CC as he reported the deviance
>     >
>     >     -------- Forwarded Message --------
>     >     Subject:        Re: [Gluster-devel] Release 5: Master branch
>     health
>     >     report
>     >     (Week of 30th July)
>     >     Date:   Tue, 7 Aug 2018 23:22:09 +0300
>     >     From:   Yaniv Kaul <ykaul at redhat.com <mailto:ykaul at redhat.com>
>     <mailto:ykaul at redhat.com <mailto:ykaul at redhat.com>>>
>     >     To:     Shyam Ranganathan <srangana at redhat.com
>     <mailto:srangana at redhat.com>
>     >     <mailto:srangana at redhat.com <mailto:srangana at redhat.com>>>
>     >     CC:     GlusterFS Maintainers <maintainers at gluster.org
>     <mailto:maintainers at gluster.org>
>     >     <mailto:maintainers at gluster.org
>     <mailto:maintainers at gluster.org>>>, Gluster Devel
>     >     <gluster-devel at gluster.org <mailto:gluster-devel at gluster.org>
>     <mailto:gluster-devel at gluster.org <mailto:gluster-devel at gluster.org>>>,
>     >     Nigel Babu <nigelb at redhat.com <mailto:nigelb at redhat.com>
>     <mailto:nigelb at redhat.com <mailto:nigelb at redhat.com>>>
>     >
>     >
>     >
>     >
>     >
>     >     On Tue, Aug 7, 2018, 10:46 PM Shyam Ranganathan
>     <srangana at redhat.com <mailto:srangana at redhat.com>
>     >     <mailto:srangana at redhat.com <mailto:srangana at redhat.com>>
>     >     <mailto:srangana at redhat.com <mailto:srangana at redhat.com>
>     <mailto:srangana at redhat.com <mailto:srangana at redhat.com>>>> wrote:
>     >
>     >         On 08/07/2018 02:58 PM, Yaniv Kaul wrote:
>     >         >     The intention is to stabilize master and not add
>     more patches
>     >         that my
>     >         >     destabilize it.
>     >         >
>     >         >
>     >         > https://review.gluster.org/#/c/20603/ has been merged.
>     >         > As far as I can see, it has nothing to do with
>     stabilization and
>     >         should
>     >         > be reverted.
>     >
>     >         Posted this on the gerrit review as well:
>     >
>     >         <snip>
>     >         4.1 does not have nightly tests, those run on master only.
>     >
>     >
>     >     That should change of course. We cannot strive for stability
>     otherwise,
>     >     AFAIK. 
>     >
>     >
>     >         Stability of master does not (will not), in the near term
>     guarantee
>     >         stability of release branches, unless patches that impact code
>     >     already
>     >         on release branches, get fixes on master and are back ported.
>     >
>     >         Release branches get fixes back ported (as is normal),
>     this fix
>     >     and its
>     >         merge should not impact current master stability in any
>     way, and
>     >     neither
>     >         stability of 4.1 branch.
>     >         </snip>
>     >
>     >         The current hold is on master, not on release branches. I
>     agree that
>     >         merging further code changes on release branches (for example
>     >     geo-rep
>     >         issues that are backported (see [1]), as there are tests
>     that fail
>     >         regularly on master), may further destabilize the release
>     >     branch. This
>     >         patch is not one of those.
>     >
>     >
>     >     Two issues I have with the merge:
>     >     1. It just makes comparing master branch to release branch
>     harder. For
>     >     example, to understand if there's a test that fails on master but
>     >     succeeds on release branch, or vice versa. 
>     >     2. It means we are not focused on stabilizing master branch.
>     >     Y.
>     >
>     >
>     >         Merging patches on release branches are allowed by release
>     >     owners only,
>     >         and usual practice is keeping the backlog low (merging weekly)
>     >     in these
>     >         cases as per the dashboard [1].
>     >
>     >         Allowing for the above 2 reasons this patch was found,
>     >         - Not on master
>     >         - Not stabilizing or destabilizing the release branch
>     >         and hence was merged.
>     >
>     >         If maintainers disagree I can revert the same.
>     >
>     >         Shyam
>     >
>     >         [1] Release 4.1 dashboard:
>     >
>     >   
>      https://review.gluster.org/#/projects/glusterfs,dashboards/dashboard:4-1-dashboard
>     >
>     >     _______________________________________________
>     >     maintainers mailing list
>     >     maintainers at gluster.org <mailto:maintainers at gluster.org>
>     <mailto:maintainers at gluster.org <mailto:maintainers at gluster.org>>
>     >     https://lists.gluster.org/mailman/listinfo/maintainers
>     >
>     >
>     >
>     > --
>     > Pranith
> 
> 
> 
> -- 
> Pranith