[Gluster-Maintainers] Lock down period merge process

Thu Sep 27 12:46:38 UTC 2018

On Thu, Sep 27, 2018 at 2:04 PM Nigel Babu <nigelb at redhat.com> wrote:

>> We know maintainers of the components which are leading to repeated
>> failures in that component and we just need to do the same thing we did to
>> remove commit access for the maintainer of the component instead of all of
>> the people. So in that sense it is not good faith and can be enforced.
>>
>
> Pranith, I believe the difference of opinion is because you're looking at
> this problem in terms of "who" rather than "what". We do not care about
> *who* broke master. Removing commit access from a component owner doesn't
> stop someone else from landing a patch that will create a failure in the same
> component or even a different component.
>

It is a long mail, so here is the TL;DR:
1) The problem with the current approach is that it doesn't really change
the behavior which led to the master lock down in the first place.
2) Forget about regular contributors; it doesn't really teach new
contributors the right behaviors to make sure master is in good shape. (If
my memory serves me right) I recently saw Yaniv, who is a new member of the
community, do a "recheck centos" on one of his patches.

Let me explain the problem I have been facing on the components I maintain.
I know that there have been spurious failures in the components I maintain,
and both I and the others on the team had other priorities which needed to
be addressed before picking these up, so that never happened until the
master lock down happened. I also don't care who broke the build, but I know
who can fix it, at least in the components I maintain. I need a process
upstream which will make it easier for me to get help from others (most of
the time I can fix it myself). So if there are failures and after a point my
commit access is revoked, then it becomes my top priority. It is a way to
make that a priority for everyone working on the component, because no one
can progress until the component becomes stable again. It is just a
miniature version of what happens during the whole master lock down.

> We cannot stop patches from landing because it touches a specific
> component. And even if we could, our components are not entirely
> independent of each other. There could still be failures. This is a common
> scenario and it happened the last time we had to close master. Let me
> further re-emphasize our goals:
>
> * When master is broken, every team member's energy needs to be focused on
> getting master to green. Who broke the build isn't a concern as much as
> *the build is broken*. This is not a situation to punish specific people.
>

I hope you now understand that my intention is not to punish specific
people. My intention is for us to get more efficient as a community at
having the problem addressed by the people who can address it, instead of
waiting until it becomes so bad that everyone has to stop their work and
wait for those people to solve it. To give you an example: if there is a
problem with AFR, the number of people with domain expertise that I know of
who can solve it is just 3. But it is difficult to get those 3 (including
me) to look at it with priority, and not because we don't want to. There
are people at the place I work who keep asking what happened with a
specific issue, and so this gets put on the back burner. I want a process
upstream through which I can show why I need to work on spurious failures
with more priority than the issue they asked me to work on.

At this point one question could be: why not treat the master lock down as
that event/process instead of a component maintainer lock down? It took
almost a year for the spurious failures to be looked into with priority in
the components I maintain. I personally don't like it taking that long, and
I don't want others who have been doing a good job of maintaining their own
components to stop progressing on their own work because of me. So
component lock down seemed better, which is why I have been suggesting it
all along. Other solutions that can solve the problems I raised are
welcome :-).

> * If we allow other commits to land, we run the risk of someone else
> breaking master with a different patch. Now we have two failures to debug
> and fix.
>

In the example above, we will in all cases have 2 failures to debug and
fix, just that it will be one after the other instead of in parallel if we
lock down master.

-- 
Pranith