[Gluster-devel] [Heketi] How pushing to heketi happens - especially about squashing

Michael Adam madam at redhat.com
Tue Sep 20 14:56:27 UTC 2016

----- Original Message -----
> Hi Michael,
>   We have a new mailing list, it is gluster-devel with [Heketi] in the
> subject.

Hi Luis,

I know, which is what triggered that other mail. :-) It is not really a new
mailing-list but a [Heketi] tag to be used in an old mailing list,
and it was also said to use it for 'gluster related' discussions
in heketi and not for general heketi dev discussions, which kind of soft
and unclear to me.
I was suggesting a mailing list for *all* heketi development related
discussions. This stuff here is not necessary interesting for the broader
gluster development community. But since you pulled the mail over from the
internal list, to gluster-devel, I'm going to reply here. :-)

> On the concept of github. It is always interesting to compare what we
> know and how we are used to something to something new. In Github,
> we do not need to let Github squash at all.  I was doing that as a 'pilot'.

Yeah, my request was to drop that because it creates bad commit messages
and may not be what the author intended. It fiddles with the author's patches,
which a UI should never do imho. That's just scary.

> The real method is for patches to be added to a PR, and if too many
> patches are added, for the author to squash them, and send a new one.
> This is documented in the Development Guide in the Wiki.

Yeah, I am questioning that, because I think it is flawed:

1) "too many patches ==> squash" is imho the wrong decision point.
   A PR can have as many commits as the author wants, as long as
   each commit is logical and complete. I even generally encourage
   more atomic patches. So imho that rule 'many patches ==> squash'
   creates the wrong incentive for squashing.

2) After the squashing, the author should imho not create a new
   PR but update the existing one. It has all the context and history
   for the patchset.

> The author should also note that their first patch/commit sent as
> a PR is the information used as the PR.  Lots of PRs are being sent
> with almost no information, and I have let this happen because most
> people are still ramping up.

Note that the commit message is NOT the PR title. That only happens
if you use github to squash... So if we don't use github to squash,
the PR title and initial description are less important for the final result!

> There is no reason why commit messages cannot be as detailed as
> those from Gerrit. 

No idea what gerrit has to do with commit messages.
Commit messages always come from the author, unless you
let some sofware fiddle with it. ;-)

> Here is an example:
> https://github.com/heketi/heketi/pull/393 .

I am a big fan of long commit messages, myself guilty of commits
with a message much longer than the actual patch.

And the PR title is by default only the message of the first
commit in the series, I think. So one should add content describing
the proposed patchset when creating the PR. Full agreement here.

But I think we should not over-estimate the title/description of
the PR. E.g. here are two examples of PRs that had a separate
title/description and the actual more detailed info was in the
commit messages of the patches:


Those were mangled together by a github-squash-push.

> The process to update changes is to update the forked
> branch, and not to amend the same change.  Amending makes it impossible
> to determine the changes from patch to patch, and makes it extremely hard
> on reviewers (me).

Hmm. My request is exactly to do amends. It is just standard and good (imho)
and even necessary git workflow. See below for more details on that under
comments to point #3.

> Here are my thoughts on your questions below:
> 1) The the review should not squash the authors commits unless
>    the author explicitly requests or approves that.
> [lpabon] Absolutely.  The pilot, although it worked well technically,
> it confuses those who come from other source control systems.

Sorry it was getting kinda jet-laggy late so my words were not the best...
I wanted to say:

The reviewer should not squash (or otherwise change) the author's
commits unless explicitly requested or approved, irrespective of
the tool used for doing the changes.

> 2) We should avoid using github to merge because this creates
>    bad commit messages.
> [lpabon] I'm not sure what you mean by this, but I would not
> "avoid" github in any way.  That is like saying "avoid Gerrit".

I wanted to say: "In particular, and especially, we should not
use Gerrit for doing squashes. It creates bad commit messages."
And yeah, I do explicitly mean "Avoid that aspect of github."
Like "stay away from any feature in github that change the
original commits". (Not sure if there is more lurking. ;-)

> 3) (As a consequence of the above,) If we push delta-patches
>    to update PRs, that can usually not be the final push, but
>    needs a final iteration of force-pushing an amended patchset.
> [lpabon] Do not amend patches.
> NOTE on amended patches.  If I notice another one, I will *not* merge
> the change.  Sorry to be a pain about that, but it makes it almost
> impossible to review.  This is not Gerrit, this is Github, it
> is something new, but in my opinion, it is more natural git workflow.

I disagree. And I stand by my point that while we can do those
delta-patches for intermediate WIP-updates of a patchset, the ultimate version
of the patchset *needs* to have these delta-patches squashed into
the patches and be a patchset that can be merged as-is. Those delta
patches should imho *never* be pushed to the upstream repo. This
would be bad habit. They need to be squashed/amended into the patchset,
because their existence means that the original patch was not
complete / correct. This is a necessary step in a good git workflow.
The end result is what counts most.

So are you saying that you will reject amended patchsets when
pushed to the *same* PR, but accept them when the original PR
is closed and the amended patchset is pushed to a *new* PR?

I don't understand the reasoning, because that sacrifices the
context and history of the original PR, which might be helpful,
and it seems to create extra amount of book-keeping.
But I might be able to live with that as a compromise.

I understand that sometimes it might be helpful to see those delta
patches to verify that change requests have been addressed, but
I at least am more confused by them than by updated (amended) patchsets.
I usually need to see the full patches more than I need the delta.
So when someone does that delta patch update, once I realize what's going
on, I pull the branch and view the patch in its full beauty of a combined
diff locally. ;-)

Hope that made it a little more clear what I was suggesting.


> ----- Original Message -----
> From: "Michael Adam" <madam at redhat.com>
> To:"Luis Pabón" <lpabon at redhat.com>
> Sent: Tuesday, September 20, 2016 4:50:01 AM
> Subject: [RFC] [upstream] How pushing to heketi happens - especially about
> squashing
> Hi all, hi Luis,
> Since we do not have a real upstream ML yet (see my other mail), I want
> use this list now for discussing about the way patches are merged into
> heketi upstream.
> [ tl;dr ? --> look for "summing up" at the bottom... ;-) ]
> This is after a few weeks of working on the projects with you all
> especially with Luis, and seeing how he does the project. And there
> have been a few surprises on both ends.
> While I still don't fully like or trust the github UI, it is
> for instance better than gerrit (But as José sais: "That bar
> is really low..." ;-). One point where it is better is it can
> deal with patchsets, i.e. multiple patches submitted as one PR.
> But github has the feature of instead of squashing the patches
> instead of merging the patches as they are. This can be useful
> or remotely correct in one situation, but I think generally it
> should be avoided for reasons detailed below.
> So in this mail, I am sharing a few observations from the past
> few weeks, and a few concerns or problems I am having. I think
> it is important with the growing team to clearly formulate
> how both reviewers and patch submitters expect the process to work.
> At least when I propose a patchset, I propose it exactly the way
> I send it. Coming from Samba and Gluster development, for me as a
> contributor and as a reviewer, the content of the commits, i.e.
> the actual diffs as well as the layout into patches and the commit
> messages are 'sacred' in the sense that this is what the patch
> submitter proposed and signed-off on for pushing. Hence the reviewer
> should imho not change the general layout of patches (e.g. by squashing
> them) without consulting the author. Here are two examples where
> pull request with two patches were squashed with the heketi method:
> https://github.com/heketi/heketi/commit/bbc513ef214c5ec81b6cdb0a3a024944c9fe12ba
> https://github.com/heketi/heketi/commit/bccab2ee8f70f6862d9bfee3a8cbdf6e47b5a8bf
> You see what github does: it prints the title of the PR as main commit
> message and creates a bullet list of the original commit messages.
> Hence, it really creates pretty bad commits (A commit called
> "Two minor patches (#499)" - really??)... :-)
> This is not how these were intended by the authors. The actual result of
> how the commits looks like in git after they have been merged.
> (Btw, I don't look at the git log / code in github: it is difficult to see
> the relevant things there. I look at it in a local git checkout in the shell.
> This is the "everlasting", "sacred" content.)
> So this tells me that:
> 1) Patches should not be squashed without consulting the author,
>    especially if they are not flagged with "SQ" or "SQUASH" in the
>    commit message or so. If they are not flagged like this, the
>    author did not want them to be squashed. She did not sign-off
>    on any changes patch arrangement/commit message.
> 2) Squashing should not be done with the github ui, but rather manually
>    in a local checkout, so that one has full control over the resulting
>    commit message. Also in a patchset it may be good to squash some
>    patches but not all.
> 3) Squashing should ideally been done by the patch author who would then
>    update the PR with his updated patchet (force-pushing) to present
>    a merge-able iteration of the patchset, not a delta. The reviewer
>    could do it on behalf of the author, but only if agreed with the author
>    and thn (see #2) in a local checkout and not with the github ui, imho.
> The last part mentions a point where squashing may be legitimate, and this
> has been a point of confusion of how we have been using the github PR system:
> When something is to be changed with the patche in a PR, there two general
> approaches:
> a) Create a new patch (or new patches) on top of the original branch and
>    push that over your github branch. It will update the PR with the push
>    and the pr will show these delta patches.
> b) Amend the patchset locally (with git commit --amend and/ or git rebase -i)
>    and re-push over the github branch with git push --force. This will also
>    update the PR but it will present the complete new patchest instead of
>    just deltas.
> It seems that Luis has kind of been expecting the #a workflow but us
> samba/gluster
> people have been using #b. I understand that it can sometimes be convenient
> to see the delta between two iterations, but it requires the reviewer to
> squash the patches, end hence this should imho never be the final step in
> the review process. I can imagine a PR going through a few iterations of #a
> and when everyone is content, the author could then work on squashing the
> patches and pushing the final result of the round of reviews as in #b for
> final review. (Note: if needed patchset deltas can also be looked at in a
> local checkout using the reflog to see older branch heads.)
> Examples of this confusion and glimpses of this discussion in github PRs
> are these:
> * https://github.com/heketi/heketi/pull/454
>   Here the original patch had no commit msg, but a later one was
>   force-pushed and had a commit msg.
> * https://github.com/heketi/heketi/pull/472
>   There was the first real discussion about this topic, see:
>   https://github.com/heketi/heketi/pull/472#issuecomment-247140585
>   https://github.com/heketi/heketi/pull/472#issuecomment-247141100
>   https://github.com/heketi/heketi/pull/472#issuecomment-247143674
> etc.
> So summing up, for all those reasons above I request:
> 1) The the review should not squash the authors commits unless
>    the author explicitly requests or approves that.
> 2) We should avoid using github to merge because this creates
>    bad commit messages.
> 3) (As a consequence of the above,) If we push delta-patches
>    to update PRs, that can usually not be the final push, but
>    needs a final iteration of force-pushing an amended patchset.
> This may create a bit more work for the review process in cases,
> but with a growing team and  group of contributors, I think it
> is important to more concretely describe our process to avoid
> confusion.
> Thoughts?
> Michael

More information about the Gluster-devel mailing list