[Gluster-devel] [Heketi] How pushing to heketi happens - especially about squashing

Tue Sep 20 12:33:54 UTC 2016

Hi Michael,
  We have a new mailing list, it is gluster-devel with [Heketi] in the
subject.  I probably will add this to the communications wiki page.

On the concept of github. It is always interesting to compare what we
know and how we are used to something to something new. In Github,
we do not need to let Github squash at all.  I was doing that as a 'pilot'.
The real method is for patches to be added to a PR, and if too many
patches are added, for the author to squash them, and send a new one.
This is documented in the Development Guide in the Wiki.

The author should also note that their first patch/commit sent as
a PR is the information used as the PR.  Lots of PRs are being sent
with almost no information, and I have let this happen because most
people are still ramping up.

There is no reason why commit messages cannot be as detailed as
those from Gerrit.  Here is an example: https://github.com/heketi/heketi/pull/393 .

The process to update changes is to update the forked
branch, and not to amend the same change.  Amending makes it impossible
to determine the changes from patch to patch, and makes it extremely hard
on reviewers (me).

Here are my thoughts on your questions below:

1) The the review should not squash the authors commits unless
   the author explicitly requests or approves that.
[lpabon] Absolutely.  The pilot, although it worked well technically,
it confuses those who come from other source control systems.

2) We should avoid using github to merge because this creates
   bad commit messages.
[lpabon] I'm not sure what you mean by this, but I would not
"avoid" github in any way.  That is like saying "avoid Gerrit".

3) (As a consequence of the above,) If we push delta-patches
   to update PRs, that can usually not be the final push, but
   needs a final iteration of force-pushing an amended patchset.
[lpabon] Do not amend patches.

NOTE on amended patches.  If I notice another one, I will *not* merge
the change.  Sorry to be a pain about that, but it makes it almost
impossible to review.  This is not Gerrit, this is Github, it
is something new, but in my opinion, it is more natural git workflow.

- Luis

----- Original Message -----
From: "Michael Adam" <madam at redhat.com>
To:"Luis Pabón" <lpabon at redhat.com>
Sent: Tuesday, September 20, 2016 4:50:01 AM
Subject: [RFC] [upstream] How pushing to heketi happens - especially about squashing

Hi all, hi Luis,

Since we do not have a real upstream ML yet (see my other mail), I want
use this list now for discussing about the way patches are merged into
heketi upstream.

[ tl;dr ? --> look for "summing up" at the bottom... ;-) ]

This is after a few weeks of working on the projects with you all
especially with Luis, and seeing how he does the project. And there
have been a few surprises on both ends.

While I still don't fully like or trust the github UI, it is
for instance better than gerrit (But as José sais: "That bar
is really low..." ;-). One point where it is better is it can
deal with patchsets, i.e. multiple patches submitted as one PR.

But github has the feature of instead of squashing the patches
instead of merging the patches as they are. This can be useful
or remotely correct in one situation, but I think generally it
should be avoided for reasons detailed below.

So in this mail, I am sharing a few observations from the past
few weeks, and a few concerns or problems I am having. I think
it is important with the growing team to clearly formulate
how both reviewers and patch submitters expect the process to work.

At least when I propose a patchset, I propose it exactly the way
I send it. Coming from Samba and Gluster development, for me as a
contributor and as a reviewer, the content of the commits, i.e.
the actual diffs as well as the layout into patches and the commit
messages are 'sacred' in the sense that this is what the patch
submitter proposed and signed-off on for pushing. Hence the reviewer
should imho not change the general layout of patches (e.g. by squashing
them) without consulting the author. Here are two examples where
pull request with two patches were squashed with the heketi method:

https://github.com/heketi/heketi/commit/bbc513ef214c5ec81b6cdb0a3a024944c9fe12ba
https://github.com/heketi/heketi/commit/bccab2ee8f70f6862d9bfee3a8cbdf6e47b5a8bf

You see what github does: it prints the title of the PR as main commit
message and creates a bullet list of the original commit messages.
Hence, it really creates pretty bad commits (A commit called
"Two minor patches (#499)" - really??)... :-)
This is not how these were intended by the authors. The actual result of
how the commits looks like in git after they have been merged.
(Btw, I don't look at the git log / code in github: it is difficult to see
the relevant things there. I look at it in a local git checkout in the shell.
This is the "everlasting", "sacred" content.)

So this tells me that:

1) Patches should not be squashed without consulting the author,
   especially if they are not flagged with "SQ" or "SQUASH" in the
   commit message or so. If they are not flagged like this, the
   author did not want them to be squashed. She did not sign-off
   on any changes patch arrangement/commit message.

2) Squashing should not be done with the github ui, but rather manually
   in a local checkout, so that one has full control over the resulting
   commit message. Also in a patchset it may be good to squash some
   patches but not all.

3) Squashing should ideally been done by the patch author who would then
   update the PR with his updated patchet (force-pushing) to present
   a merge-able iteration of the patchset, not a delta. The reviewer
   could do it on behalf of the author, but only if agreed with the author
   and thn (see #2) in a local checkout and not with the github ui, imho.

The last part mentions a point where squashing may be legitimate, and this
has been a point of confusion of how we have been using the github PR system:

When something is to be changed with the patche in a PR, there two general
approaches:

a) Create a new patch (or new patches) on top of the original branch and
   push that over your github branch. It will update the PR with the push
   and the pr will show these delta patches.

b) Amend the patchset locally (with git commit --amend and/ or git rebase -i)
   and re-push over the github branch with git push --force. This will also
   update the PR but it will present the complete new patchest instead of
   just deltas.

It seems that Luis has kind of been expecting the #a workflow but us samba/gluster
people have been using #b. I understand that it can sometimes be convenient
to see the delta between two iterations, but it requires the reviewer to
squash the patches, end hence this should imho never be the final step in
the review process. I can imagine a PR going through a few iterations of #a
and when everyone is content, the author could then work on squashing the
patches and pushing the final result of the round of reviews as in #b for
final review. (Note: if needed patchset deltas can also be looked at in a
local checkout using the reflog to see older branch heads.)

Examples of this confusion and glimpses of this discussion in github PRs
are these:

* https://github.com/heketi/heketi/pull/454

  Here the original patch had no commit msg, but a later one was
  force-pushed and had a commit msg.

* https://github.com/heketi/heketi/pull/472

  There was the first real discussion about this topic, see:

  https://github.com/heketi/heketi/pull/472#issuecomment-247140585
  https://github.com/heketi/heketi/pull/472#issuecomment-247141100
  https://github.com/heketi/heketi/pull/472#issuecomment-247143674

etc. 

So summing up, for all those reasons above I request:

1) The the review should not squash the authors commits unless
   the author explicitly requests or approves that.

2) We should avoid using github to merge because this creates
   bad commit messages.

3) (As a consequence of the above,) If we push delta-patches
   to update PRs, that can usually not be the final push, but
   needs a final iteration of force-pushing an amended patchset.

This may create a bit more work for the review process in cases,
but with a growing team and  group of contributors, I think it
is important to more concretely describe our process to avoid
confusion.

Thoughts?

Michael