[Gluster-users] Fixing a rejected peer
jlawrence at squaretrade.com
Tue Mar 6 18:50:13 UTC 2018
> On Mar 5, 2018, at 6:41 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
> On Tue, Mar 6, 2018 at 6:00 AM, Jamie Lawrence <jlawrence at squaretrade.com> wrote:
> So I'm seeing a rejected peer with 3.12.6. This is with a replica 3 volume.
> It actually began as the same problem with a different peer. I noticed it with (call it) gluster-2, when I couldn't make a new volume. I compared /var/lib/glusterd between them, and found that somehow the options in one of the vols differed. (I suspect this was due to attempting to create the volume via the oVirt GUI; suffice it to say I'm not using it for things like this in the future.) So I stopped the daemons and corrected that (gluster-2 had a tiering entry the others didn't).
> When you say the others didn't, how many peers are you talking about? Are they all running 3.12.6? We had a bug, https://bugzilla.redhat.com/show_bug.cgi?id=1544637, which could lead to such situations, but that has been fixed in 3.12.6. So if all of the nodes are running the same version, i.e. 3.12.6, and cluster.op-version is set to the latest, then ideally you shouldn't see this problem. Could you clarify?
They all run 3.12.6; there are currently 3 peers total.
So, cluster.op-version is 30800. I was previously unaware of the distinction, but looking at the `info` file, I see that the client op-version for volumes is 30712. Does that matter?
That bug does look like what happened, though.
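In case it helps anyone compare these values across peers, here is a minimal Python sketch for pulling the op-version fields out of a volume's `info` file. It assumes the file is plain `key=value` lines and that the keys are named `op-version` and `client-op-version`; both are assumptions about the file layout, so check them against your own files first. The helper names `read_info` and `op_versions` are made up for illustration.

```python
from pathlib import Path


def read_info(path):
    """Parse a key=value file into a dict of strings (assumed layout of `info`)."""
    entries = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        # Skip blank lines and anything that is not key=value.
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        entries[key] = value
    return entries


def op_versions(info):
    """Return (op-version, client-op-version) as ints; None where a key is absent.

    The key names are assumptions, not confirmed in this thread.
    """
    def as_int(key):
        value = info.get(key)
        return int(value) if value is not None and value.isdigit() else None

    return as_int("op-version"), as_int("client-op-version")
```

Running `op_versions(read_info(...))` against each peer's copy of `/var/lib/glusterd/vols/<volname>/info` should show quickly whether the peers disagree on either value.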
> Started things back up and now gluster-3 is being rejected by the other two. The error is below.
> I'm tempted to repeat the process: stop things, copy the checksum the "good" ones agree on, and start things again. But given that this has turned into a balloon-squeezing exercise, I want to make sure I'm not doing this the wrong way.
> Yes, that's the way. Copy /var/lib/glusterd/vols/<volname>/ from the good node to the rejected one and restart glusterd service on the rejected peer.
So I did this, and the peer immediately went back to the rejected state. The `cksum` file diverged again right away.
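To see exactly which files differ after the copy, something like the following Python sketch may be useful. It assumes both peers' copies of the volume directory have been pulled onto one machine first, and it compares SHA-256 digests of file contents. That is not the algorithm glusterd uses for its own `cksum` file; it only shows which files are not byte-identical. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_trees(good_root, rejected_root):
    """Return (only_in_good, only_in_rejected, changed) as sorted lists of paths."""
    good = tree_digests(good_root)
    rejected = tree_digests(rejected_root)
    only_good = sorted(set(good) - set(rejected))
    only_rejected = sorted(set(rejected) - set(good))
    # Files present on both sides whose contents differ.
    changed = sorted(p for p in set(good) & set(rejected) if good[p] != rejected[p])
    return only_good, only_rejected, changed
```

For example, `diff_trees("/tmp/good-vols", "/tmp/rejected-vols")` (both paths made up here) would list any file, such as `info` or `cksum`, whose contents differ between the two copies.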