[Gluster-users] Unable to upgrade nodes because of cksums mismatch

Mon Dec 27 13:03:38 UTC 2021

Hi Michael

I think you are hitting an issue similar to this one:
https://github.com/gluster/glusterfs/issues/3066
If so, the fix is under review and could be available in the next
release.
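
To see which volumes are affected before and after a fix, the per-volume
checksums from two nodes can be compared. Below is a minimal Python sketch,
not part of the fix itself. It assumes that glusterd keeps each volume's
checksum in /var/lib/glusterd/vols/<volname>/cksum as a line of the form
info=<number>; that path and format are an assumption about a typical
install and are not stated in this thread. The function names are
hypothetical.

```python
from pathlib import Path


def read_cksums(vols_dir):
    """Return {volume_name: checksum_string} for every volume under vols_dir.

    Assumes each volume directory holds a 'cksum' file containing a line
    like 'info=1234567890' (an assumption, see the text above).
    """
    result = {}
    for cksum_file in sorted(Path(vols_dir).glob("*/cksum")):
        for line in cksum_file.read_text().splitlines():
            key, sep, value = line.partition("=")
            if sep and key.strip() == "info":
                result[cksum_file.parent.name] = value.strip()
    return result


def mismatched_volumes(local, remote):
    """Return sorted volume names whose checksum differs or exists on one side only."""
    names = set(local) | set(remote)
    return sorted(n for n in names if local.get(n) != remote.get(n))
```

One possible use: copy only the cksum files from a v8 peer into a scratch
directory with the same <volname>/cksum layout, then call
mismatched_volumes(read_cksums("/var/lib/glusterd/vols"),
read_cksums("/tmp/peer-vols")), where /tmp/peer-vols is a made-up location.
The sketch only reads files and changes nothing on the node.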

--
Thanks and Regards,
*NiKHIL LADHA*

On Mon, Dec 27, 2021 at 6:25 PM Michael Böhm <dudleyperkins at gmail.com>
wrote:

> Hey guys,
>
> I have a problem upgrading our nodes from 8.3 to 10.0 - I just upgraded
> the first node and ran into the "cksums mismatch" problem. On the upgraded
> v10 node the checksums for all volumes differ from those on the other v8
> nodes. That leads to the node starting in a peer rejected state. I can only
> resolve this by following the steps suggested here:
>
> https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Administrator%20Guide/Resolving%20Peer%20Rejected/
> (stop glusterd, delete /var/lib/glusterd/* (except glusterd.info),
> start glusterd, probe a v8 peer, restart glusterd again)
>
> The cluster seems healthy again, self-healing has started and everything
> looks fine - only the newly created cksums are still different from those
> on the other nodes. That means this healthy state only lasts until I reboot
> the node, at which point it all starts over: the node comes up as peer
> rejected.
>
> Now I've read about the problem here:
> https://github.com/gluster/glusterfs/issues/1332 (even though that
> says the problem should only occur when upgrading from earlier than v7)
> and also here on the mailing list:
> https://lists.gluster.org/pipermail/gluster-users/2021-November/039679.html
> (I think I have the same problem, but unfortunately no solution is given
> there)
>
> The solutions seem to require upgrading all nodes, and the problem should
> be resolved when op.version is finally upgraded - but I don't think this
> approach can be done online, and there's not really a way for me to do it
> offline.
>
> Why is this happening now and not when I upgraded from pre7 to 7? All my
> nodes are 8.3 and op.version is 8000.
>
> One thing I might have done "wrong" - when I upgraded to v8 I didn't set
> "gluster volume set <volname> fips-mode-rchecksum on" on the volumes; I
> think I just overlooked it in the docs. I have this option set only on the
> 2 volumes I created after upgrading to v8. But even on those 2 the cksums
> differ, so I guess it wouldn't help a lot if I set the option on all the
> other volumes?
>
> I really don't know what to do now. I kinda understand the problem but
> don't know why this is happening on an all-v8 cluster. I can't take all
> 9 nodes down, upgrade them all to v10 and rely on "it's all good" with the
> final upgrade of op.version.
>
> Can someone point me in a safe direction?
>
> Regards
>
> Mika