[Gluster-users] problem with Peer Rejected

Jiří Sléžka jiri.slezka at slu.cz
Fri Feb 4 15:39:08 UTC 2022


Well, I tried downgrading node gluster07 to 8.6. It didn't help.

Fortunately I remembered an old post from Strahil on the ovirt list 
which suggests setting

gluster volume set <VOLUME NAME> cluster.lookup-optimize off

when expanding a cluster. As the nodes were rejected due to a cksum 
mismatch on only one volume, I toggled lookup-optimize on that volume, 
then restarted glusterd on both nodes, and that fixed it.
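
For the record, the sequence was roughly the following (a sketch, not a 
transcript: the volume name "samba" and a systemd-managed glusterd are 
assumptions on my part):

# on one of the healthy nodes, re-apply the option Strahil suggested;
# any "volume set" rewrites the volume's info file and checksum
gluster volume set samba cluster.lookup-optimize off

# then restart glusterd on both affected nodes
systemctl restart glusterd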

Problem seems to be solved...

Cheers,

Jiri


On 2/4/22 15:45, Jiří Sléžka wrote:
> Hello,
> 
> I have a GlusterFS cluster running version 8.6 (6 nodes plus 1 arbiter 
> node) in a distributed-replicated setup with arbiter (Number of Bricks: 
> 3 x (2 + 1) = 9).
> 
> Yesterday I added two new nodes. Because I plan to upgrade to gluster 9, 
> I installed them with Rocky Linux 8 and glusterfs 9 (from the CentOS 
> Stream repo). Then I added these two nodes and ended up with this setup:
> 
> Volume Name: samba
> Type: Distributed-Replicate
> Volume ID: a96ea622-7abb-4213-a39b-8a23a3035a5d
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 4 x (2 + 1) = 12
> Transport-type: tcp
> Bricks:
> Brick1: 10.10.102.91:/gluster/samba
> Brick2: 10.10.100.92:/gluster/samba
> Brick3: 10.10.100.90:/gluster/samba/brick1 (arbiter)
> Brick4: 10.10.100.93:/gluster/samba
> Brick5: 10.10.100.94:/gluster/samba
> Brick6: 10.10.100.90:/gluster/samba/brick2 (arbiter)
> Brick7: 10.10.100.95:/gluster/samba
> Brick8: 10.10.100.96:/gluster/samba
> Brick9: 10.10.100.90:/gluster/samba/brick3 (arbiter)
> Brick10: 10.10.100.97:/gluster/samba
> Brick11: 10.10.100.98:/gluster/samba
> Brick12: 10.10.100.90:/gluster/samba/brick4 (arbiter)
> Options Reconfigured:
> auth.allow: xxxxxxxxxxxxxxxxxxxxxxx
> cluster.self-heal-daemon: on
> cluster.entry-self-heal: on
> cluster.metadata-self-heal: on
> cluster.data-self-heal: on
> performance.client-io-threads: off
> nfs.disable: on
> transport.address-family: inet
> performance.readdir-ahead: on
> features.shard: on
> features.shard-block-size: 512MB
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> cluster.lookup-optimize: off
> 
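> For context, the expansion itself was the usual probe + add-brick 
> sequence, roughly like this (a reconstruction from the brick list 
> above, not a transcript; the exact invocation may have differed):
> 
> gluster peer probe 10.10.100.97
> gluster peer probe 10.10.100.98
> gluster volume add-brick samba replica 3 arbiter 1 \
>     10.10.100.97:/gluster/samba \
>     10.10.100.98:/gluster/samba \
>     10.10.100.90:/gluster/samba/brick4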
> 
> op-version is still 80000
> 
> It worked well and I rebalanced one of the volumes, but today I noticed 
> that the two new nodes are in the Peer Rejected state (as seen from 
> gluster02).
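> 
> For completeness, the rebalance had been started with the standard 
> command (the volume name here is an assumption):
> 
> gluster volume rebalance samba start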
> 
> gluster peer status
> Number of Peers: 8
> 
> Hostname: 10.10.100.91
> Uuid: 6d9e6170-2386-4b40-8fb5-7aeaef3d3122
> State: Peer in Cluster (Connected)
> 
> Hostname: 10.224.102.93
> Uuid: 4f74741e-7fee-41d0-a8db-916458f7280e
> State: Peer in Cluster (Connected)
> 
> Hostname: 10.10.100.94
> Uuid: cda31067-5bd9-44ea-816d-7c9dd947d78a
> State: Peer in Cluster (Connected)
> 
> Hostname: 10.10.100.95
> Uuid: 3c904f48-1ff3-4669-891b-27d4296ccf0e
> State: Peer in Cluster (Connected)
> 
> Hostname: 10.10.100.96
> Uuid: 0105494d-d5b4-40fb-ad31-c531efd818bb
> State: Peer in Cluster (Connected)
> 
> Hostname: 10.10.100.90
> Uuid: 291b7afd-3090-4733-a97f-20f8585adad2
> State: Peer in Cluster (Connected)
> 
> Hostname: 10.10.100.97
> Uuid: 82ac9abf-1678-43c9-a92f-94d0d472b2fe
> State: Peer Rejected (Disconnected)
> 
> Hostname: 10.10.100.98
> Uuid: 0f9e4891-250a-45b5-bdd3-e6a61aa49a29
> State: Peer Rejected (Connected)
> 
> From the new node (gluster08), all nodes show as Peer Rejected.
> 
> There are log lines in /var/log/glusterfs/glusterd.log like this:
> 
> [2022-02-04 14:36:49.805753 +0000] E [MSGID: 106010] 
> [glusterd-utils.c:3851:glusterd_compare_friend_volume] 0-management: 
> Version of Cksums samba differ. local cksum = 3146523269, remote cksum = 
> 2206743689 on peer 10.10.100.97
> 
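> A quick way to find which peer's copy of the volume definition 
> differs is to compare the files glusterd keeps under /var/lib/glusterd 
> (these are the standard glusterd store paths; the hostname is just an 
> example):
> 
> cat /var/lib/glusterd/vols/samba/cksum
> ssh 10.10.100.97 cat /var/lib/glusterd/vols/samba/cksum
> diff <(cat /var/lib/glusterd/vols/samba/info) \
>      <(ssh 10.10.100.97 cat /var/lib/glusterd/vols/samba/info)
> 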
> There is documentation for this particular problem...
> 
> https://docs.gluster.org/en/latest/Troubleshooting/troubleshooting-glusterd/#common-issues-and-how-to-resolve-them 
> 
> 
> ...but
> 
> gluster volume get all cluster.max-op-version
> 
> is still 80000
> 
> and I cannot set it lower than or equal to that:
> 
> gluster volume set all cluster.op-version 80000
> volume set: failed: Required op-version (80000) should not be equal or 
> lower than current cluster op-version (80000).
> 
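> For what it's worth, each peer's current operating version can also be 
> read directly from its local glusterd store (a standard glusterd path):
> 
> grep operating-version /var/lib/glusterd/glusterd.info
> 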
> Unfortunately, the cluster now seems broken from the clients' side. Any 
> hints on how I can recover?
> 
> Thanks in advance,
> 
> Jiri
