[Gluster-users] Gluster 3.7.6 add new node state Peer Rejected (Connected)

Mohammed Rafi K C rkavunga at redhat.com
Thu Feb 25 20:49:32 UTC 2016



On 02/26/2016 01:53 AM, Mohammed Rafi K C wrote:
>
>
> On 02/26/2016 01:32 AM, Steve Dainard wrote:
>> I haven't done anything more than peer thus far, so I'm a bit
>> confused as to how the volume info fits in. Can you expand on this a bit?
>>
>> Failed commits? Is this split brain on the replica volumes? I get no
>> output from 'gluster volume heal <volname> info' on any of the replica
>> volumes, but if I try 'gluster volume heal <volname> full' I get:
>> 'Launching heal operation to perform full self heal on volume
>> <volname> has been unsuccessful'.
>
> Forget about this; it is not for metadata self-heal.
>
>>
>> I have 5 volumes total.
>>
>> 'Replica 3' volumes running on gluster01/02/03:
>> vm-storage
>> iso-storage
>> export-domain-storage
>> env-modules
>>
>> And one distributed-only volume, 'storage', whose info is shown below:
>>
>> *From existing hosts gluster01/02:*
>> type=0
>> count=4
>> status=1
>> sub_count=0
>> stripe_count=1
>> replica_count=1
>> disperse_count=0
>> redundancy_count=0
>> version=25
>> transport-type=0
>> volume-id=26d355cb-c486-481f-ac16-e25390e73775
>> username=eb9e2063-6ba8-4d16-a54f-2c7cf7740c4c
>> password=
>> op-version=3
>> client-op-version=3
>> quota-version=1
>> parent_volname=N/A
>> restored_from_snap=00000000-0000-0000-0000-000000000000
>> snap-max-hard-limit=256
>> features.quota-deem-statfs=on
>> features.inode-quota=on
>> diagnostics.brick-log-level=WARNING
>> features.quota=on
>> performance.readdir-ahead=on
>> performance.cache-size=1GB
>> performance.stat-prefetch=on
>> brick-0=10.0.231.50:-mnt-raid6-storage-storage
>> brick-1=10.0.231.51:-mnt-raid6-storage-storage
>> brick-2=10.0.231.52:-mnt-raid6-storage-storage
>> brick-3=10.0.231.53:-mnt-raid6-storage-storage
>>
>> *From existing hosts gluster03/04:*
>> type=0
>> count=4
>> status=1
>> sub_count=0
>> stripe_count=1
>> replica_count=1
>> disperse_count=0
>> redundancy_count=0
>> version=25
>> transport-type=0
>> volume-id=26d355cb-c486-481f-ac16-e25390e73775
>> username=eb9e2063-6ba8-4d16-a54f-2c7cf7740c4c
>> password=
>> op-version=3
>> client-op-version=3
>> quota-version=1
>> parent_volname=N/A
>> restored_from_snap=00000000-0000-0000-0000-000000000000
>> snap-max-hard-limit=256
>> features.quota-deem-statfs=on
>> features.inode-quota=on
>> performance.stat-prefetch=on
>> performance.cache-size=1GB
>> performance.readdir-ahead=on
>> features.quota=on
>> diagnostics.brick-log-level=WARNING
>> brick-0=10.0.231.50:-mnt-raid6-storage-storage
>> brick-1=10.0.231.51:-mnt-raid6-storage-storage
>> brick-2=10.0.231.52:-mnt-raid6-storage-storage
>> brick-3=10.0.231.53:-mnt-raid6-storage-storage
>>
>> So far the configs on gluster01/02 and gluster03/04 are the same,
>> although some of the feature options appear in a different order.
>>
>> On gluster05/06 the ordering is different again, and quota-version=0
>> instead of 1.
>
> This is why the peer shows as rejected. Can you check the op-version of
> all the glusterd instances, including the one in the rejected state? You
> can find the op-version in /var/lib/glusterd/glusterd.info.

If all the op-versions are the same and everyone is on 3.7.6, then as a
work-around you can manually set quota-version=1 on the affected nodes;
restarting glusterd will then solve the problem. But I would strongly
recommend that you figure out the root cause (RCA); maybe you can file a
bug for this.
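
For completeness, here is a rough sketch of that check and work-around as
shell commands. Treat it as an outline only: it assumes the mismatched
volume is 'storage', that the volume files live in the usual
/var/lib/glusterd/vols/<volname>/ directory, and that glusterd is managed
by systemctl (use 'service glusterd ...' on older init systems).

    # On every node, this line should be identical across the cluster:
    grep operating-version /var/lib/glusterd/glusterd.info

    # On each rejected node only (gluster05/06): fix quota-version by hand,
    # then restart glusterd so it re-reads the edited volume info (the idea
    # being that the volume checksums then match the rest of the cluster).
    systemctl stop glusterd
    sed -i 's/^quota-version=0$/quota-version=1/' /var/lib/glusterd/vols/storage/info
    systemctl start glusterd
    gluster peer status    # should now report 'Peer in Cluster (Connected)'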

Rafi

>
> Rafi KC
>
>>
>> *From new hosts gluster05/gluster06:*
>> type=0
>> count=4
>> status=1
>> sub_count=0
>> stripe_count=1
>> replica_count=1
>> disperse_count=0
>> redundancy_count=0
>> version=25
>> transport-type=0
>> volume-id=26d355cb-c486-481f-ac16-e25390e73775
>> username=eb9e2063-6ba8-4d16-a54f-2c7cf7740c4c
>> password=
>> op-version=3
>> client-op-version=3
>> quota-version=0
>> parent_volname=N/A
>> restored_from_snap=00000000-0000-0000-0000-000000000000
>> snap-max-hard-limit=256
>> performance.stat-prefetch=on
>> performance.cache-size=1GB
>> performance.readdir-ahead=on
>> features.quota=on
>> diagnostics.brick-log-level=WARNING
>> features.inode-quota=on
>> features.quota-deem-statfs=on
>> brick-0=10.0.231.50:-mnt-raid6-storage-storage
>> brick-1=10.0.231.51:-mnt-raid6-storage-storage
>> brick-2=10.0.231.52:-mnt-raid6-storage-storage
>> brick-3=10.0.231.53:-mnt-raid6-storage-storage
>>
>> Also, I forgot to mention that when I initially peered the two new
>> hosts, glusterd crashed on gluster03 and had to be restarted (log
>> attached), but it has been fine since.
>>
>> Thanks,
>> Steve
>>
>> On Thu, Feb 25, 2016 at 11:27 AM, Mohammed Rafi K C
>> <rkavunga at redhat.com> wrote:
>>
>>
>>
>>     On 02/25/2016 11:45 PM, Steve Dainard wrote:
>>>     Hello,
>>>
>>>     I upgraded from 3.6.6 to 3.7.6 a couple weeks ago. I just peered
>>>     2 new nodes to a 4 node cluster and gluster peer status is:
>>>
>>>     # gluster peer status *<-- from node gluster01*
>>>     Number of Peers: 5
>>>
>>>     Hostname: 10.0.231.51
>>>     Uuid: b01de59a-4428-486b-af49-cb486ab44a07
>>>     State: Peer in Cluster (Connected)
>>>
>>>     Hostname: 10.0.231.52
>>>     Uuid: 75143760-52a3-4583-82bb-a9920b283dac
>>>     State: Peer in Cluster (Connected)
>>>
>>>     Hostname: 10.0.231.53
>>>     Uuid: 2c0b8bb6-825a-4ddd-9958-d8b46e9a2411
>>>     State: Peer in Cluster (Connected)
>>>
>>>     Hostname: 10.0.231.54 *<-- new node gluster05*
>>>     Uuid: 408d88d6-0448-41e8-94a3-bf9f98255d9c
>>>     *State: Peer Rejected (Connected)*
>>>
>>>     Hostname: 10.0.231.55 *<-- new node gluster06*
>>>     Uuid: 9c155c8e-2cd1-4cfc-83af-47129b582fd3
>>>     *State: Peer Rejected (Connected)*
>>
>>     It looks like your configuration files are mismatched, i.e. the
>>     checksum calculation on these two nodes differs from the others.
>>
>>     Did you have any failed commits?
>>
>>     Compare /var/lib/glusterd/vols/<volname>/info on the failed node
>>     against a good one; most likely you will see some difference.
>>
>>     Can you paste the /var/lib/glusterd/vols/<volname>/info?
>>
>>     Regards
>>     Rafi KC
>>
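
(One quick way to do the comparison suggested above; a sketch only,
assuming the info files are in the usual /var/lib/glusterd/vols/<volname>/
location and that ssh from a good node to a rejected one, e.g. 10.0.231.54,
is available. Sorting both files first hides the harmless ordering
differences and leaves only real mismatches such as quota-version.)

    # Run from a known-good node, e.g. gluster01:
    for vol in storage vm-storage iso-storage export-domain-storage env-modules; do
        echo "== $vol =="
        diff <(sort /var/lib/glusterd/vols/$vol/info) \
             <(ssh 10.0.231.54 "sort /var/lib/glusterd/vols/$vol/info")
    done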
>>
>>>     I followed the write-up here:
>>>     http://www.gluster.org/community/documentation/index.php/Resolving_Peer_Rejected
>>>     and the two new nodes peered properly, but after a reboot of the
>>>     two new nodes I'm seeing the same Peer Rejected (Connected) state.
>>>
>>>     I've attached logs from an existing node, and the two new nodes.
>>>
>>>     Thanks for any suggestions,
>>>     Steve
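
(For readers without the wiki page handy, the procedure in that write-up
amounts to roughly the following, run on a rejected node. This is a
paraphrase rather than a quote, and it wipes local glusterd state, so check
it against the page linked above before running it.)

    systemctl stop glusterd
    # Keep glusterd.info, remove the rest of the local glusterd state:
    find /var/lib/glusterd -mindepth 1 -maxdepth 1 ! -name glusterd.info -exec rm -rf {} +
    systemctl start glusterd
    gluster peer probe 10.0.231.50    # any peer that is in a good state
    systemctl restart glusterd
    gluster peer status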
>>>
>>
>>
>
