[Gluster-devel] [Gluster-users] Duplicate UUID entries in "gluster peer status" command

Atin Mukherjee amukherj at redhat.com
Mon Nov 21 12:50:23 UTC 2016


Abhishek,

Here is the plan of action I suggest:

1. Set up a fresh cluster (say board A and board B) with zero content in
/var/lib/glusterd and glusterd running in debug log mode (glusterd
-LDEBUG).
2. Restart glusterd on board A (again in debug log mode).
3. Check whether you end up with multiple entries in the gluster peer
status output; if so, share both glusterd logs along with the contents of
/var/lib/glusterd from both nodes. A command sketch of these steps follows.
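
A minimal sketch of those steps, assuming glusterd is started directly
(adjust if your boards manage it through an init script):

    # on both boards: stop glusterd and start clean, in debug log mode
    pkill glusterd
    rm -rf /var/lib/glusterd/*
    glusterd -LDEBUG

    # on board A: probe board B and confirm a single peer entry
    gluster peer probe <board-B-address>
    gluster peer status

    # on board A: restart glusterd, again in debug log mode
    pkill glusterd
    glusterd -LDEBUG

    # re-check for duplicate entries
    gluster peer status
    ls -l /var/lib/glusterd/peers/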

~Atin

On Mon, Nov 21, 2016 at 3:40 PM, ABHISHEK PALIWAL <abhishpaliwal at gmail.com>
wrote:

> I have another set of logs for this problem. In this case there is no
> timestamp problem, but we are still getting the same duplicate UUID
> entries on board B:
>
> ls -lart
> total 16
> drwxrwxr-x 12 abhishek abhishek 4096 Oct  3 08:58 ..
> drwxrwxr-x  2 abhishek abhishek 4096 Oct  3 08:58 .
> -rw-rw-r--  1 abhishek abhishek   71 Oct  3 13:36 d8f66ce8-4154-4246-8084-c63b9cfc1af4
> -rw-rw-r--  1 abhishek abhishek   71 Oct  3 13:36 28d37300-425a-4781-b1e0-c13efa2ceee6
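>
> To confirm the two entries really are identical, a quick sketch (the
> peers path here assumes our relocated /system/glusterd working
> directory; on a default install it is /var/lib/glusterd/peers):
>
>     cd /system/glusterd/peers
>     diff d8f66ce8-4154-4246-8084-c63b9cfc1af4 28d37300-425a-4781-b1e0-c13efa2ceee6
>     cat d8f66ce8-4154-4246-8084-c63b9cfc1af4
>     # expected layout of a peer file:
>     # uuid=<peer-uuid>
>     # state=<peer-state-number>
>     # hostname1=<peer-address>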
>
> I provided logs to you previously as well. I am attaching the logs again
> for you to analyze.
>
> I just want to know why this second peer file is getting created when
> there is no entry for it in the logs.
>
> Regards,
> Abhishek
>
>
>
>
> On Mon, Nov 21, 2016 at 2:47 PM, ABHISHEK PALIWAL <abhishpaliwal at gmail.com
> > wrote:
>
>>
>>
>> On Mon, Nov 21, 2016 at 2:28 PM, Atin Mukherjee <amukherj at redhat.com>
>> wrote:
>>
>>>
>>>
>>> On Mon, Nov 21, 2016 at 10:00 AM, ABHISHEK PALIWAL <
>>> abhishpaliwal at gmail.com> wrote:
>>>
>>>> Hi Atin,
>>>>
>>>> This is an embedded system, and these dates are from before the system
>>>> gets its time synchronized.
>>>>
>>>> Yes, I have also seen these two files in the peers directory on the
>>>> 002500 board, and I want to know why Gluster creates the second file
>>>> when the old file already exists, even though the contents of the two
>>>> files are the same.
>>>>
>>>> If we fall into this situation, is it possible for Gluster to take care
>>>> of it automatically instead of us doing the manual steps you mentioned
>>>> above?
>>>>
>>>
>>> We shouldn't have any unwanted data in /var/lib/glusterd in the first
>>> place; that's a prerequisite of a Gluster installation. Failing that,
>>> inconsistencies in the configuration data can't be handled automatically
>>> and need manual intervention.
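>>>
>>> As a sketch, a pre-install sanity check along these lines (assuming the
>>> default working directory; substitute your relocated path if any):
>>>
>>>     pgrep glusterd || echo "glusterd not running"
>>>     ls -la /var/lib/glusterd/   # should be empty or absent
>>>     rm -rf /var/lib/glusterd/*  # clear stale data from any earlier install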
>>>
>>>
>> Does that mean /var/lib/glusterd must always be empty before starting the
>> Gluster installation? Because in this case nothing unwanted was present
>> before glusterd was installed.
>>
>>>
>>>> I have some questions:
>>>>
>>>> 1. Based on the logs, can we find out the reason for having two peer
>>>> files with the same contents?
>>>>
>>>
>>> No, we can't: the log file doesn't have any entry for
>>> 26ae19a6-b58f-446a-b079-411d4ee57450, which indicates that this entry is
>>> a stale one that has been there for a long time, whereas the log files
>>> are recent.
>>>
>>
>> I agree that there is no log entry for
>> 26ae19a6-b58f-446a-b079-411d4ee57450, but as we checked, that file is the
>> newer one in the peers directory and
>> 5be8603b-18d0-4333-8590-38f918a22857 is the older file.
>>
>> Also, below are some more logs from the etc-glusterfs-glusterd.log file
>> on the 002500 board:
>>
>> The message "I [MSGID: 106004] [glusterd-handler.c:5065:__glusterd_peer_rpc_notify]
>> 0-management: Peer <10.32.0.48> (<5be8603b-18d0-4333-8590-38f918a22857>),
>> in state <Peer in Cluster>, has disconnected from glusterd." repeated 3
>> times between [2016-11-17 22:01:23.542556] and [2016-11-17 22:01:36.993584]
>> The message "W [MSGID: 106118] [glusterd-handler.c:5087:__glusterd_peer_rpc_notify]
>> 0-management: Lock not released for c_glusterfs" repeated 3 times between
>> [2016-11-17 22:01:23.542973] and [2016-11-17 22:01:36.993855]
>> [2016-11-17 22:01:48.860555] I [MSGID: 106487]
>> [glusterd-handler.c:1411:__glusterd_handle_cli_list_friends] 0-glusterd:
>> Received cli list req
>> [2016-11-17 22:01:49.137733] I [MSGID: 106163]
>> [glusterd-handshake.c:1193:__glusterd_mgmt_hndsk_versions_ack]
>> 0-management: using the op-version 30706
>> [2016-11-17 22:01:49.240986] I [MSGID: 106493]
>> [glusterd-rpc-ops.c:694:__glusterd_friend_update_cbk] 0-management:
>> Received ACC from uuid: 5be8603b-18d0-4333-8590-38f918a22857
>> [2016-11-17 22:11:58.658884] E [rpc-clnt.c:201:call_bail] 0-management:
>> bailing out frame type(glusterd mgmt) op(--(3)) xid = 0x15 sent =
>> 2016-11-17 22:01:48.945424. timeout = 600 for 10.32.0.48:24007
>> [2016-11-17 22:11:58.658987] E [MSGID: 106153]
>> [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on
>> 10.32.0.48. Please check log file for details.
>> [2016-11-17 22:11:58.659243] I [socket.c:3382:socket_submit_reply]
>> 0-socket.management: not connected (priv->connected = 255)
>> [2016-11-17 22:11:58.659265] E [rpcsvc.c:1314:rpcsvc_submit_generic]
>> 0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc
>> cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
>> [2016-11-17 22:11:58.659305] E [MSGID: 106430]
>> [glusterd-utils.c:400:glusterd_submit_reply] 0-glusterd: Reply
>> submission failed
>> [2016-11-17 22:13:58.674343] E [rpc-clnt.c:201:call_bail] 0-management:
>> bailing out frame type(glusterd mgmt) op(--(3)) xid = 0x11 sent =
>> 2016-11-17 22:03:50.268751. timeout = 600 for 10.32.0.48:24007
>> [2016-11-17 22:13:58.674414] E [MSGID: 106153]
>> [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on
>> 10.32.0.48. Please check log file for details.
>> [2016-11-17 22:13:58.674604] I [socket.c:3382:socket_submit_reply]
>> 0-socket.management: not connected (priv->connected = 255)
>> [2016-11-17 22:13:58.674627] E [rpcsvc.c:1314:rpcsvc_submit_generic]
>> 0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc
>> cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
>> [2016-11-17 22:13:58.674667] E [MSGID: 106430]
>> [glusterd-utils.c:400:glusterd_submit_reply] 0-glusterd: Reply
>> submission failed
>> [2016-11-17 22:15:58.687737] E [rpc-clnt.c:201:call_bail] 0-management:
>> bailing out frame type(glusterd mgmt) op(--(3)) xid = 0x17 sent =
>> 2016-11-17 22:05:51.341614. timeout = 600 for 10.32.0.48:24007
>>
>> Are these logs causing the duplicate UUID, or is the duplicate UUID
>> causing these errors?
>>
>>>
>>> 2. Is there any way to handle this from the Gluster code?
>>>>
>>>
>>> Ditto as above.
>>>
>>>
>>>>
>>>> Regards,
>>>> Abhishek
>>>>
>>>>
>>>> On Mon, Nov 21, 2016 at 9:52 AM, Atin Mukherjee <amukherj at redhat.com>
>>>> wrote:
>>>>
>>>>> atin at dhcp35-96:~/Downloads/gluster_users/abhishek_dup_uuid/duplicate_uuid/glusterd_2500/peers$ ls -lrt
>>>>> total 8
>>>>> -rw-------. 1 atin wheel 71 *Jan  1  1970* 5be8603b-18d0-4333-8590-38f918a22857
>>>>> -rw-------. 1 atin wheel 71 Nov 18 03:31 26ae19a6-b58f-446a-b079-411d4ee57450
>>>>>
>>>>> On board 2500, look at the date of the file
>>>>> 5be8603b-18d0-4333-8590-38f918a22857 (marked in bold). I am not sure
>>>>> how you ended up with such a timestamp on this file. My guess is that
>>>>> the setup was not cleaned properly at the time of re-installation.
>>>>>
>>>>> Here are the steps I'd recommend for now (a command sketch follows):
>>>>>
>>>>> 1. Rename 26ae19a6-b58f-446a-b079-411d4ee57450 to
>>>>> 5be8603b-18d0-4333-8590-38f918a22857, so that you have only one entry
>>>>> in the peers folder on board 2500.
>>>>> 2. Bring down both glusterd instances.
>>>>> 3. Bring them back up one by one.
>>>>>
>>>>> Then restart glusterd and see whether the issue persists.
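>>>>>
>>>>> A sketch of those steps on board 2500, assuming the peers directory
>>>>> lives under /var/lib/glusterd (use your relocated path if it differs):
>>>>>
>>>>>     cd /var/lib/glusterd/peers
>>>>>     mv 26ae19a6-b58f-446a-b079-411d4ee57450 \
>>>>>        5be8603b-18d0-4333-8590-38f918a22857
>>>>>     pkill glusterd        # run on each board to bring glusterd down
>>>>>     glusterd              # start board A first, then board B
>>>>>     gluster peer status   # verify a single peer entry remains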
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Nov 21, 2016 at 9:34 AM, ABHISHEK PALIWAL <
>>>>> abhishpaliwal at gmail.com> wrote:
>>>>>
>>>>>> Hope you will see it in the logs.
>>>>>>
>>>>>> On Mon, Nov 21, 2016 at 9:17 AM, ABHISHEK PALIWAL <
>>>>>> abhishpaliwal at gmail.com> wrote:
>>>>>>
>>>>>>> Hi Atin,
>>>>>>>
>>>>>>> It is not getting wiped off; we have changed the configuration path
>>>>>>> from /var/lib/glusterd to /system/glusterd.
>>>>>>>
>>>>>>> So the contents remain the same as before.
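>>>>>>>
>>>>>>> For reference, a sketch of how that relocation is typically expressed
>>>>>>> in glusterd.vol (assuming that is how it was changed here):
>>>>>>>
>>>>>>>     # /etc/glusterfs/glusterd.vol
>>>>>>>     volume management
>>>>>>>         type mgmt/glusterd
>>>>>>>         option working-directory /system/glusterd
>>>>>>>         # ... remaining options unchanged ...
>>>>>>>     end-volume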
>>>>>>>
>>>>>>> On Mon, Nov 21, 2016 at 9:15 AM, Atin Mukherjee <amukherj at redhat.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> Abhishek,
>>>>>>>>
>>>>>>>> rebooting the board does wipe off the /var/lib/glusterd contents in
>>>>>>>> your setup, right (as per my earlier conversation with you)? In that
>>>>>>>> case, how are you ensuring that the same node gets back its older
>>>>>>>> UUID? If you don't, then this is bound to happen.
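>>>>>>>>
>>>>>>>> For reference, a node's own UUID is kept in glusterd.info under the
>>>>>>>> working directory; a quick check sketch (default path assumed):
>>>>>>>>
>>>>>>>>     cat /var/lib/glusterd/glusterd.info
>>>>>>>>     # UUID=5be8603b-18d0-4333-8590-38f918a22857
>>>>>>>>     # operating-version=30706
>>>>>>>>
>>>>>>>> If this file is lost across a reboot, glusterd generates a fresh
>>>>>>>> UUID on start, and the board can then show up as a new (duplicate)
>>>>>>>> peer on the other node.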
>>>>>>>>
>>>>>>>> On Mon, Nov 21, 2016 at 9:11 AM, ABHISHEK PALIWAL <
>>>>>>>> abhishpaliwal at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Team,
>>>>>>>>>
>>>>>>>>> Please look into this problem, as it is very widely seen in our
>>>>>>>>> system.
>>>>>>>>>
>>>>>>>>> We have a replicate volume setup with two bricks, but after
>>>>>>>>> restarting the second board I am getting a duplicate entry in the
>>>>>>>>> "gluster peer status" output, as below:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> # gluster peer status
>>>>>>>>> Number of Peers: 2
>>>>>>>>>
>>>>>>>>> Hostname: 10.32.0.48
>>>>>>>>> Uuid: 5be8603b-18d0-4333-8590-38f918a22857
>>>>>>>>> State: Peer in Cluster (Connected)
>>>>>>>>>
>>>>>>>>> Hostname: 10.32.0.48
>>>>>>>>> Uuid: 5be8603b-18d0-4333-8590-38f918a22857
>>>>>>>>> State: Peer in Cluster (Connected)
>>>>>>>>> #
>>>>>>>>>
>>>>>>>>> I am attaching all the logs from both boards, as well as the
>>>>>>>>> command outputs.
>>>>>>>>>
>>>>>>>>> Could you please check what causes this situation, as it occurs
>>>>>>>>> very frequently across multiple cases?
>>>>>>>>>
>>>>>>>>> Also, we are not replacing any board in the setup, just rebooting
>>>>>>>>> it.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Abhishek Paliwal
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Gluster-users mailing list
>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> ~ Atin (atinm)
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Regards
>>>>>>> Abhishek Paliwal
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Regards
>>>>>> Abhishek Paliwal
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> ~ Atin (atinm)
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>>
>>>>
>>>>
>>>> Regards
>>>> Abhishek Paliwal
>>>>
>>>
>>>
>>>
>>> --
>>>
>>> ~ Atin (atinm)
>>>
>>
>>
>>
>> --
>>
>>
>>
>>
>> Regards
>> Abhishek Paliwal
>>
>
>
>
>
>


-- 

~ Atin (atinm)