[Gluster-users] cannot add server back to cluster after reinstallation

riccardo.murri at gmail.com riccardo.murri at gmail.com
Wed Mar 27 09:39:19 UTC 2019


Hello,

A couple of days ago, the OS disk of one of the servers in a local GlusterFS
cluster suffered a bad crash, and I had to reinstall everything from
scratch.

However, when I restart the GlusterFS service on the reinstalled
server, it sends back a "RJT" response to the other servers of the
cluster, which then list it as "State: Peer Rejected (Connected)"; the
reinstalled server itself shows "Number of peers: 0". The DEBUG-level
log on the reinstalled machine shows these lines after a peer probe
from another server in the cluster:

    I [MSGID: 106490] [glusterd-handler.c:2540:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318
    D [MSGID: 0] [glusterd-peer-utils.c:208:glusterd_peerinfo_find_by_uuid] 0-management: Friend with uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318, not found
    D [MSGID: 0] [glusterd-peer-utils.c:234:glusterd_peerinfo_find] 0-management: Unable to find peer by uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318
    D [MSGID: 0] [glusterd-peer-utils.c:132:glusterd_peerinfo_find_by_hostname] 0-management: Unable to find friend: glusterfs-server-004
    D [MSGID: 0] [glusterd-peer-utils.c:246:glusterd_peerinfo_find] 0-management: Unable to find hostname: glusterfs-server-004
    I [MSGID: 106493] [glusterd-handler.c:3800:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to glusterfs-server-004 (24007), ret: 0, op_ret: -1
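
To spot such rejections in a longer log, a small filter can help. This
is only a sketch: it assumes the "Responded to ..." line always has the
shape shown above and that a non-zero `op_ret` marks a rejected friend
request; the function name `rejected_friend_responses` is made up for
illustration and is not part of GlusterFS.

```python
import re

# Assumed shape, copied from the excerpt above:
#   "Responded to <host> (<port>), ret: N, op_ret: M"
RESPONSE_RE = re.compile(
    r"Responded to (?P<host>\S+) \((?P<port>\d+)\), "
    r"ret: (?P<ret>-?\d+), op_ret: (?P<op_ret>-?\d+)"
)


def rejected_friend_responses(log_lines):
    """Return (host, op_ret) for every response line with a non-zero op_ret."""
    rejected = []
    for line in log_lines:
        match = RESPONSE_RE.search(line)
        if match and int(match.group("op_ret")) != 0:
            rejected.append((match.group("host"), int(match.group("op_ret"))))
    return rejected


sample = [
    "I [MSGID: 106490] [glusterd-handler.c:2540:__glusterd_handle_incoming_friend_req] "
    "0-glusterd: Received probe from uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318",
    "I [MSGID: 106493] [glusterd-handler.c:3800:glusterd_xfer_friend_add_resp] "
    "0-glusterd: Responded to glusterfs-server-004 (24007), ret: 0, op_ret: -1",
]

print(rejected_friend_responses(sample))  # [('glusterfs-server-004', -1)]
```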

What can I do to re-add the reinstalled server to the cluster?  Is it
safe (i.e., does it keep the data) to "peer detach" it and then "peer
probe" it again?

Additional info:

* The actual GlusterFS brick data was on a different disk and so is safe
  and mounted back in the original location.

* I copied `/etc/glusterfs/glusterd.vol` back from the other servers
  in the cluster and restored the UUID in
  `/var/lib/glusterfs/glusterd.info`.

* I have checked that `max.op-version` is the same on all servers of the
  cluster, including the reinstalled one.

* All servers run Ubuntu 16.04.
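
The UUID check from the list above can be sketched in a few lines. The
sketch assumes that `glusterd.info` is a plain `key=value` file with a
`UUID` key; the helper names and the sample UUID and
`operating-version` values are hypothetical, not taken from the
cluster.

```python
def parse_glusterd_info(text):
    """Parse key=value lines into a dict, skipping blank and comment lines."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        info[key.strip()] = value.strip()
    return info


def uuid_matches(info_text, expected_uuid):
    """True if the UUID in the file equals the UUID the peers have on record."""
    found = parse_glusterd_info(info_text).get("UUID", "")
    return found.lower() == expected_uuid.lower()


# Hypothetical file contents; real values differ per server.
sample_info = "UUID=00000000-0000-0000-0000-000000000001\noperating-version=30712\n"

print(uuid_matches(sample_info, "00000000-0000-0000-0000-000000000001"))  # True
```

The expected UUID would be the one the other servers report for the
reinstalled host in their peer listing.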

Thanks for any suggestion!

Riccardo

