<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Dec 1, 2017 at 1:55 AM, Ziemowit Pierzycki <span dir="ltr">&lt;<a href="mailto:ziemowit@pierzycki.com" target="_blank">ziemowit@pierzycki.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>
<br>
I have a problem joining four Gluster 3.10 nodes to an existing cluster of<br>
Gluster 3.8 nodes.  My understanding is that this should work and not be<br>
too much of a problem.<br>
<br>
Peer probe is successful but the node is rejected:<br>
<br>
gluster&gt; peer detach elkpinfglt07<br>
peer detach: success<br>
gluster&gt; peer probe elkpinfglt07<br>
peer probe: success.<br>
gluster&gt; peer status<br>
Number of Peers: 6<br>
<br>
Hostname: elkpinfglt02<br>
Uuid: 926e9b8a-94ff-4924-b133-<wbr>a30f2dd48054<br>
State: Peer in Cluster (Connected)<br>
<br>
Hostname: elkpinfglt03<br>
Uuid: 34d1a409-acc8-41f6-9b11-<wbr>938317ad3421<br>
State: Peer in Cluster (Connected)<br>
<br>
Hostname: elkpinfglt04<br>
Uuid: 93255842-e190-4e67-ae8b-<wbr>917583917855<br>
State: Peer in Cluster (Connected)<br>
<br>
Hostname: elkpinfglt05<br>
Uuid: 263f8d43-d83e-4465-9de3-<wbr>e6a285072b02<br>
State: Peer in Cluster (Connected)<br>
<br>
Hostname: elkpinfglt06<br>
Uuid: aeaa998a-e8e7-405e-bf21-<wbr>f25de8d82c25<br>
State: Peer in Cluster (Connected)<br>
<br>
Hostname: elkpinfglt07<br>
Uuid: 4baff5cf-6e81-4b2e-b31f-<wbr>be725b2da4b3<br>
State: Peer Rejected (Connected)<br>
<br>
The node I&#39;m probing from complains that it cannot find<br>
information on elkpinfglt07, but the peer is then found anyway, and the checksums<br>
of the data0 volume differ:<br>
<br>
[2017-11-30 20:12:24.278996] I [MSGID: 106487]<br>
[glusterd-handler.c:1241:__<wbr>glusterd_handle_cli_probe] 0-glusterd:<br>
Received CLI probe req elkpinfglt07 24007<br>
[2017-11-30 20:12:24.279999] I [MSGID: 106129]<br>
[glusterd-handler.c:3670:<wbr>glusterd_probe_begin] 0-glusterd: Unable to<br>
find peerinfo for host: elkpinfglt07 (24007)<br>
[2017-11-30 20:12:24.281020] I<br>
[rpc-clnt.c:1046:rpc_clnt_<wbr>connection_init] 0-management: setting<br>
frame-timeout to 600<br>
[2017-11-30 20:12:24.288605] I [MSGID: 106498]<br>
[glusterd-handler.c:3598:<wbr>glusterd_friend_add] 0-management: connect<br>
returned 0<br>
[2017-11-30 20:12:24.301962] I [MSGID: 106511]<br>
[glusterd-rpc-ops.c:252:__<wbr>glusterd_probe_cbk] 0-management: Received<br>
probe resp from uuid: 4baff5cf-6e81-4b2e-b31f-<wbr>be725b2da4b3, host:<br>
elkpinfglt07<br>
[2017-11-30 20:12:24.301989] I [MSGID: 106511]<br>
[glusterd-rpc-ops.c:412:__<wbr>glusterd_probe_cbk] 0-glusterd: Received<br>
resp to probe req<br>
[2017-11-30 20:12:25.425294] I [MSGID: 106493]<br>
[glusterd-rpc-ops.c:476:__<wbr>glusterd_friend_add_cbk] 0-glusterd:<br>
Received ACC from uuid: 4baff5cf-6e81-4b2e-b31f-<wbr>be725b2da4b3, host:<br>
elkpinfglt07, port: 0<br>
[2017-11-30 20:12:25.429679] I [MSGID: 106163]<br>
[glusterd-handshake.c:1271:__<wbr>glusterd_mgmt_hndsk_versions_<wbr>ack]<br>
0-management: using the op-version 30800<br>
[2017-11-30 20:12:25.432426] I [MSGID: 106490]<br>
[glusterd-handler.c:2954:__<wbr>glusterd_handle_probe_query] 0-glusterd:<br>
Received probe from uuid: 4baff5cf-6e81-4b2e-b31f-<wbr>be725b2da4b3<br>
[2017-11-30 20:12:25.432490] I [MSGID: 106493]<br>
[glusterd-handler.c:3017:__<wbr>glusterd_handle_probe_query] 0-glusterd:<br>
Responded to elkpinfglt07, op_ret: 0, op_errno: 0, ret: 0<br>
[2017-11-30 20:12:25.436435] I [MSGID: 106490]<br>
[glusterd-handler.c:2608:__<wbr>glusterd_handle_incoming_<wbr>friend_req]<br>
0-glusterd: Received probe from uuid:<br>
4baff5cf-6e81-4b2e-b31f-<wbr>be725b2da4b3<br>
[2017-11-30 20:12:25.436683] E [MSGID: 106010]<br>
[glusterd-utils.c:2938:<wbr>glusterd_compare_friend_<wbr>volume] 0-management:<br>
Version of Cksums data0 differ. local cksum = 3011020419, remote cksum<br>
= 729330920 on peer elkpinfglt07<br>
[2017-11-30 20:12:25.436716] I [MSGID: 106493]<br>
[glusterd-handler.c:3852:<wbr>glusterd_xfer_friend_add_resp] 0-glusterd:<br>
Responded to elkpinfglt07 (0), ret: 0, op_ret: -1<br>
[2017-11-30 20:12:31.494646] I [MSGID: 106487]<br>
[glusterd-handler.c:1474:__<wbr>glusterd_handle_cli_list_<wbr>friends]<br>
0-glusterd: Received cli list req<br>
[2017-11-30 20:14:06.174548] I [MSGID: 106487]<br>
[glusterd-handler.c:1474:__<wbr>glusterd_handle_cli_list_<wbr>friends]<br>
0-glusterd: Received cli list req<br>
[2017-11-30 20:14:21.518765] I [MSGID: 106487]<br>
[glusterd-handler.c:1474:__<wbr>glusterd_handle_cli_list_<wbr>friends]<br>
0-glusterd: Received cli list req<br>
<br>
On the new node the log shows this:<br>
<br>
[2017-11-30 20:12:25.196229] I [MSGID: 106163]<br>
[glusterd-handshake.c:1316:__<wbr>glusterd_mgmt_hndsk_versions_<wbr>ack]<br>
0-management: using the op-version 30800<br>
[2017-11-30 20:12:25.198228] I [MSGID: 106490]<br>
[glusterd-handler.c:2957:__<wbr>glusterd_handle_probe_query] 0-glusterd:<br>
Received probe from uuid: f614c686-52c9-4d2c-92e2-<wbr>7ea6cdcfba61<br>
[2017-11-30 20:12:25.198447] I [MSGID: 106129]<br>
[glusterd-handler.c:2992:__<wbr>glusterd_handle_probe_query] 0-glusterd:<br>
Unable to find peerinfo for host: elkpinfglt01 (24007)<br>
[2017-11-30 20:12:25.200587] W [MSGID: 106062]<br>
[glusterd-handler.c:3466:<wbr>glusterd_transport_inet_<wbr>options_build]<br>
0-glusterd: Failed to get tcp-user-timeout<br>
[2017-11-30 20:12:25.200649] I<br>
[rpc-clnt.c:1059:rpc_clnt_<wbr>connection_init] 0-management: setting<br>
frame-timeout to 600<br>
[2017-11-30 20:12:25.208147] I [MSGID: 106498]<br>
[glusterd-handler.c:3616:<wbr>glusterd_friend_add] 0-management: connect<br>
returned 0<br>
[2017-11-30 20:12:25.208318] I [MSGID: 106493]<br>
[glusterd-handler.c:3020:__<wbr>glusterd_handle_probe_query] 0-glusterd:<br>
Responded to elkpinfglt01, op_ret: 0, op_errno: 0, ret: 0<br>
[2017-11-30 20:12:25.209824] I [MSGID: 106490]<br>
[glusterd-handler.c:2606:__<wbr>glusterd_handle_incoming_<wbr>friend_req]<br>
0-glusterd: Received probe from uuid:<br>
f614c686-52c9-4d2c-92e2-<wbr>7ea6cdcfba61<br>
[2017-11-30 20:12:25.325953] I<br>
[rpc-clnt.c:1059:rpc_clnt_<wbr>connection_init] 0-nfs: setting<br>
frame-timeout to 600<br>
[2017-11-30 20:12:25.326055] I [MSGID: 106132]<br>
[glusterd-proc-mgmt.c:83:<wbr>glusterd_proc_stop] 0-management: nfs already<br>
stopped<br>
[2017-11-30 20:12:25.326069] I [MSGID: 106568]<br>
[glusterd-svc-mgmt.c:229:<wbr>glusterd_svc_stop] 0-management: nfs service<br>
is stopped<br>
[2017-11-30 20:12:25.327527] I [MSGID: 106132]<br>
[glusterd-proc-mgmt.c:83:<wbr>glusterd_proc_stop] 0-management: glustershd<br>
already stopped<br>
[2017-11-30 20:12:25.327540] I [MSGID: 106568]<br>
[glusterd-svc-mgmt.c:229:<wbr>glusterd_svc_stop] 0-management: glustershd<br>
service is stopped<br>
[2017-11-30 20:12:25.327559] I [MSGID: 106567]<br>
[glusterd-svc-mgmt.c:197:<wbr>glusterd_svc_start] 0-management: Starting<br>
glustershd service<br>
[2017-11-30 20:12:26.329457] I [MSGID: 106132]<br>
[glusterd-proc-mgmt.c:83:<wbr>glusterd_proc_stop] 0-management: quotad<br>
already stopped<br>
[2017-11-30 20:12:26.329558] I [MSGID: 106568]<br>
[glusterd-svc-mgmt.c:229:<wbr>glusterd_svc_stop] 0-management: quotad<br>
service is stopped<br>
[2017-11-30 20:12:26.329850] I [MSGID: 106132]<br>
[glusterd-proc-mgmt.c:83:<wbr>glusterd_proc_stop] 0-management: bitd<br>
already stopped<br>
[2017-11-30 20:12:26.329879] I [MSGID: 106568]<br>
[glusterd-svc-mgmt.c:229:<wbr>glusterd_svc_stop] 0-management: bitd service<br>
is stopped<br>
[2017-11-30 20:12:26.330202] I [MSGID: 106132]<br>
[glusterd-proc-mgmt.c:83:<wbr>glusterd_proc_stop] 0-management: scrub<br>
already stopped<br>
[2017-11-30 20:12:26.330240] I [MSGID: 106568]<br>
[glusterd-svc-mgmt.c:229:<wbr>glusterd_svc_stop] 0-management: scrub<br>
service is stopped<br>
[2017-11-30 20:12:26.331621] I [MSGID: 106493]<br>
[glusterd-handler.c:3866:<wbr>glusterd_xfer_friend_add_resp] 0-glusterd:<br>
Responded to elkpinfglt01 (0), ret: 0, op_ret: 0<br>
[2017-11-30 20:12:26.340265] I [MSGID: 106511]<br>
[glusterd-rpc-ops.c:261:__<wbr>glusterd_probe_cbk] 0-management: Received<br>
probe resp from uuid: f614c686-52c9-4d2c-92e2-<wbr>7ea6cdcfba61, host:<br>
elkpinfglt01<br>
[2017-11-30 20:12:26.340331] I [MSGID: 106511]<br>
[glusterd-rpc-ops.c:421:__<wbr>glusterd_probe_cbk] 0-glusterd: Received<br>
resp to probe req<br>
[2017-11-30 20:12:26.344327] I [MSGID: 106493]<br>
[glusterd-rpc-ops.c:485:__<wbr>glusterd_friend_add_cbk] 0-glusterd:<br>
Received RJT from uuid: f614c686-52c9-4d2c-92e2-<wbr>7ea6cdcfba61, host:<br>
elkpinfglt01, port: 0<br>
<br>
Would the checksums cause the peer to be rejected?<br></blockquote><div><br></div><div>Yes, that&#39;s the cause. It means that the info file of the volume data0 differs between the node elkpinfglt07 and the node from which you executed peer probe. Can you please find out how the /var/lib/glusterd/vols/data0/info file differs between these two nodes?</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
</blockquote></div><br></div></div>
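<div>One way to compare the two info files, as a minimal sketch: it assumes the file is a flat list of key=value lines and that both copies have already been fetched to one machine. The function names parse_info and diff_info are illustrative and not part of Gluster.</div>
<pre>
```python
def parse_info(text):
    """Parse key=value lines into a dict, skipping blank or malformed lines."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        entries[key] = value
    return entries


def diff_info(local_text, remote_text):
    """Return {key: (local_value, remote_value)} for every key that differs.

    A key missing on one side is reported with None for that side.
    """
    local = parse_info(local_text)
    remote = parse_info(remote_text)
    return {
        key: (local.get(key), remote.get(key))
        for key in sorted(set(local) | set(remote))
        if local.get(key) != remote.get(key)
    }
```
</pre>
<div>Reading each copy of /var/lib/glusterd/vols/data0/info into a string and passing both to diff_info lists the keys that are missing or have different values on one side. Those are the entries that would make the checksums differ.</div>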