[Gluster-users] Help, peer probe seems to get stuck on large cluster.

Yiping Peng barius.cn at gmail.com
Tue Sep 1 03:40:38 UTC 2015


Even though I'm seeing disconnected nodes (including nodes that were already
in the pool), my volume is still intact and available. So I'm guessing that
glusterd has little to do with serving the volume/bricks?
Is it safe to kill glusterd on all servers and start this whole peer probing
process all over again?
If I do this, will the currently mounted volumes become unavailable?
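
To see which peers are flapping, the "has disconnected from glusterd" entries
in the log quoted below can be tallied per host. The following is only a
minimal sketch: the names count_disconnects, unquote and SAMPLE are
hypothetical, and the pattern is derived purely from the format of the entries
pasted in this mail, so it may need adjusting for other glusterd versions.

```python
import re
from collections import Counter

# Pattern for the MSGID 106004 entries as they appear in the quoted log.
# \s* tolerates the wrapping added by the mail client ("Peer <" / "server62...").
PEER_DISCONNECT = re.compile(
    r"Peer\s*<\s*(?P<host>[^>\s]+)\s*>\s*"
    r"\(<(?P<uuid>[0-9a-fA-F-]+)>\),\s*in\s+state\s*<[^>]+>,\s*"
    r"has\s+disconnected\s+from\s+glusterd"
)


def unquote(text):
    """Strip leading '>' quote markers and join wrapped lines into one string."""
    lines = [re.sub(r"^(?:>\s?)+", "", line) for line in text.splitlines()]
    return " ".join(lines)


def count_disconnects(text):
    """Return a Counter mapping peer hostname -> number of disconnect entries."""
    flat = unquote(text)
    return Counter(m.group("host") for m in PEER_DISCONNECT.finditer(flat))


# Three entries copied from the quoted log (server75 appears twice).
SAMPLE = """\
> [2015-08-31 09:38:30.419141] I [MSGID: 106004]
> [glusterd-handler.c:5051:__glusterd_peer_rpc_notify] 0-management: Peer <
> server62.yq01.local.net> (<3d354922-4bcd-4469-9e2e-559067882217>), in
> state <Peer in Cluster>, has disconnected from glusterd.
> [2015-08-31 09:38:35.267184] I [MSGID: 106004]
> [glusterd-handler.c:5051:__glusterd_peer_rpc_notify] 0-management: Peer <
> server75.yq01.local.net> (<aeb43c67-1dd3-45e9-abbf-cc0037472724>), in
> state <Peer in Cluster>, has disconnected from glusterd.
> [2015-08-31 09:39:01.129419] I [MSGID: 106004]
> [glusterd-handler.c:5051:__glusterd_peer_rpc_notify] 0-management: Peer <
> server75.yq01.local.net> (<aeb43c67-1dd3-45e9-abbf-cc0037472724>), in
> state <Peer in Cluster>, has disconnected from glusterd.
"""

if __name__ == "__main__":
    for host, hits in count_disconnects(SAMPLE).most_common():
        print(host, hits)
```

On a real node the same function could be fed the contents of
/var/log/glusterfs/etc-glusterfs-glusterd.vol.log instead of SAMPLE; unquoted
log lines pass through unquote unchanged.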


2015-08-31 17:47 GMT+08:00 Yiping Peng <barius.cn at gmail.com>:

> The "Disconnected" state of nodes changes randomly, so I picked a node at
> random and tailed the last several lines
> of /var/log/glusterfs/etc-glusterfs-glusterd.vol.log (is it the right log
> file?).
>
> I can still access the cluster from servers already in the pool; both
> reading and writing work fine.
>
> The log shows a lot of "Failed to set keep-alive: Protocol not available":
>
> Thanks.
>
> [2015-08-31 09:38:25.586073] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:27.193523] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: 8ed2d6cf-9758-4adf-8ed2-2d87f76491cf
> [2015-08-31 09:38:27.209085] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:27.370367] C
> [rpc-clnt-ping.c:161:rpc_clnt_ping_timer_expired] 0-management: server
> 10.88.153.23:24007 has not responded in the last 30 seconds,
> disconnecting.
> [2015-08-31 09:38:28.803311] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: 05885701-9a7c-4d2a-b18a-b5d9de2ccd57
> [2015-08-31 09:38:28.818834] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> The message "I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: f7de5463-080d-4547-9601-0e9541dea928"
> repeated 4 times between [2015-08-31 09:36:30.776194] and [2015-08-31
> 09:38:06.162677]
> The message "I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: 62eb172c-58ac-47c8-931e-05e5ad5a3133"
> repeated 4 times between [2015-08-31 09:36:32.404743] and [2015-08-31
> 09:38:07.779594]
> [2015-08-31 09:38:30.419141] I [MSGID: 106004]
> [glusterd-handler.c:5051:__glusterd_peer_rpc_notify] 0-management: Peer <
> server62.yq01.local.net> (<3d354922-4bcd-4469-9e2e-559067882217>), in
> state <Peer in Cluster>, has disconnected from glusterd.
> [2015-08-31 09:38:30.419188] I [MSGID: 106004]
> [glusterd-handler.c:5051:__glusterd_peer_rpc_notify] 0-management: Peer <
> server52.yq01.local.net> (<6466759d-05eb-406e-9ede-a36dbf26c216>), in
> state <Peer in Cluster>, has disconnected from glusterd.
> [2015-08-31 09:38:30.419299] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: 62eb172c-58ac-47c8-931e-05e5ad5a3133
> [2015-08-31 09:38:30.434835] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:32.035177] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: 4db788d9-d372-4f57-a0f4-ba11d480013d
> [2015-08-31 09:38:33.373803] W [socket.c:923:__socket_keepalive] 0-socket:
> failed to set TCP_USER_TIMEOUT -1000 on socket 69, Protocol not available
> [2015-08-31 09:38:33.373821] E [socket.c:3019:socket_connect]
> 0-management: Failed to set keep-alive: Protocol not available
> [2015-08-31 09:38:33.376719] W [socket.c:923:__socket_keepalive] 0-socket:
> failed to set TCP_USER_TIMEOUT -1000 on socket 70, Protocol not available
> [2015-08-31 09:38:33.376735] E [socket.c:3019:socket_connect]
> 0-management: Failed to set keep-alive: Protocol not available
> [2015-08-31 09:38:32.050834] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:33.651240] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: 9a291ec2-8f75-47fa-b4f4-c3edc02e9ce8
> [2015-08-31 09:38:33.666825] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:35.267184] I [MSGID: 106004]
> [glusterd-handler.c:5051:__glusterd_peer_rpc_notify] 0-management: Peer <
> server75.yq01.local.net> (<aeb43c67-1dd3-45e9-abbf-cc0037472724>), in
> state <Peer in Cluster>, has disconnected from glusterd.
> [2015-08-31 09:38:35.267237] W [socket.c:642:__socket_rwv] 0-nfs: readv on
> /var/run/gluster/7abc6dc0317b0f84408f0bc69917073c.socket failed (Invalid
> argument)
> [2015-08-31 09:38:35.267253] I [MSGID: 106006]
> [glusterd-svc-mgmt.c:319:glusterd_svc_common_rpc_notify] 0-management: nfs
> has disconnected from glusterd.
> [2015-08-31 09:38:35.267352] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: df2686ca-e020-4593-97d8-bd50de4b2775
> [2015-08-31 09:38:35.282829] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:36.877526] E [rpc-clnt.c:362:saved_frames_unwind] (-->
> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1eb)[0x7fb93d7b465b] (-->
> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7fb93d5801b7] (-->
> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fb93d5802ce] (-->
> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xab)[0x7fb93d58039b]
> (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x15f)[0x7fb93d58095f] )))))
> 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called
> at 2015-08-31 09:37:43.506542 (xid=0x1535)
> [2015-08-31 09:38:36.877553] E [MSGID: 106167]
> [glusterd-handshake.c:2078:__glusterd_peer_dump_version_cbk] 0-management:
> Error through RPC layer, retry again later
> [2015-08-31 09:38:36.877643] E [rpc-clnt.c:362:saved_frames_unwind] (-->
> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1eb)[0x7fb93d7b465b] (-->
> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7fb93d5801b7] (-->
> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fb93d5802ce] (-->
> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xab)[0x7fb93d58039b]
> (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x15f)[0x7fb93d58095f] )))))
> 0-management: forced unwinding frame type(GF-DUMP) op(NULL(2)) called at
> 2015-08-31 09:37:43.506554 (xid=0x1536)
> [2015-08-31 09:38:36.877659] W [rpc-clnt-ping.c:204:rpc_clnt_ping_cbk]
> 0-management: socket disconnected
> [2015-08-31 09:38:36.877676] I [MSGID: 106004]
> [glusterd-handler.c:5051:__glusterd_peer_rpc_notify] 0-management: Peer <
> server6.yq01.local.net> (<eb491a24-3edd-494a-90c0-b4280bd6995e>), in
> state <Peer in Cluster>, has disconnected from glusterd.
> [2015-08-31 09:38:36.877823] W
> [glusterd-locks.c:677:glusterd_mgmt_v3_unlock] (-->
> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1eb)[0x7fb93d7b465b] (-->
> /usr/lib64/glusterfs/3.7.3/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x551)[0x7fb93316a111]
> (-->
> /usr/lib64/glusterfs/3.7.3/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x2f0)[0x7fb9330d0300]
> (-->
> /usr/lib64/glusterfs/3.7.3/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x60)[0x7fb9330b3a50]
> (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1a3)[0x7fb93d5809a3] )))))
> 0-management: Lock for vol speech0 not held
> [2015-08-31 09:38:36.877840] W [MSGID: 106118]
> [glusterd-handler.c:5073:__glusterd_peer_rpc_notify] 0-management: Lock not
> released for speech0
> [2015-08-31 09:38:36.877889] I [MSGID: 106004]
> [glusterd-handler.c:5051:__glusterd_peer_rpc_notify] 0-management: Peer <
> server48.yq01.local.net> (<372c820d-003e-4885-870c-547ca17f6770>), in
> state <Peer in Cluster>, has disconnected from glusterd.
> [2015-08-31 09:38:36.878012] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: d903d2f1-458d-43ae-a057-3f4999d3123a
> [2015-08-31 09:38:36.893088] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:37.380052] W [socket.c:923:__socket_keepalive] 0-socket:
> failed to set TCP_USER_TIMEOUT -1000 on socket 12, Protocol not available
> [2015-08-31 09:38:37.380071] E [socket.c:3019:socket_connect]
> 0-management: Failed to set keep-alive: Protocol not available
> [2015-08-31 09:38:38.492491] W [socket.c:642:__socket_rwv]
> 0-socket.management: writev on 10.88.155.28:65379 failed (Broken pipe)
> [2015-08-31 09:38:38.492510] I [socket.c:2409:socket_event_handler]
> 0-transport: disconnecting now
> [2015-08-31 09:38:38.492565] W [socket.c:923:__socket_keepalive] 0-socket:
> failed to set TCP_USER_TIMEOUT 0 on socket 5, Protocol not available
> [2015-08-31 09:38:38.492576] W [socket.c:2673:socket_server_event_handler]
> 0-socket.management: Failed to set keep-alive: Protocol not available
> [2015-08-31 09:38:38.492669] I [MSGID: 106004]
> [glusterd-handler.c:5051:__glusterd_peer_rpc_notify] 0-management: Peer <
> worker09.yq01.local.net> (<c0f4eab2-9cdd-4ba8-a002-259456288fd3>), in
> state <Peer in Cluster>, has disconnected from glusterd.
> [2015-08-31 09:38:38.492715] I [MSGID: 106004]
> [glusterd-handler.c:5051:__glusterd_peer_rpc_notify] 0-management: Peer <
> server53.yq01.local.net> (<b1f15cce-36e4-4ef4-a22f-70bafb0bf8d3>), in
> state <Peer in Cluster>, has disconnected from glusterd.
> [2015-08-31 09:38:38.492786] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: 96aa9f85-f979-42a8-ac0a-1136384fbc14
> [2015-08-31 09:38:38.508078] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:39.383260] W [socket.c:923:__socket_keepalive] 0-socket:
> failed to set TCP_USER_TIMEOUT -1000 on socket 27, Protocol not available
> [2015-08-31 09:38:39.383280] E [socket.c:3019:socket_connect]
> 0-management: Failed to set keep-alive: Protocol not available
> [2015-08-31 09:38:40.108404] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: 72e2074f-921d-45d6-9601-deee653075a9
> [2015-08-31 09:38:40.124073] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:41.386485] W [socket.c:923:__socket_keepalive] 0-socket:
> failed to set TCP_USER_TIMEOUT -1000 on socket 23, Protocol not available
> [2015-08-31 09:38:41.386506] E [socket.c:3019:socket_connect]
> 0-management: Failed to set keep-alive: Protocol not available
> [2015-08-31 09:38:41.389473] W [socket.c:923:__socket_keepalive] 0-socket:
> failed to set TCP_USER_TIMEOUT -1000 on socket 30, Protocol not available
> [2015-08-31 09:38:41.389486] E [socket.c:3019:socket_connect]
> 0-management: Failed to set keep-alive: Protocol not available
> [2015-08-31 09:38:41.733507] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: f1c1b3d9-326d-4730-b1b0-788690da2ce1
> [2015-08-31 09:38:41.749079] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:43.348570] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: 455da276-9ef5-46ab-90f9-457a70432224
> [2015-08-31 09:38:43.364074] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:44.964456] I [MSGID: 106004]
> [glusterd-handler.c:5051:__glusterd_peer_rpc_notify] 0-management: Peer <
> server43.yq01.local.net> (<76cb46d9-5669-47db-b264-68b55d4c37f0>), in
> state <Peer in Cluster>, has disconnected from glusterd.
> [2015-08-31 09:38:44.964578] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: 00d5caae-b647-4dae-8d3e-df1e7f08941f
> [2015-08-31 09:38:44.980073] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:45.392805] W [socket.c:923:__socket_keepalive] 0-socket:
> failed to set TCP_USER_TIMEOUT -1000 on socket 38, Protocol not available
> [2015-08-31 09:38:45.392825] E [socket.c:3019:socket_connect]
> 0-management: Failed to set keep-alive: Protocol not available
> [2015-08-31 09:38:46.393009] C
> [rpc-clnt-ping.c:161:rpc_clnt_ping_timer_expired] 0-management: server
> 10.88.155.15:24007 has not responded in the last 30 seconds,
> disconnecting.
> [2015-08-31 09:38:46.584515] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: e204bc20-9c4f-449c-9dfc-f6e54b96bf8c
> [2015-08-31 09:38:46.600079] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:47.396000] W [socket.c:923:__socket_keepalive] 0-socket:
> failed to set TCP_USER_TIMEOUT -1000 on socket 35, Protocol not available
> [2015-08-31 09:38:47.396019] E [socket.c:3019:socket_connect]
> 0-management: Failed to set keep-alive: Protocol not available
> [2015-08-31 09:38:48.198525] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: 607e3f7a-65e6-423a-9226-5f763f9838e8
> [2015-08-31 09:38:48.214089] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:49.815541] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: e2322b18-2e5f-4c3c-8cc2-84b137fa7328
> [2015-08-31 09:38:49.831078] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:51.434550] E [rpc-clnt.c:362:saved_frames_unwind] (-->
> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1eb)[0x7fb93d7b465b] (-->
> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7fb93d5801b7] (-->
> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fb93d5802ce] (-->
> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xab)[0x7fb93d58039b]
> (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x15f)[0x7fb93d58095f] )))))
> 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called
> at 2015-08-31 09:37:56.464514 (xid=0x1315)
> [2015-08-31 09:38:51.434579] E [MSGID: 106167]
> [glusterd-handshake.c:2078:__glusterd_peer_dump_version_cbk] 0-management:
> Error through RPC layer, retry again later
> [2015-08-31 09:38:51.434669] E [rpc-clnt.c:362:saved_frames_unwind] (-->
> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1eb)[0x7fb93d7b465b] (-->
> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7fb93d5801b7] (-->
> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fb93d5802ce] (-->
> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xab)[0x7fb93d58039b]
> (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x15f)[0x7fb93d58095f] )))))
> 0-management: forced unwinding frame type(GF-DUMP) op(NULL(2)) called at
> 2015-08-31 09:37:56.464526 (xid=0x1316)
> [2015-08-31 09:38:51.434685] W [rpc-clnt-ping.c:204:rpc_clnt_ping_cbk]
> 0-management: socket disconnected
> [2015-08-31 09:38:51.434704] I [MSGID: 106004]
> [glusterd-handler.c:5051:__glusterd_peer_rpc_notify] 0-management: Peer <
> server42.yq01.local.net> (<0b24198f-dfad-4259-bc22-9f3736f53824>), in
> state <Peer in Cluster>, has disconnected from glusterd.
> [2015-08-31 09:38:51.434850] W
> [glusterd-locks.c:677:glusterd_mgmt_v3_unlock] (-->
> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1eb)[0x7fb93d7b465b] (-->
> /usr/lib64/glusterfs/3.7.3/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x551)[0x7fb93316a111]
> (-->
> /usr/lib64/glusterfs/3.7.3/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x2f0)[0x7fb9330d0300]
> (-->
> /usr/lib64/glusterfs/3.7.3/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x60)[0x7fb9330b3a50]
> (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1a3)[0x7fb93d5809a3] )))))
> 0-management: Lock for vol speech0 not held
> [2015-08-31 09:38:51.434867] W [MSGID: 106118]
> [glusterd-handler.c:5073:__glusterd_peer_rpc_notify] 0-management: Lock not
> released for speech0
> [2015-08-31 09:38:51.434994] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: 8ed2d6cf-9758-4adf-8ed2-2d87f76491cf
> [2015-08-31 09:38:51.450075] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:53.049543] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: f7de5463-080d-4547-9601-0e9541dea928
> [2015-08-31 09:38:53.065083] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:54.666534] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: 05885701-9a7c-4d2a-b18a-b5d9de2ccd57
> [2015-08-31 09:38:54.682066] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:57.399884] W [socket.c:923:__socket_keepalive] 0-socket:
> failed to set TCP_USER_TIMEOUT -1000 on socket 45, Protocol not available
> [2015-08-31 09:38:57.399906] E [socket.c:3019:socket_connect]
> 0-management: Failed to set keep-alive: Protocol not available
> [2015-08-31 09:38:57.402816] W [socket.c:923:__socket_keepalive] 0-socket:
> failed to set TCP_USER_TIMEOUT -1000 on socket 69, Protocol not available
> [2015-08-31 09:38:57.402830] E [socket.c:3019:socket_connect]
> 0-management: Failed to set keep-alive: Protocol not available
> [2015-08-31 09:38:56.301076] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:57.897551] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: 9a291ec2-8f75-47fa-b4f4-c3edc02e9ce8
> [2015-08-31 09:38:57.913072] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:38:59.513520] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: df2686ca-e020-4593-97d8-bd50de4b2775
> [2015-08-31 09:38:59.529073] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:39:01.129419] I [MSGID: 106004]
> [glusterd-handler.c:5051:__glusterd_peer_rpc_notify] 0-management: Peer <
> server75.yq01.local.net> (<aeb43c67-1dd3-45e9-abbf-cc0037472724>), in
> state <Peer in Cluster>, has disconnected from glusterd.
> [2015-08-31 09:39:01.129469] W [socket.c:642:__socket_rwv] 0-nfs: readv on
> /var/run/gluster/7abc6dc0317b0f84408f0bc69917073c.socket failed (Invalid
> argument)
> [2015-08-31 09:39:01.129484] I [MSGID: 106006]
> [glusterd-svc-mgmt.c:319:glusterd_svc_common_rpc_notify] 0-management: nfs
> has disconnected from glusterd.
> [2015-08-31 09:39:01.129587] I [MSGID: 106492]
> [glusterd-handler.c:2706:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: d903d2f1-458d-43ae-a057-3f4999d3123a
> [2015-08-31 09:39:01.145074] I [MSGID: 106502]
> [glusterd-handler.c:2751:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2015-08-31 09:39:01.406146] W [socket.c:923:__socket_keepalive] 0-socket:
> failed to set TCP_USER_TIMEOUT -1000 on socket 12, Protocol not available
> [2015-08-31 09:39:01.406168] E [socket.c:3019:socket_connect]
> 0-management: Failed to set keep-alive: Protocol not available
>
>
>
> 2015-08-31 16:54 GMT+08:00 Atin Mukherjee <amukherj at redhat.com>:
>
>>
>>
>> On 08/31/2015 01:10 PM, Yiping Peng wrote:
>> > Hi guys,
>> >
>> >
>> > I've been running GlusterFS for a couple of days and it's been nice and
>> > steady, except for a minor problem: the peer probing on my relatively
>> > large cluster seems to have been stuck for a long time.
>> >
>> >
>> > Last time atinm told me on IRC (I was barius.2333 on IRC) that peer
>> > probing on a cluster as large as 50+ nodes might take a long time
>> > (O(n^2) time), and now my cluster has expanded to 90+ nodes.
>> >
>> >
>> > The peer probing process was started 4 days ago, when my cluster had ~50
>> > nodes. I probed ~40 nodes at once using subprocesses in bash, and the
>> > commands all returned successfully almost immediately (no time-outs).
>> >
>> >
>> > However, glusterd has kept writing to /var/lib/glusterd/peers/ during
>> > the last 4 days, and all commands related to newly added nodes, e.g.
>> > add-brick and mount, time out and fail. Also, running “gluster peer
>> > status” on my nodes shows “Disconnected” nodes that vary over time.
>> Peer status should not show nodes in the disconnected state even if the
>> peer handshaking takes a long time; if it does, then something is wrong.
>> Could you check which node is disconnected and what the glusterd log file
>> on that node indicates?
>> >
>> >
>> > What shall I do in such a situation? Do I need to wait for the whole
>> > peer probing process to complete, or can I simply kill glusterd and
>> > restart it?
>> >
>> >
>> > Regards,
>> >
>> > Yiping Peng
>>
>
>
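
For a rough sense of the O(n^2) estimate quoted above, one can count the
distinct peer pairs in a pool. This is only a back-of-the-envelope sketch: it
assumes the handshake work grows with the number of node pairs, which the
thread does not confirm in detail, and peer_pairs is a hypothetical helper
name.

```python
def peer_pairs(n):
    """Number of distinct node pairs in a pool of n nodes: n * (n - 1) / 2."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    # 50 and 90 are the approximate pool sizes mentioned in the thread.
    for n in (50, 90):
        print(n, "nodes ->", peer_pairs(n), "pairs")
```

Under that assumption, growing from 50 to 90 nodes raises the pair count from
1225 to 4005, i.e. roughly 3.3 times the work for fewer than twice the nodes.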