[Bugs] [Bug 1362170] Disconnection Issues in glusterd

bugzilla at redhat.com bugzilla at redhat.com
Mon Aug 1 13:23:16 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1362170



--- Comment #2 from Atin Mukherjee <amukherj at redhat.com> ---
(In reply to Olia Kremmyda from comment #0)
> Description of problem:
> 
> We use v3.6.9 and we have a deployment with 2 storage nodes with bricks and
> a third one acting as quorum. We have noticed a weird behavior in storage
> nodes with bricks in every deployment:
> 
> glusterd logs are full of the following message:
> 
> [2016-07-15 04:08:01.558743] W [socket.c:620:__socket_rwv] 0-management:
> readv on /var/run/2d164c87498372d99c47b3b627dcd393.socket failed (Invalid
> argument)
> 
> In addition, the following message is reported periodically in glusterd logs:
> 
> The message "I [MSGID: 106006]
> [glusterd-handler.c:4290:__glusterd_nodesvc_rpc_notify] 0-management: nfs
> has disconnected from glusterd." repeated 39 times between [2016-07-15
> 04:06:04.513882] and [2016-07-15 04:08:01.558787]
> 
> We found the following bug fix, which is quite similar:
> 
> http://review.gluster.org/#/c/9269/
> 
> 
> In addition to messages mentioned above, many times we face functionality
> problems and storage is not operational. We have collected the following
> logs from glusterfsd:
> 
> [2016-07-18 07:48:36.800400] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt:
> Volume file changed
> [2016-07-18 07:48:36.974391] I [glusterfsd-mgmt.c:1504:mgmt_getspec_cbk]
> 0-glusterfs: No change in volfile, continuing
> [2016-07-18 07:48:36.982390] I [server.c:518:server_rpc_notify]
> 0-voldata-server: disconnecting connection from
> SN-0-1580-2016/07/18-07:48:34:958658-voldata-client-0-0-0
> [2016-07-18 07:48:36.982422] I [client_t.c:417:gf_client_unref]
> 0-voldata-server: Shutting down connection
> SN-0-1580-2016/07/18-07:48:34:958658-voldata-client-0-0-0
> [2016-07-18 07:48:38.015640] I [login.c:82:gf_auth] 0-auth/login: allowed
> user names: 7a00c0d0-e728-4c9a-92d7-819a10952886
> [2016-07-18 07:48:38.015662] I [server-handshake.c:585:server_setvolume]
> 0-voldata-server: accepted client from
> SN-0-1622-2016/07/18-07:48:37:983796-voldata-client-0-0-0 (version: 3.6.9)
> [2016-07-18 07:48:38.284775] I [server.c:518:server_rpc_notify]
> 0-voldata-server: disconnecting connection from
> SN-1-2297-2016/07/18-07:48:36:82735-voldata-client-0-0-0
> [2016-07-18 07:48:38.284823] I [client_t.c:417:gf_client_unref]
> 0-voldata-server: Shutting down connection
> SN-1-2297-2016/07/18-07:48:36:82735-voldata-client-0-0-0
> [2016-07-18 07:48:39.331257] I [login.c:82:gf_auth] 0-auth/login: allowed
> user names: 7a00c0d0-e728-4c9a-92d7-819a10952886
> [2016-07-18 07:48:39.331282] I [server-handshake.c:585:server_setvolume]
> 0-voldata-server: accepted client from
> SN-1-2350-2016/07/18-07:48:38:966005-voldata-client-0-0-0 (version: 3.6.9)
> [2016-07-18 07:48:41.357084] I [login.c:82:gf_auth] 0-auth/login: allowed
> user names: 7a00c0d0-e728-4c9a-92d7-819a10952886
> [2016-07-18 07:48:41.357109] I [server-handshake.c:585:server_setvolume]
> 0-voldata-server: accepted client from
> SN-2-2765-2016/07/18-07:48:41:308956-voldata-client-0-0-0 (version: 3.6.9)
> [2016-07-18 07:48:42.915110] I [login.c:82:gf_auth] 0-auth/login: allowed
> user names: 7a00c0d0-e728-4c9a-92d7-819a10952886
> [2016-07-18 07:48:42.915136] I [server-handshake.c:585:server_setvolume]
> 0-voldata-server: accepted client from
> SN-0-1894-2016/07/18-07:48:42:894933-voldata-client-0-0-0 (version: 3.6.9)
> [2016-07-18 07:48:53.873241] I [server-handshake.c:585:server_setvolume]
> 0-voldata-server: accepted client from
> MN-0-740-2016/07/18-07:48:53:295469-voldata-client-0-0-0 (version: 3.6.9)
> [2016-07-18 07:48:59.645589] I [server-handshake.c:585:server_setvolume]
> 0-voldata-server: accepted client from
> MN-1-728-2016/07/18-07:48:59:628747-voldata-client-0-0-0 (version: 3.6.9)
> [2016-07-18 07:56:06.939668] I [server.c:518:server_rpc_notify]
> 0-voldata-server: disconnecting connection from
> SN-1-1152-2016/07/18-07:48:09:699332-voldata-client-0-0-0
> [2016-07-18 07:56:06.939692] I [client_t.c:417:gf_client_unref]
> 0-voldata-server: Shutting down connection
> SN-1-1152-2016/07/18-07:48:09:699332-voldata-client-0-0-0
> [2016-07-18 07:56:06.939722] I [server.c:518:server_rpc_notify]
> 0-voldata-server: disconnecting connection from
> SN-2-1134-2016/07/18-07:48:10:12643-voldata-client-0-0-0
> [2016-07-18 07:56:06.939735] I [client_t.c:417:gf_client_unref]
> 0-voldata-server: Shutting down connection
> SN-2-1134-2016/07/18-07:48:10:12643-voldata-client-0-0-0
> [2016-07-18 07:56:06.942400] W [glusterfsd.c:1211:cleanup_and_exit] (--> 0-:
> received signum (15), shutting down

Pranith,

Do you want to check this behavior?

~Atin
> 
> After the last message, glusterfsd restarts and after a few minutes it comes
> on the same situation with above messages printed again. During the short
> period of operation we observed many disconnections and re-connections of
> gluster clients.
> 
> At the same time, glusterd prints the following messages:
> 
> /usr/lib64/glusterfs/glusterfs/3.6.9/xlator/mgmt/glusterd.
> so(glusterd_mgmt_v3_unlock+0x356)[0x7f9df8acc1f5] (-->
> /usr/lib64/glusterfs/glusterfs/3.6.9/xlator/mgmt/glusterd.
> so(__glusterd_peer_rpc_notify+0x289)[0x7f9df8a2da4b] (-->
> /usr/lib64/glusterfs/glusterfs/3.6.9/xlator/mgmt/glusterd.
> so(glusterd_big_locked_notify+0x61)[0x7f9df8a22384] (-->
> /usr/lib64/glusterfs/glusterfs/3.6.9/xlator/mgmt/glusterd.
> so(glusterd_peer_rpc_notify+0x38)[0x7f9df8a2dbea] ))))) 0-management: Lock
> for vol services not held
> The message "I [MSGID: 106006]
> [glusterd-handler.c:4290:__glusterd_nodesvc_rpc_notify] 0-management: nfs
> has disconnected from glusterd." repeated 39 times between [2016-07-18
> 07:58:08.475445] and [2016-07-18 08:00:06.414826]
> 
> and after a while connection with the other storage server and quorum server
> lost.
> 
> Could you please check and provide some help?
> 
> Version-Release number of selected component (if applicable):
> v3.6.9
> 
> How reproducible:
> Often
> 
> Steps to Reproduce:
> 1.
> 2.
> 3.
> 
> Actual results:
> 
> 
> Expected results:
> 
> 
> Additional info:
> 
> FYI we have back-ported the following patches from 3.7 branch:
> http://review.gluster.org/#/c/10829/
> http://review.gluster.org/#/c/13358/
> http://review.gluster.org/#/c/14052/
> http://review.gluster.org/#/c/12310/
> http://review.gluster.org/#/c/13392/

-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.


More information about the Bugs mailing list