[Bugs] [Bug 1386626] fuse mount point not accessible

bugzilla at redhat.com bugzilla at redhat.com
Wed Oct 19 10:49:14 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1386626



--- Comment #1 from Pranith Kumar K <pkarampu at redhat.com> ---


Here are the logs that confirm the race:

Connection starts here:
==============================================================================================================================================================
[2016-10-17 07:07:05.164177] I [MSGID: 114020] [client.c:2356:notify]
4-mdcache-client-1: parent translators are ready, attempting connect on
transport
[2016-10-17 07:07:05.170943] I [rpc-clnt.c:1947:rpc_clnt_reconfig]
4-mdcache-client-1: changing port to 49152 (from 0)
[2016-10-17 07:07:05.176348] I [MSGID: 114057]
[client-handshake.c:1446:select_server_supported_programs] 4-mdcache-client-1:
Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2016-10-17 07:07:05.177828] I [MSGID: 114046]
[client-handshake.c:1222:client_setvolume_cbk] 4-mdcache-client-1: Connected to
mdcache-client-1, attached to remote volume '/bricks/brick0/mdcache'.
[2016-10-17 07:07:05.177849] I [MSGID: 114047]
[client-handshake.c:1233:client_setvolume_cbk] 4-mdcache-client-1: Server and
Client lk-version numbers are not same, reopening the fds
[2016-10-17 07:07:05.178187] I [MSGID: 114035]
[client-handshake.c:201:client_set_lk_version_cbk] 4-mdcache-client-1: Server
lk version = 1
[2016-10-17 07:39:41.137951] W [rpc-clnt.c:717:rpc_clnt_handle_cbk]
4-mdcache-client-1: RPC call decoding failed

The connection appears to go down here:
==============================================================================================================================================================
[2016-10-17 07:39:41.138350] E [MSGID: 114031]
[client-rpc-fops.c:1654:client3_3_entrylk_cbk] 4-mdcache-client-1: remote
operation failed [Transport endpoint is not connected]
[2016-10-17 07:39:41.138564] W [MSGID: 114031]
[client-rpc-fops.c:1102:client3_3_getxattr_cbk] 4-mdcache-client-1: remote
operation failed. Path: <gfid:00000000-0000-0000-0000-000000000001>
(00000000-0000-0000-0000-000000000001). Key: trusted.glusterfs.pathinfo
[Transport endpoint is not connected]
[2016-10-17 07:39:41.138720] W [MSGID: 114031]
[client-rpc-fops.c:2648:client3_3_readdirp_cbk] 4-mdcache-client-1: remote
operation failed [Transport endpoint is not connected]
[2016-10-17 07:39:41.142014] I [socket.c:3391:socket_submit_request]
4-mdcache-client-1: not connected (priv->connected = 0)
.....
We see the logs of the new connection here:
==============================================================================================================================================================
[2016-10-17 07:39:41.145226] I [rpc-clnt.c:1947:rpc_clnt_reconfig]
4-mdcache-client-1: changing port to 49152 (from 0)
[2016-10-17 07:39:41.152533] I [MSGID: 114057]
[client-handshake.c:1446:select_server_supported_programs] 4-mdcache-client-1:
Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2016-10-17 07:39:41.152761] W [MSGID: 114031]
[client-rpc-fops.c:2938:client3_3_lookup_cbk] 4-mdcache-client-1: remote
operation failed. Path: / (00000000-0000-0000-0000-000000000001) [Transport
endpoint is not connected]
[2016-10-17 07:39:41.154742] W [MSGID: 114031]
[client-rpc-fops.c:2648:client3_3_readdirp_cbk] 4-mdcache-client-1: remote
operation failed [Transport endpoint is not connected]
[2016-10-17 07:39:41.155498] I [MSGID: 114046]
[client-handshake.c:1222:client_setvolume_cbk] 4-mdcache-client-1: Connected to
mdcache-client-1, attached to remote volume '/bricks/brick0/mdcache'.
[2016-10-17 07:39:41.155531] I [MSGID: 114047]
[client-handshake.c:1233:client_setvolume_cbk] 4-mdcache-client-1: Server and
Client lk-version numbers are not same, reopening the fds   
[2016-10-17 07:39:41.155546] I [MSGID: 114042]
[client-handshake.c:1053:client_post_handshake] 4-mdcache-client-1: 22 fds open
- Delaying child_up until they are re-opened
[2016-10-17 07:39:41.156306] I [MSGID: 114060]
[client-handshake.c:817:client3_3_reopendir_cbk] 4-mdcache-client-1: reopendir
on <gfid:00000000-0000-0000-0000-000000000001> succeeded (fd = 0)
[2016-10-17 07:39:41.156467] I [MSGID: 114060]
[client-handshake.c:817:client3_3_reopendir_cbk] 4-mdcache-client-1: reopendir
on <gfid:00000000-0000-0000-0000-000000000001> succeeded (fd = 1)
....
[2016-10-17 07:39:41.161125] I [MSGID: 114041]
[client-handshake.c:675:client_child_up_reopen_done] 4-mdcache-client-1: last
fd open'd/lock-self-heal'd - notifying CHILD-UP
[2016-10-17 07:39:41.161445] I [MSGID: 114035]
[client-handshake.c:201:client_set_lk_version_cbk] 4-mdcache-client-1: Server
lk version = 1

We see the disconnect notification for the first connection here:
==============================================================================================================================================================
[2016-10-17 07:39:41.167534] W [MSGID: 114031]
[client-rpc-fops.c:2648:client3_3_readdirp_cbk] 4-mdcache-client-1: remote
operation failed [Transport endpoint is not connected]
The message "W [MSGID: 114031] [client-rpc-fops.c:2648:client3_3_readdirp_cbk]
4-mdcache-client-1: remote operation failed [Transport endpoint is not
connected]" repeated 20 times between [2016-10-17 07:39:41.167534] and
[2016-10-17 07:39:41.345856]
[2016-10-17 07:39:41.352243] I [MSGID: 114018]
[client.c:2280:client_rpc_notify] 4-mdcache-client-1: disconnected from
mdcache-client-1. Client process will keep trying to connect to glusterd until
brick's port is available
[2016-10-17 07:39:41.354562] W [MSGID: 114031]
[client-rpc-fops.c:630:client3_3_unlink_cbk] 4-mdcache-client-1: remote
operation failed [Transport endpoint is not connected]
[2016-10-17 07:39:41.355605] E [MSGID: 114031]
[client-rpc-fops.c:1766:client3_3_xattrop_cbk] 4-mdcache-client-1: remote
operation failed. Path: / (00000000-0000-0000-0000-000000000001)
The message "E [MSGID: 114031] [client-rpc-fops.c:1654:client3_3_entrylk_cbk]
4-mdcache-client-1: remote operation failed [Transport endpoint is not
connected]" repeated 2 times between [2016-10-17 07:39:41.138350] and
[2016-10-17 07:39:41.356994]
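
The ordering that matters can be checked from two of the timestamps quoted
above: the CHILD-UP notification of the new connection (MSGID 114041) and the
disconnect notification (MSGID 114018). The snippet below is only a sketch for
reading these logs; `parse_ts` is a hypothetical helper, not part of GlusterFS.

```python
from datetime import datetime


def parse_ts(line):
    """Parse the leading '[YYYY-MM-DD HH:MM:SS.ffffff]' timestamp of a log line."""
    return datetime.strptime(line[1:line.index("]")], "%Y-%m-%d %H:%M:%S.%f")


# Timestamps copied from the log excerpts above.
child_up = parse_ts("[2016-10-17 07:39:41.161125] I [MSGID: 114041]")
disconnect = parse_ts("[2016-10-17 07:39:41.352243] I [MSGID: 114018]")

# Positive value: the disconnect was logged after CHILD-UP of the new connection.
delay_ms = (disconnect - child_up).total_seconds() * 1000
print(f"disconnect logged {delay_ms:.3f} ms after CHILD-UP")
```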

So even though there is a good connection, the client xlator thinks it is
disconnected, because the disconnect notification for the first connection
arrived after the connect of the newer one.
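
A toy model of this ordering problem, assuming a single connected flag per
client, is sketched below. All names (`ClientState`, `generation`, `replay`)
are hypothetical and do not come from the GlusterFS code. The "guarded"
handler shows one generic way to ignore stale events; it is not necessarily
how this bug was or should be fixed.

```python
class ClientState:
    """Toy model of a client tracking one brick connection (hypothetical)."""

    def __init__(self):
        self.generation = 0    # bumped for every new connection
        self.connected = False

    def new_connection(self):
        self.generation += 1
        return self.generation

    def on_connect(self, gen):
        if gen == self.generation:
            self.connected = True

    def on_disconnect_naive(self, gen):
        # Mirrors the failure: any disconnect marks the client as down.
        self.connected = False

    def on_disconnect_guarded(self, gen):
        # Ignore a disconnect that belongs to an older connection.
        if gen == self.generation:
            self.connected = False


def replay(guarded):
    """Replay the event order seen in the logs and return the final state."""
    state = ClientState()
    old = state.new_connection()
    state.on_connect(old)
    new = state.new_connection()   # reconnect happens first ...
    state.on_connect(new)
    handler = state.on_disconnect_guarded if guarded else state.on_disconnect_naive
    handler(old)                   # ... then the old disconnect is delivered
    return state.connected


print("naive handler, connected:", replay(guarded=False))
print("guarded handler, connected:", replay(guarded=True))
```

With the naive handler the client ends up marked as disconnected although the
newer connection is healthy, which matches the behaviour described above.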

Pranith.
