[Bugs] [Bug 1422781] Transport endpoint not connected error seen on client when glusterd is restarted

bugzilla at redhat.com bugzilla at redhat.com
Mon Mar 6 13:55:59 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1422781



--- Comment #11 from Atin Mukherjee <amukherj at redhat.com> ---
Jeff,

I was able to hit this issue again. Here is what the volume status looks like:

root@15e82395bcbc:/home/glusterfs# gluster v status
Status of volume: test-vol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 172.17.0.2:/tmp/b1                    49152     0          Y       25014
Brick 172.17.0.3:/tmp/b1                    49154     0          Y       1727 
Brick 172.17.0.4:/tmp/b1                    49154     0          Y       20966
Brick 172.17.0.2:/tmp/b2                    49152     0          Y       25014
Brick 172.17.0.3:/tmp/b2                    49154     0          Y       1727 
Brick 172.17.0.4:/tmp/b2                    49154     0          Y       20966
Self-heal Daemon on localhost               N/A       N/A        N       N/A  
Self-heal Daemon on 172.17.0.3              N/A       N/A        Y       1759 
Self-heal Daemon on 172.17.0.4              N/A       N/A        Y       20998

Task Status of Volume test-vol
------------------------------------------------------------------------------
There are no active volume tasks


When I mounted the client on 172.17.0.2, ran touch f{1..100000}, and killed
glusterd on the same node, the mount started throwing "Transport endpoint is
not connected" errors for all the files.
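
For anyone who wants to retry this, the steps above can be sketched roughly as
the script below. It is only a sketch: the mount point is taken from the mount
log, the 5 second delay and the use of pkill to stop glusterd are assumptions
(the exact way glusterd was killed is not stated here), and the helper names
run, create_files and reproduce are made up for illustration. By default it
only prints the commands; set DRY_RUN=0 to execute them as root on a test node.

```shell
#!/bin/sh
# Hypothetical reproduction helper. Defaults to a dry run that prints commands.
SERVER="${SERVER:-172.17.0.2}"                   # node from the volume status
VOLUME="${VOLUME:-test-vol}"
MOUNT_POINT="${MOUNT_POINT:-/mnt/test-vol-mnt}"  # path seen in the mount log
FILE_COUNT="${FILE_COUNT:-100000}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Rough equivalent of `touch f{1..100000}`, one file per touch call so that
# individual failures are visible.
create_files() {
    i=1
    while [ "$i" -le "$FILE_COUNT" ]; do
        touch "$MOUNT_POINT/f$i" || echo "create failed: f$i" >&2
        i=$((i + 1))
    done
}

reproduce() {
    run mkdir -p "$MOUNT_POINT"
    run mount -t glusterfs "$SERVER:/$VOLUME" "$MOUNT_POINT"
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "create_files &  # touch f1..f$FILE_COUNT in $MOUNT_POINT"
    else
        create_files &
    fi
    run sleep 5             # assumption: arbitrary delay so creates are in flight
    run pkill -x glusterd   # assumption: stop only the management daemon
    [ "$DRY_RUN" = "1" ] || wait
}

reproduce
```

Running the file creation in the background is a deliberate choice here so that
glusterd is killed while CREATE requests are still outstanding.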

Mount log shows the following:

[2017-03-06 13:49:36.437709] I [fuse-bridge.c:5802:fini] 0-fuse: Unmounting
'/mnt/test-vol-mnt'.
[2017-03-06 13:49:36.442986] W [socket.c:593:__socket_rwv] 0-test-vol-client-3:
readv on 172.17.0.2:49154 failed (Connection reset by peer)
[2017-03-06 13:49:36.443247] E [rpc-clnt.c:365:saved_frames_unwind]
(--> /usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x12a)[0x7faa629c4a0a]
(--> /usr/local/lib/libgfrpc.so.0(saved_frames_unwind+0x1b5)[0x7faa6278bd05]
(--> /usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7faa6278bdfe]
(--> /usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x89)[0x7faa6278d379]
(--> /usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x84)[0x7faa6278dc04] )))))
0-test-vol-client-3: forced unwinding frame type(GlusterFS 3.3) op(CREATE(23))
called at 2017-03-06 13:49:36.440699 (xid=0x268a)
[2017-03-06 13:49:36.443264] W [MSGID: 114031]
[client-rpc-fops.c:2332:client3_3_create_cbk] 0-test-vol-client-3: remote
operation failed. Path: /f1244 [Transport endpoint is not connected]
[2017-03-06 13:49:36.446152] I [socket.c:3476:socket_submit_request]
0-test-vol-client-3: not connected (priv->connected = 0)
[2017-03-06 13:49:36.446170] W [rpc-clnt.c:1656:rpc_clnt_submit]
0-test-vol-client-3: failed to submit rpc-request (XID: 0x268b Program:
GlusterFS 3.3, ProgVers: 330, Proc: 33) to rpc-transport (test-vol-client-3)
[2017-03-06 13:49:36.446183] E [MSGID: 114031]
[client-rpc-fops.c:1758:client3_3_xattrop_cbk] 0-test-vol-client-3: remote
operation failed. Path: / (00000000-0000-0000-0000-000000000001)
[2017-03-06 13:49:36.446249] I [timer.c:198:gf_timer_registry_init]
(-->/usr/local/lib/libgfrpc.so.0(rpc_transport_notify+0x23) [0x7faa6278a3b3]
-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x2b0) [0x7faa6278de30]
-->/usr/local/lib/libglusterfs.so.0(gf_timer_call_after+0x265) [0x7faa629cf3b5]
) 0-timer: ctx cleanup started
[2017-03-06 13:49:36.446283] E [timer.c:44:gf_timer_call_after]
(-->/usr/local/lib/libgfrpc.so.0(rpc_transport_notify+0x23) [0x7faa6278a3b3]
-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x2b0) [0x7faa6278de30]
-->/usr/local/lib/libglusterfs.so.0(gf_timer_call_after+0x2a9) [0x7faa629cf3f9]
) 0-timer: !reg
[2017-03-06 13:49:36.446289] W [rpc-clnt.c:893:rpc_clnt_handle_disconnect]
0-test-vol-client-3: Cannot create rpc_clnt_reconnect timer
[2017-03-06 13:49:36.446296] I [MSGID: 114018]
[client.c:2276:client_rpc_notify] 0-test-vol-client-3: disconnected from
test-vol-client-3. Client process will keep trying to connect to glusterd until
brick's port is available
[2017-03-06 13:49:36.446575] E [MSGID: 114031]
[client-rpc-fops.c:1646:client3_3_entrylk_cbk] 0-test-vol-client-3: remote
operation failed [Transport endpoint is not connected]
[2017-03-06 13:49:36.446593] E [MSGID: 108007]
[afr-lk-common.c:825:afr_unlock_entrylk_cbk] 0-test-vol-replicate-1: /f1244:
unlock failed on test-vol-client-3 [Transport endpoint is not connected]
[2017-03-06 13:49:36.457023] W [glusterfsd.c:1329:cleanup_and_exit]
(-->/lib64/libpthread.so.0(+0x75ba) [0x7faa6181d5ba]
-->/usr/local/sbin/glusterfs(glusterfs_sigwaiter+0xc5) [0x408615]
-->/usr/local/sbin/glusterfs(cleanup_and_exit+0x4b) [0x4084ab] )
0-: received signum (15), shutting down
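
To see at a glance which translators reported the error, the mount log can be
summarized with something like the helper below. This is a minimal sketch, not
part of any GlusterFS tooling: count_enotconn is a made-up name, the log file
path has to be passed in, and it assumes each log entry occupies a single line
in the file on disk (the wrapping above comes from the mail) and that grep
supports -o.

```shell
#!/bin/sh
# Hypothetical helper: count "Transport endpoint is not connected" messages
# per translator name (the "0-<name>:" field) in a GlusterFS client log.
count_enotconn() {
    log_file="$1"
    grep -F 'Transport endpoint is not connected' "$log_file" |
        grep -oE '\] 0-[^ :]+:' |          # keep only the "] 0-<name>:" field
        sed -e 's/^\] 0-//' -e 's/:$//' |  # strip the surrounding markers
        sort | uniq -c
}

# Run only when a log file path is given on the command line.
[ "$#" -eq 0 ] || count_enotconn "$1"
```

For the excerpt above this would list test-vol-client-3 and
test-vol-replicate-1, each with the number of matching entries.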
