[Gluster-users] df hang while a brick server down (rdma transport)

lierihanmei lierihanmei at 163.com
Wed Nov 6 16:33:07 UTC 2013


Hi all,
We are running glusterfs on a cluster of servers connected over InfiniBand.
With the rdma transport, if one of these servers is down, "mount" succeeds but commands such as "df" hang.


These are the steps to reproduce:
 - create a volume of 3 bricks with rdma transport, each brick on a different server
 - start the volume
 - take one brick server down
 - mount the volume; "df -h" then hangs
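
The hang in the last step can be observed without blocking an interactive shell by running the command under a timeout. The following is only a minimal sketch, assuming Python 3 on the client; the helper name `probe` and the mount point `/mnt/hash-01` are hypothetical and not taken from the actual setup.

```python
import subprocess


def probe(cmd, timeout_s=10.0):
    """Run cmd and classify the outcome.

    Returns "ok" (exit status 0), "failed" (non-zero exit status),
    or "hung" (the command did not finish within timeout_s seconds).
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child when the timeout expires.
        return "hung"
    return "ok" if result.returncode == 0 else "failed"


# Hypothetical usage against the mounted volume:
#   probe(["df", "-h", "/mnt/hash-01"])
```

With one brick server down, a result of "hung" would correspond to the behaviour described above. One caveat: if the hung process cannot be killed (for example, it is stuck in uninterruptible sleep), the probe itself may also block.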


We have tested glusterfs 3.2.5 and 3.2.7; both have this problem.


Thanks for any help.
Let me know if there's anything else I can provide!


Here is an excerpt from the glusterfs log:
......
[2013-10-22 11:07:41.335497] I [client-handshake.c:1090:select_server_supported_programs] 0-hash-01-client-0: Using Program GlusterFS 3.0.0, Num (1298437), Version (310)
[2013-10-22 11:07:41.335618] I [client-handshake.c:1090:select_server_supported_programs] 0-hash-01-client-2: Using Program GlusterFS 3.0.0, Num (1298437), Version (310)
[2013-10-22 11:07:41.335995] I [client-handshake.c:913:client_setvolume_cbk] 0-hash-01-client-0: Connected to 192.168.20.107:24013, attached to remote volume '/data/brick1'.
[2013-10-22 11:07:41.336119] I [client-handshake.c:913:client_setvolume_cbk] 0-hash-01-client-2: Connected to 192.168.20.108:24014, attached to remote volume '/data/brick1'.
[2013-10-22 11:07:41.591835] E [rdma.c:4417:tcp_connect_finish] 0-hash-01-client-1: tcp connect to  failed (No route to host)
[2013-10-22 11:07:41.591950] E [rdma.c:4417:tcp_connect_finish] 0-hash-01-client-1: tcp connect to  failed (No route to host)
[2013-10-22 11:07:41.591996] E [rdma.c:4417:tcp_connect_finish] 0-hash-01-client-1: tcp connect to  failed (No route to host)
[2013-10-22 11:07:44.592810] E [rdma.c:4417:tcp_connect_finish] 0-hash-01-client-1: tcp connect to  failed (No route to host)
[2013-10-22 11:07:44.592917] E [rdma.c:4417:tcp_connect_finish] 0-hash-01-client-1: tcp connect to  failed (No route to host)
[2013-10-22 11:07:44.592963] E [rdma.c:4417:tcp_connect_finish] 0-hash-01-client-1: tcp connect to  failed (No route to host)
(loop)
......
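
In the excerpt, client-0 and client-2 report "Connected to ...", while client-1 repeats "tcp connect to  failed (No route to host)" about every 3 seconds. For a longer log, the same summary could be produced with a small script. This is only a sketch: the regular expressions are written against the lines shown above and are an assumption about the log format in general, and the function name `summarize` is hypothetical.

```python
import re
from collections import Counter

# Patterns modelled on the log lines in this post (format assumed, not guaranteed).
CONNECTED = re.compile(
    r"\] I \[client-handshake\.c:\d+:client_setvolume_cbk\] "
    r"\d+-(\S+): Connected to (\S+),"
)
FAILED = re.compile(
    r"\] E \[rdma\.c:\d+:tcp_connect_finish\] "
    r"\d+-(\S+): tcp connect to .*failed \((.+)\)"
)


def summarize(lines):
    """Return (connected, failures).

    connected maps client name -> "host:port" of the brick it reached.
    failures counts (client name, error text) pairs for failed connects.
    """
    connected = {}
    failures = Counter()
    for line in lines:
        match = CONNECTED.search(line)
        if match:
            connected[match.group(1)] = match.group(2)
            continue
        match = FAILED.search(line)
        if match:
            failures[(match.group(1), match.group(2))] += 1
    return connected, failures
```

Calling `summarize(open(path))` on a client log file (the path depends on the installation) would list which brick clients connected and which keep failing, and how often.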

