[Gluster-users] glusterfs client waiting on SYN_SENT to connect...

Liam Slusser lslusser at gmail.com
Sat Dec 4 00:25:18 UTC 2010


Hey all,

I've run into a weird problem.  I have a few client boxes that
occasionally crash due to a non-Gluster-related problem.  But once a
box comes back up, I cannot get the Gluster client to reconnect to the
bricks.

Setup: CentOS 5 (64-bit) and Gluster 2.0.9

df shows:

df: `/mnt/mymount': Transport endpoint is not connected

[root@client~]# netstat -pan|grep glus

tcp        0      1 10.8.10.107:1000            10.8.11.102:6996            SYN_SENT    3385/glusterfs
tcp        0      1 10.8.10.107:1001            10.8.11.102:6996            SYN_SENT    3385/glusterfs
tcp        0      1 10.8.10.107:998             10.8.11.102:6996            SYN_SENT    3385/glusterfs
tcp        0      1 10.8.10.107:996             10.8.11.102:6996            SYN_SENT    3385/glusterfs
tcp        0      1 10.8.10.107:1003            10.8.11.101:6996            SYN_SENT    3385/glusterfs
tcp        0      1 10.8.10.107:1002            10.8.11.101:6996            SYN_SENT    3385/glusterfs
tcp        0      1 10.8.10.107:997             10.8.11.101:6996            SYN_SENT    3385/glusterfs
tcp        0      1 10.8.10.107:999             10.8.11.101:6996            SYN_SENT    3385/glusterfs

From the Gluster client log:

+------------------------------------------------------------------------------+
[2010-12-03 15:48:28] W [glusterfsd.c:526:_log_if_option_is_invalid] readahead: option 'page-size' is not recognized
[2010-12-03 15:48:28] N [glusterfsd.c:1306:main] glusterfs: Successfully started
[2010-12-03 15:48:29] W [fuse-bridge.c:1892:fuse_statfs_cbk] glusterfs-fuse: 2: ERR => -1 (Transport endpoint is not connected)
[2010-12-03 15:48:30] W [fuse-bridge.c:1892:fuse_statfs_cbk] glusterfs-fuse: 3: ERR => -1 (Transport endpoint is not connected)
[2010-12-03 15:48:31] W [fuse-bridge.c:1892:fuse_statfs_cbk] glusterfs-fuse: 4: ERR => -1 (Transport endpoint is not connected)
[2010-12-03 15:48:31] W [fuse-bridge.c:1892:fuse_statfs_cbk] glusterfs-fuse: 5: ERR => -1 (Transport endpoint is not connected)
[2010-12-03 15:48:32] W [fuse-bridge.c:1892:fuse_statfs_cbk] glusterfs-fuse: 6: ERR => -1 (Transport endpoint is not connected)
[2010-12-03 15:51:37] E [socket.c:745:socket_connect_finish] brick1a: connection to  failed (Connection timed out)
[2010-12-03 15:51:37] E [socket.c:745:socket_connect_finish] brick1a: connection to  failed (Connection timed out)
[2010-12-03 15:51:37] E [socket.c:745:socket_connect_finish] brick2a: connection to  failed (Connection timed out)
[2010-12-03 15:51:37] E [socket.c:745:socket_connect_finish] brick2a: connection to  failed (Connection timed out)
[2010-12-03 15:51:37] E [socket.c:745:socket_connect_finish] brick1b: connection to  failed (Connection timed out)
[2010-12-03 15:51:37] E [socket.c:745:socket_connect_finish] brick1b: connection to  failed (Connection timed out)
[2010-12-03 15:51:37] E [socket.c:745:socket_connect_finish] brick2b: connection to  failed (Connection timed out)
[2010-12-03 15:51:37] E [socket.c:745:socket_connect_finish] brick2b: connection to  failed (Connection timed out)
[2010-12-03 15:59:46] W [fuse-bridge.c:1892:fuse_statfs_cbk] glusterfs-fuse: 7: ERR => -1 (Transport endpoint is not connected)
[2010-12-03 15:59:47] W [fuse-bridge.c:1892:fuse_statfs_cbk] glusterfs-fuse: 8: ERR => -1 (Transport endpoint is not connected)
[2010-12-03 15:59:54] W [fuse-bridge.c:1892:fuse_statfs_cbk] glusterfs-fuse: 9: ERR => -1 (Transport endpoint is not connected)
[2010-12-03 15:59:55] W [fuse-bridge.c:1892:fuse_statfs_cbk] glusterfs-fuse: 10: ERR => -1 (Transport endpoint is not connected)
[2010-12-03 15:59:55] W [fuse-bridge.c:1892:fuse_statfs_cbk] glusterfs-fuse: 11: ERR => -1 (Transport endpoint is not connected)
[2010-12-03 15:59:55] W [fuse-bridge.c:1892:fuse_statfs_cbk] glusterfs-fuse: 12: ERR => -1 (Transport endpoint is not connected)
[2010-12-03 15:59:56] W [fuse-bridge.c:1892:fuse_statfs_cbk] glusterfs-fuse: 13: ERR => -1 (Transport endpoint is not connected)

However, the port is obviously open...

[root@client~]# telnet 10.8.11.102 6996
Trying 10.8.11.102...
Connected to glusterserverb (10.8.11.102).
Escape character is '^]'.
^]
telnet> close
Connection closed.

The Gluster server log doesn't show ANY connection attempts from the
client; however, it DOES show my telnet TCP connections.  I'm using IP
addresses in all my configuration files - no names.  I do have a
Juniper firewall between the client and the servers that is doing
stateful firewalling, and I've set it up so the connections never time
out - and I've never had a problem once the client finally connects.
So I can create a new connection with telnet, but the Gluster client
cannot...
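
One visible difference between the two cases: in the netstat output the
Gluster client connects from low source ports (996-1003), while telnet
would normally pick an ephemeral source port.  Whether that difference
matters to the firewall is only an assumption to test, not a diagnosis.
A minimal Python sketch for trying a connection from a chosen source
port follows; the function name try_connect and its parameters are
hypothetical, and binding to a port below 1024 needs root.

```python
import errno
import socket


def try_connect(host, port, source_port=0, timeout=5.0):
    """Attempt one TCP connect, optionally from a fixed source port.

    Returns "connected", "timeout", "refused", or "error: <reason>".
    source_port=0 lets the kernel pick an ephemeral port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # Allow rebinding a source port left in TIME_WAIT by an earlier run.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        # Bind the local end first so the SYN leaves from source_port.
        sock.bind(("", source_port))
        sock.connect((host, port))
        return "connected"
    except (socket.timeout, TimeoutError):
        # No SYN-ACK arrived in time (the SYN_SENT situation above).
        return "timeout"
    except ConnectionRefusedError:
        return "refused"
    except OSError as exc:
        # For example EACCES when binding a low port without root.
        return "error: " + errno.errorcode.get(exc.errno, str(exc))
    finally:
        sock.close()
```

Run as root on the client, try_connect("10.8.11.102", 6996,
source_port=1000) can be compared against the same call with
source_port=0.  If only the low-port attempt times out, that points
at something on the path treating those source ports differently.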

Anybody seen anything like this before?  Ideas?

thanks,
liam


