[Gluster-devel] GlusterFS hangs/fails: Transport endpoint is not connected

Basavanagowda Kanur gowda at zresearch.com
Tue Nov 25 12:17:46 UTC 2008


Fred,
  Can you also provide us with the server logs?

--
gowda


On Tue, Nov 25, 2008 at 4:57 PM, Fred Hucht <fred at thp.uni-due.de> wrote:

> Hi devels!
>
> We are considering GlusterFS as a parallel file server (8 server nodes) for
> our parallel Opteron cluster (88 nodes, ~500 cores), as well as for a
> unified nufa /scratch distributed over all nodes. We use the cluster in a
> scientific environment (theoretical physics) and run Scientific Linux with
> kernel 2.6.25.16. After similar problems with 1.3.x we installed 1.4.0qa61
> and set up a /scratch for testing using the following script
> "glusterconf.sh", which runs locally on all nodes at startup and writes the
> two config files /usr/local/etc/glusterfs-{server,client}.vol:
>
> ---------------------------------- 8< snip >8 ----------------------------------
> #!/bin/sh
>
> HOST=$(hostname -s)
>
> if [ $HOST = master ];then
>    MASTER_IP=127.0.0.1
>    HOST_IP=127.0.0.1
>    HOST_N=0
> else
>    MASTER_IP=192.168.1.254
>    HOST_IP=$(hostname -i)
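>    # node number = last octet of the node's IP address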
>    HOST_N=${HOST_IP##*.}
> fi
>
> LOCAL=sc$HOST_N
>
> ###################################################################
> # write /usr/local/etc/glusterfs-server.vol
> {
>
> cat <<EOF
> ###
> ### Server config automatically created by $PWD/$0
> ###
>
> EOF
>
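> # only the master additionally exports the namespace volume scns for unify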
> if [ $HOST = master ];then
>    SERVERVOLUMES="scns"
>    cat <<EOF
> volume scns
>  type storage/posix
>  option directory /export/scratch_ns
> end-volume
>
> EOF
> else # if master
>    SERVERVOLUMES=""
> fi   # if master
>
> SERVERVOLUMES="$SERVERVOLUMES $LOCAL"
> cat <<EOF
> volume $LOCAL-posix
>  type storage/posix
>  option directory /export/scratch
> end-volume
>
> volume $LOCAL-locks
>  type features/posix-locks
>  subvolumes $LOCAL-posix
> end-volume
>
> volume $LOCAL-ioth
>  type performance/io-threads
>  option thread-count 4
>  subvolumes $LOCAL-locks
> end-volume
>
> volume $LOCAL
>  type performance/read-ahead
>  subvolumes $LOCAL-ioth
> end-volume
>
> volume server
>  type protocol/server
>  option transport-type tcp/server
>  subvolumes $SERVERVOLUMES
> EOF
>
> for vol in $SERVERVOLUMES;do
>    cat <<EOF
>  option auth.addr.$vol.allow 127.0.0.1,192.168.1.*
> EOF
> done
>
> cat <<EOF
> end-volume
>
> EOF
>
> } > /usr/local/etc/glusterfs-server.vol
>
> ###################################################################
> # write /usr/local/etc/glusterfs-client.vol
> {
> cat <<EOF
> ###
> ### Client config automatically created by $PWD/$0
> ###
>
> volume scns
>  type protocol/client
>  option transport-type tcp/client
>  option remote-host $MASTER_IP
>  option remote-subvolume scns
> end-volume
>
> volume sc0
>  type protocol/client
>  option transport-type tcp/client
>  option remote-host $MASTER_IP
>  option remote-subvolume sc0
> end-volume
>
> EOF
>
> UNIFY="sc0"
>
> # leave out node66 at the moment...
>
> for n in $(seq 65) $(seq 67 87);do
>    VOL=sc$n
>    UNIFY="$UNIFY $VOL"
>        cat <<EOF
> volume $VOL
>  type protocol/client
>  option transport-type tcp/client
>  option remote-host 192.168.1.$n
>  option remote-subvolume $VOL
> end-volume
>
> EOF
> done
>
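> # unify all per-node bricks into one volume (nufa prefers the local brick)
> # and stack the performance translators on top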
> cat <<EOF
> volume scratch
>  type cluster/unify
>  subvolumes $UNIFY
>  option namespace scns
>  option scheduler nufa
>  option nufa.limits.min-free-disk 15
>  option nufa.refresh-interval 10
>  option nufa.local-volume-name $LOCAL
> end-volume
>
> volume scratch-io-threads
>  type performance/io-threads
>  option thread-count 4
>  subvolumes scratch
> end-volume
>
> volume scratch-write-behind
>  type performance/write-behind
>  option aggregate-size 128kB
>  option flush-behind off
>  subvolumes scratch-io-threads
> end-volume
>
> volume scratch-read-ahead
>  type performance/read-ahead
>  option page-size 128kB # unit in bytes
>  option page-count 2    # cache per file  = (page-count x page-size)
>  subvolumes scratch-write-behind
> end-volume
>
> volume scratch-io-cache
>  type performance/io-cache
>  option cache-size 64MB
>  option page-size 512kB
>  subvolumes scratch-read-ahead
> end-volume
>
> EOF
>
> } > /usr/local/etc/glusterfs-client.vol
> ---------------------------------- 8< snip >8 ----------------------------------
>
> The cluster uses MPI over InfiniBand, while GlusterFS runs over TCP/IP
> Gigabit Ethernet. I use FUSE 2.7.4 with the patch fuse-2.7.3glfs10.diff (is
> that OK? The patch applied cleanly).
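>
> For the record, the patch was applied to the FUSE 2.7.4 tree roughly like
> this (a sketch from memory; the -p level and the default install prefix are
> assumptions, not a verified recipe):
>
> # unpack FUSE 2.7.4 and apply the GlusterFS patch
> tar xzf fuse-2.7.4.tar.gz
> cd fuse-2.7.4
> patch -p1 < ../fuse-2.7.3glfs10.diff
> # build and install the patched library and kernel module
> ./configure && make && make install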
>
> Everything works fine until some of the nodes used by a job block on
> access to /scratch or, some time later, report
>
> df: `/scratch': Transport endpoint is not connected
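>
> (Such a node can be brought back by unmounting the dead FUSE mount and
> restarting the client; a sketch, assuming the standard FUSE tools and the
> usual client invocation:
>
> # detach the dead mount; umount -l /scratch also works
> fusermount -u /scratch
> # restart the client against the generated spec file
> glusterfs -f /usr/local/etc/glusterfs-client.vol /scratch
>
> but of course that does not address the cause.)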
>
> The glusterfs.log on node36 is flooded by
>
> 2008-11-25 07:30:35 E [client-protocol.c:243:call_bail] sc70: activating
> bail-out. pending frames = 3. last sent = 2008-11-25 07:29:52. last received
> = 2008-11-25 07:29:49. transport-timeout = 42
> 2008-11-25 07:30:35 C [client-protocol.c:250:call_bail] sc70: bailing
> transport
> ...(~100MB)
>
> (about two lines per node every 10 seconds). Furthermore, at the end of
> glusterfs.log I find:
>
> grep -v call_bail glusterfs.log
> ...
> 2008-11-25 10:00:46 E [socket.c:1187:socket_submit] sc0: transport not
> connected to submit (priv->connected = 255)
> ...
> 2008-11-25 10:00:46 E [socket.c:1187:socket_submit] sc87: transport not
> connected to submit (priv->connected = 255)
> 2008-11-25 10:00:46 E [socket.c:1187:socket_submit] scns: transport not
> connected to submit (priv->connected = 255)
> 2008-11-25 10:05:03 E [fuse-bridge.c:1886:fuse_statfs_cbk] glusterfs-fuse:
> 1353: ERR => -1 (Transport endpoint is not connected)
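>
> The bail-outs fire after the transport-timeout of 42 seconds shown in the
> log. In case the servers are merely slow to answer under load rather than
> really gone, a larger timeout in each protocol/client volume might separate
> the two cases; a sketch for one brick, assuming transport-timeout is
> accepted as a client option (the log message suggests it is):
>
> volume sc70
>  type protocol/client
>  option transport-type tcp/client
>  option remote-host 192.168.1.70
>  option remote-subvolume sc70
>  option transport-timeout 120 # raise bail-out from the default 42 seconds
> end-volume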
>
> On node68 I find
>
> 2008-11-24 23:20:12 W [client-protocol.c:93:this_ino_set] sc0: inode
> number(201326854) changed for inode(0x6130d0)
> 2008-11-24 23:20:12 W [client-protocol.c:93:this_ino_set] scns: inode
> number(37749030) changed for inode(0x6130d0)
> 2008-11-24 23:20:58 E [client-protocol.c:243:call_bail] scns: activating
> bail-out. pending frames = 3. last sent = 2008-11-24 23:20:12. last received
> = 2008-11-24 23:20:12. transport-timeout = 42
> 2008-11-24 23:20:58 C [client-protocol.c:250:call_bail] scns: bailing
> transport
> 2008-11-24 23:20:58 E [client-protocol.c:243:call_bail] sc0: activating
> bail-out. pending frames = 3. last sent = 2008-11-24 23:20:12. last received
> = 2008-11-24 23:20:12. transport-timeout = 42
> 2008-11-24 23:20:58 C [client-protocol.c:250:call_bail] sc0: bailing
> transport
> ...(~100MB)
>
> but only for scns and sc0, and then
>
> 2008-11-25 10:01:31 E [client-protocol.c:243:call_bail] sc1: activating
> bail-out. pending frames = 1. last sent = 2008-11-25 10:00:46. last received
> = 2008-11-24 23:20:12. transport-timeout = 42
> 2008-11-25 10:01:31 C [client-protocol.c:250:call_bail] sc1: bailing
> transport
> ...(~100MB)
>
> for all nodes, as well as
>
> 2008-11-25 10:00:46 E [socket.c:1187:socket_submit] sc0: transport not
> connected to submit (priv->connected = 255)
> 2008-11-25 10:00:46 E [socket.c:1187:socket_submit] scns: transport not
> connected to submit (priv->connected = 255)
> 2008-11-25 11:23:18 E [socket.c:1187:socket_submit] sc1: transport not
> connected to submit (priv->connected = 255)
> 2008-11-25 11:23:18 E [socket.c:1187:socket_submit] sc2: transport not
> connected to submit (priv->connected = 255)
> ...
>
> The third affected node, node77, reports:
>
> 2008-11-24 22:07:20 W [client-protocol.c:93:this_ino_set] sc0: inode
> number(201326854) changed for inode(0x7f97d6c0ac70)
> 2008-11-24 22:07:20 W [client-protocol.c:93:this_ino_set] scns: inode
> number(37749030) changed for inode(0x7f97d6c0ac70)
> 2008-11-24 22:08:07 E [client-protocol.c:243:call_bail] sc10: activating
> bail-out. pending frames = 7. last sent = 2008-11-24 22:07:24. last received
> = 2008-11-24 22:07:20. transport-timeout = 42
> 2008-11-24 22:08:07 C [client-protocol.c:250:call_bail] sc10: bailing
> transport
> ...(~100MB)
>
> and then
>
> 2008-11-25 10:00:46 E [socket.c:1187:socket_submit] sc0: transport not
> connected to submit (priv->connected = 255)
> ...
> 2008-11-25 10:00:46 E [socket.c:1187:socket_submit] sc87: transport not
> connected to submit (priv->connected = 255)
> 2008-11-25 10:00:46 E [socket.c:1187:socket_submit] scns: transport not
> connected to submit (priv->connected = 255)
>
>
> As I said, similar problems occurred with version 1.3.x. If these problems
> cannot be solved, we will have to use a different file system, so any help
> is greatly appreciated.
>
> Have fun,
>
>     Fred
>
> Dr. Fred Hucht <fred at thp.Uni-DuE.de>
> Institute for Theoretical Physics
> University of Duisburg-Essen, 47048 Duisburg, Germany
>
>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at nongnu.org
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>



-- 
hard work often pays off after time, but laziness always pays off now