[Gluster-devel] GlusterFS hangs/fails: Transport endpoint is not connected

Harald Stürzebecher haralds at cs.tu-berlin.de
Tue Nov 25 14:21:20 UTC 2008


2008/11/25 Fred Hucht <fred at thp.uni-due.de>:
> Hello Harald!
> I didn't test Infiniband transport until now, as I don't want to interfere
> with the parallel applications which are running over Infiniband. Gigabit
> Ethernet throughput would be sufficient for us at the moment.
> Today "only" three nodes were affected; yesterday it was nine nodes. The
> problems only occur on nodes to which jobs are scheduled which use /scratch
> as working directory: We test the filesystem in normal operation, one user
> submits jobs to the queueing system which use /scratch/... as working
> directory. While some of his jobs run without problems, other jobs fail due
> to FS problems. No problems occur over the usual NFS home directory.

IMHO, the fact that everything else works rules out the "network
problem". Sorry for wasting your time.

> When I test the FS with, e.g., dd on all nodes in parallel, no problems
> occur.
> Which timeout shall I increase?

I had some "transport-timeout" option in the back of my mind, but the doc
says that the default is already 30 seconds.
I would not change anything there without a request from the developers.

Harald Stürzebecher
