[Gluster-users] The continuing story ...

Anand Avati avati at gluster.com
Wed Sep 9 05:23:12 UTC 2009


> Yep, I experience this exact lock-up state on the 2.x train of GlusterFS
> with two servers, each with a local client, and have so far given up testing :(
> - I run 1.3 in production which still has problems when one of the servers
> goes down, and was hoping to move up to 2.x quickly, but can't at the moment.
>
>  Every time a new version comes out I update hoping it will be solved.
>
>  Because the machine that hangs, hangs so completely one can't ssh in and
> can't get a proper dump from the process, and any DEBUG log enabled has no
> information in it either, so I haven't been able to provide anything useful
> to the team to work from :(

Daniel,
 Since you say your machines have glusterfs mounts as well, we would
like to know if you can do some debugging. Keep a login session open
before you start the filesystem; once you hit the hang, can you tell
whether the "hang" is on the backend fs or on the glusterfs
mountpoint? You can also kill -11 the glusterfsd process, and it will
dump the pending syscall info into the logfile, which can be of great help.
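
One possible way to tell the two cases apart from the already-open
login is sketched below. It is only an illustration, not a GlusterFS
tool: the helper name probe() and the timeout value are made up for
this example. Each path is stat'ed in a child process, so a blocked
call cannot freeze the shell you are working from.

```python
import subprocess
import sys
import time


def probe(path, timeout=5.0):
    """Stat `path` in a child process and report how the call ended.

    Returns "ok" if the stat succeeded, "error" if it failed quickly
    (for example, the path does not exist), and "hung" if it had not
    returned after `timeout` seconds.
    """
    # The stat runs in a separate interpreter so that a call stuck
    # inside the kernel blocks the child, not this script.
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = child.poll()
        if status is not None:
            return "ok" if status == 0 else "error"
        time.sleep(0.1)
    # Deliberately no wait() here: a child stuck in uninterruptible
    # sleep may never exit, and waiting on it would hang us as well.
    return "hung"


if __name__ == "__main__":
    # Paths to check are passed on the command line.
    for target in sys.argv[1:]:
        print(target, probe(target))
```

Saved as, say, probe.py, it would be run with the backend export
directory and the glusterfs mountpoint as arguments (for instance
"python probe.py /data/export /mnt/glusterfs", both paths being
placeholders for your own). If only the mountpoint reports "hung",
the problem is on the glusterfs side; if the export directory hangs
too, the backend fs is the suspect.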

While the symptoms can be very similar to the issue on this thread,
note that this thread is about a system hang where there is no glusterfs
mountpoint, and the hang is confirmed to be on the backend fs. We are
very much interested in debugging and fixing _glusterfs_ mountpoint hangs.

Avati


