[Gluster-devel] Single-process (server and client) AFR problems
Gordan Bobic
gordan at bobich.net
Mon May 19 20:26:51 UTC 2008
Hi,
I'm having rather major problems getting single-process AFR to work
between two servers. When both servers come up, the GlusterFS on both
locks up pretty solid. The processes that try to access the FS
(including ls) seem to get nowhere for a few minutes, and then complete.
But something gets stuck, and glusterfs cannot be killed even with -9!
Another worrying thing is that the fuse kernel module ends up holding a
reference count even after the glusterfs process gets killed (sometimes
killing the remote process that isn't locked up on its host can break
the locked-up operations and allow the local glusterfs process to be
killed). So fuse then cannot be unloaded.
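For anyone trying to reproduce this, the leftover reference count can be read from /proc/modules, where the third whitespace-separated field of each line is the module's use count. The following is only a minimal sketch: the helper name module_refcount and the sample line are made up for illustration and are not output captured from the affected servers.

```python
def module_refcount(name, modules_text):
    """Return the use count of a kernel module, or None if it is not listed.

    modules_text is expected to be in /proc/modules format, where each line is:
    name size refcount dependents state address
    """
    for line in modules_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == name:
            return int(fields[2])
    return None


# Illustrative sample only (hypothetical values, not taken from the real hosts).
# On a live system the text would come from open("/proc/modules").read().
sample = "fuse 49436 1 - Live 0xf8a3c000\n"
print(module_refcount("fuse", sample))
```

A non-zero count with no glusterfs process left running would match the symptom described above, where the module refuses to unload.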
This error seems to come up in the logs all the time:
2008-05-19 20:57:17 E [afr.c:1985:afr_selfheal] home: none of the children are up for locking, returning EIO
2008-05-19 20:57:17 E [fuse-bridge.c:692:fuse_fd_cbk] glusterfs-fuse: 63: (12) /test => -1 (5)
This implies some kind of a locking issue, but the same error and
conditions also arise when the posix locking module is removed.
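The two log lines appear to agree with each other: assuming the trailing "(5)" in the fuse-bridge message is the errno returned to the caller, it is the same EIO that the afr message reports, since errno 5 is EIO on Linux. A small sketch to decode it is below; the function name decode_fuse_errno and the regular expression are illustrative assumptions about the log format, not anything taken from GlusterFS itself.

```python
import errno
import os
import re


def decode_fuse_errno(log_line):
    """Extract the number in a trailing '=> -1 (N)' and map it to an errno name.

    Assumes (not confirmed from the GlusterFS source) that N is the errno value.
    Returns (symbolic_name, description), or None if the pattern is absent.
    """
    match = re.search(r"=> -1 \((\d+)\)", log_line)
    if match is None:
        return None
    code = int(match.group(1))
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


line = "glusterfs-fuse: 63: (12) /test => -1 (5)"
print(decode_fuse_errno(line))
```

So the fuse-level failure on /test looks like a direct consequence of the afr_selfheal error rather than a second, independent problem.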
The configs for the two servers are attached. They are almost identical
to the examples on the glusterfs wiki:
http://www.gluster.org/docs/index.php/AFR_single_process
What am I doing wrong? Have I run into another bug?
Gordan
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: home.vol.1
URL: <http://supercolony.gluster.org/pipermail/gluster-devel/attachments/20080519/64579cf6/attachment-0006.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: home.vol.2
URL: <http://supercolony.gluster.org/pipermail/gluster-devel/attachments/20080519/64579cf6/attachment-0007.ksh>