[Gluster-devel] AFR: machine crash hangs other mounts or transport endpoint not connected

Gerry Reno greno at verizon.net
Sat Apr 26 00:36:32 UTC 2008


Hi all,
  It's been a while since I've been on the list, but I've been using
GlusterFS for some time in an AFR setup.  I'm on SVN 747 right now.  The
setup is real simple: two bricks on ext3 with user_xattr, used as storage
for a mailstore.  The issue I've been battling is that when one of the
machines crashes, the other machine loses the mailstore: either I get
"transport endpoint not connected" or the glusterfs filesystem hangs.  You
cannot do anything with it: 'ls' it, 'df' it, ... nothing.  If I try to
kill glusterfs/glusterfsd, it just gives me "/glusterfsmount busy".  The
only recovery at that point is to reboot the good machine as well as the
failed machine.  Needing to do that rather defeats my purpose in creating
this array.  Is there no way for glusterfs to recover from the crash so
that things are still good on the other bricks and on mounts on other
machines?

Thanks,
Gerry





