[Gluster-users] 'Primary' brick outage or reboot issues

Fri Aug 7 09:26:15 UTC 2009

----- "Justice London" <jlondon at lawinfo.com> wrote:

> It appears that if the first brick in a replicated/distributed
> configuration is rebooted or suffers some sort of a temporary issue, it both means
> that the system doesn't appear to be dropped after 10 seconds from the
> cluster and also that after it comes back up, pending transactions have issues
> for the next 10 minutes or so. Is this a locks issue or is this a bug?

If the first subvolume goes down silently (without resetting the connection),
an 'ls' will hang for 10 seconds (this is the "ping-pong" timeout), because
replicate does not notice that the server has failed until that timeout
expires. Other operations should work fine, though.
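
To make the timing concrete, here is a rough toy model of the behaviour
described above. It is not GlusterFS code: the function names and the
PING_TIMEOUT constant are made up for illustration, and it only assumes that a
connection reset is noticed immediately while a silent failure is noticed when
the 10-second timeout expires.

```python
# Toy model only; names here are hypothetical and not part of GlusterFS.
PING_TIMEOUT = 10.0  # seconds, the timeout value mentioned in this message


def detection_delay(connection_reset: bool,
                    ping_timeout: float = PING_TIMEOUT) -> float:
    """Worst-case seconds before the client notices that the server is down."""
    if connection_reset:
        # The transport reports the failure right away.
        return 0.0
    # Silent failure: nothing is noticed until the timeout expires.
    return ping_timeout


def ls_hang(first_subvolume_up: bool,
            connection_reset: bool,
            ping_timeout: float = PING_TIMEOUT) -> float:
    """Worst-case seconds an 'ls' waits on the first subvolume."""
    if first_subvolume_up:
        return 0.0
    return detection_delay(connection_reset, ping_timeout)


if __name__ == "__main__":
    print("silent failure:", ls_hang(False, connection_reset=False))
    print("connection reset:", ls_hang(False, connection_reset=True))
```

In this model only the silent-failure case produces a wait, which matches the
10-second hang described above.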

Can you elaborate on what you mean by 'pending transactions' and what kind of
issues they face?

Vikas
-- 
Engineer - http://gluster.com/