[Gluster-users] [Gluster-devel] Testing replication and HA

Tue Feb 11 14:56:46 UTC 2014

Thanks to everyone for their replies...

On Tue, Feb 11, 2014 at 2:37 AM, Kaushal M <kshlmster at gmail.com> wrote:
> The 42 second hang is most likely the ping timeout of the client translator.
Indeed I think it is...

>
> What most likely happened was that the brick on annex3 was being used
> for the read when you pulled its plug. When you pulled the plug, the
> connection between the client and annex3 isn't gracefully terminated
> and the client translator still sees the connection as alive. Because
> of this, the next fop is also sent to annex3, but it will time out as
> annex3 is dead. After the timeout happens, the connection is marked as
> dead, and the associated client xlator is marked as down. Since afr
> now knows annex3 is dead, it sends the next fop to annex4, which is
> still alive.
I think this sounds right... My thought was that maybe Gluster could
do better somehow. For example, once a request has gone unanswered for
a short interval (say 1 sec), the client could immediately start
looking for a different brick to continue from, while still giving the
original brick the full timeout to come back. This way a routine
failover wouldn't interrupt activity for 42 seconds. Maybe this is a
feature that could be part of the new style replication?
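
To make the idea concrete, here is a rough sketch of the two behaviours
as a toy model. This is not Gluster code: the Brick class, both
function names and the 1 second "soft timeout" are hypothetical, and
only the 42 second figure comes from this thread.

```python
from dataclasses import dataclass


@dataclass
class Brick:
    name: str
    alive: bool  # False models a pulled plug: requests never get a reply


def read_current(bricks, ping_timeout=42.0):
    """Toy model of the behaviour described above: the client waits out
    the full ping timeout on a silent brick before trying the next one."""
    elapsed = 0.0
    for brick in bricks:
        if brick.alive:
            return brick.name, elapsed
        elapsed += ping_timeout  # connection is only marked dead here
    raise IOError("no replica answered")


def read_with_soft_timeout(bricks, soft_timeout=1.0, ping_timeout=42.0):
    """Hypothetical variant: after soft_timeout the read moves on to the
    next replica, while the silent brick would still get the full
    ping_timeout before being declared dead."""
    elapsed = 0.0
    for brick in bricks:
        if brick.alive:
            return brick.name, elapsed
        elapsed += min(soft_timeout, ping_timeout)
    raise IOError("no replica answered")


# annex3 has had its plug pulled, annex4 is still up.
bricks = [Brick("annex3", alive=False), Brick("annex4", alive=True)]
print(read_current(bricks))            # stall of 42.0 seconds
print(read_with_soft_timeout(bricks))  # stall of 1.0 second
```

In both cases the read ends up on annex4; the only difference in this
model is how long the application is stalled before that happens.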

>
> These kinds of unclean connection terminations are only handled by
> request/ping timeouts currently. You could set the ping timeout values
> to be lower, to reduce the detection time.
The reason I don't want to set this value significantly lower is that
in the case of a _real_ disaster or a high load condition, I want to
have the 42 seconds to give things a chance to recover without having
to kill the "in process" client mount. So it makes sense to keep the
timeout as it is.
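
A small sketch of that trade-off, with made-up stall durations (the
event names and numbers below are purely illustrative, not
measurements): any stall at least as long as the ping timeout gets the
brick marked down, so a lower timeout also catches stalls that would
otherwise have recovered on their own.

```python
# Hypothetical periods (in seconds) during which a brick stops answering.
stalls = {
    "brief network blip": 3,
    "heavy load pause": 20,
    "pulled plug": float("inf"),  # never recovers
}


def marked_down(stalls, ping_timeout):
    """Return the events that would cause the brick to be marked down,
    i.e. those lasting at least ping_timeout seconds."""
    return sorted(name for name, secs in stalls.items() if secs >= ping_timeout)


print(marked_down(stalls, 42))  # only the real failure
print(marked_down(stalls, 5))   # the load pause is now treated as a failure too
```

With the 42 second value only the pulled plug counts as a failure in
this model; at 5 seconds the load pause would also drop the connection.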

>
> ~kaushal

Cheers,
James