[Gluster-devel] Need sensible default value for detecting unclean client disconnects
Niels de Vos
ndevos at redhat.com
Tue May 20 16:26:14 UTC 2014
On Tue, May 20, 2014 at 01:30:24PM +0200, Niels de Vos wrote:
> Hi all,
>
> the last few days I've been looking at a problem [1] where a client
> locks a file over a FUSE-mount, and a 2nd client tries to grab that lock
> too. It is expected that the 2nd client gets blocked until the 1st
> client releases the lock. This all works as long as the 1st client
> cleanly releases the lock.
>
> Whenever the 1st client crashes (like a kernel panic) or the network is
> split and the 1st client is unreachable, the 2nd client may not get the
> lock until the bricks detect that the connection to the 1st client is
> dead. If there are pending replies, the bricks may need 15-20 minutes
> until the re-transmissions of the replies have timed out.
>
> The current default of 15-20 minutes is quite long for a fail-over
> scenario. Relatively recently [2], the Linux kernel got
> a TCP_USER_TIMEOUT socket option (similar to TCP_KEEPALIVE). This option
> can be used to configure a per-socket timeout, instead of a system-wide
> configuration through the net.ipv4.tcp_retries2 sysctl.
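
For context on where the 15-20 minutes come from, here is a rough sketch of the arithmetic. It assumes the usual Linux values (net.ipv4.tcp_retries2 = 15, a minimum retransmission timeout of 200 ms that doubles on every retry, capped at 120 s). The function name is made up for illustration, and the result is only the hypothetical lower bound; the real timeout depends on the measured round-trip time.

```python
# Hypothetical helper, not part of GlusterFS or the kernel.
# Assumed Linux constants: TCP_RTO_MIN = 200 ms, TCP_RTO_MAX = 120 s.
RTO_MIN_MS = 200
RTO_MAX_MS = 120_000


def hypothetical_timeout_ms(retries2: int) -> int:
    """Sum the backed-off retransmission timeouts for `retries2` retries.

    The RTO doubles on each retransmission until it reaches the cap.
    The connection is given up once the timer after the last allowed
    retransmission expires, hence retries2 + 1 terms.
    """
    total = 0
    for attempt in range(retries2 + 1):
        total += min(RTO_MIN_MS * 2 ** attempt, RTO_MAX_MS)
    return total


if __name__ == "__main__":
    ms = hypothetical_timeout_ms(15)
    print(f"{ms / 1000:.1f} seconds, about {ms / 60000:.1f} minutes")
```

With the default of 15 retries this gives a bit over 15 minutes, which is the lower end of the range mentioned above.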
>
> The default network.ping-timeout is set to 42 seconds. I'd like to
> propose a network.tcp-timeout option that can be set per volume. This
> option should then set TCP_USER_TIMEOUT for the socket, which causes
> re-transmission failures to be fatal after the timeout has passed.
>
> Now the remaining question: what should the default timeout in seconds
> be for this new network.tcp-timeout option? I'm currently thinking of
> making it high enough (like 5 minutes) to prevent false positives.
>
> Thoughts and comments welcome,
> Niels
>
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1099460
> [2] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=dca43c7
Posted a patch for review: http://review.gluster.org/7814
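
To illustrate what the option boils down to at the socket level, here is a minimal sketch. It is not the code from the patch; the helper name is made up, and it assumes a Linux host. TCP_USER_TIMEOUT takes its value in milliseconds, so a timeout configured in seconds has to be converted first.

```python
import socket

# Python exposes socket.TCP_USER_TIMEOUT on Linux builds; 18 is the value
# from <linux/tcp.h> and is used here only as a fallback (assumption: Linux).
TCP_USER_TIMEOUT = getattr(socket, "TCP_USER_TIMEOUT", 18)


def set_tcp_user_timeout(sock: socket.socket, timeout_seconds: int) -> int:
    """Set TCP_USER_TIMEOUT on `sock` and return the value read back (ms).

    Hypothetical helper for illustration. Once unacknowledged data has been
    outstanding for longer than this timeout, the kernel closes the
    connection instead of continuing to retransmit.
    """
    timeout_ms = int(timeout_seconds * 1000)
    sock.setsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT, timeout_ms)
    return sock.getsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT)


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # 5 minutes, the default suggested in the quoted mail above.
        print(set_tcp_user_timeout(s, 300), "ms")
```

The option is per socket, so it does not affect other TCP connections on the same host the way changing net.ipv4.tcp_retries2 would.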