[Gluster-devel] Re: Timeout settings and self-healing ? (WAS: HA failover test unsuccessful (inaccessible mountpoint))
Krishna Srinivas
krishna at zresearch.com
Mon Apr 28 13:12:00 UTC 2008
Guido,
Can you paste the server and client spec files again?
(They have been deleted from the pastebin.)
Make sure you are using unify on the client side and have set
transport-timeout to 10 seconds.
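
For reference, here is a minimal sketch of where that option would go in a
client spec file. The volume name, host address, and remote subvolume name
are placeholders, not taken from your setup, and the syntax assumes the
1.3-style spec format:

```
# Hypothetical client-side volume; names and address are placeholders.
volume client1
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.0.1      # placeholder server address
  option remote-subvolume brick       # placeholder name of the exported volume
  option transport-timeout 10         # seconds before a pending request times out
end-volume
```

Each protocol/client volume that unify uses as a subvolume would carry its
own transport-timeout line.
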
If possible, try to reproduce the problem you are seeing with a minimal
spec file.
Thanks
Krishna
On Sat, Apr 26, 2008 at 4:36 AM, Amar S. Tumballi <amar at zresearch.com> wrote:
>
>
> On Wed, Apr 23, 2008 at 3:47 AM, Guido Smit <guido at comlog.nl> wrote:
> > Krishna,
> >
> > I did the test. I killed glusterfsd on one server.
> > All tests (ls, df, cp) worked as they should; I didn't even notice any
> > difference. Unplugging the cable, however, blocked all operations, and
> > finally, after a few minutes, the transport endpoint message appeared.
> >
> >
> >
> >
> The problem with TCP/IP is that when you unplug the cable, no message is
> delivered to the application's poll() on the network socket. The driver
> internally tries to reconnect, and only after a long time (around 10+
> minutes when we tested) do we get a message saying "no route to host".
> But when the application dies on the server, or the server shuts down,
> the connected nodes get a notification, so everything goes smoothly.
> Hence the delay when the network cable is unplugged.
>
> We came up with a workaround for managing this delay: the
> 'transport-timeout' option, which times out each request after a certain
> time. The default is now 108 seconds. We kept it that high because a few
> applications that use mandatory locks (which block a write until the lock
> is freed) can easily take more than a minute to release their locks. Users
> can set 'transport-timeout' in the protocol/client volume, so they can
> tune it to the I/O time of their applications.
>
> In our test setups, requests timed out exactly after the configured
> transport-timeout, every time, so we could not reproduce the indefinite
> freeze.
>
>
> Regards,
> Amar
>
>
>
> --
> Amar Tumballi
> Gluster/GlusterFS Hacker
> [bulde on #gluster/irc.gnu.org]
> http://www.zresearch.com - Commoditizing Super Storage!