[Gluster-devel] Blocking client when server is down

Wed Dec 31 16:52:18 UTC 2008

--- On Wed, 12/31/08, Harald Stürzebecher <haralds at cs.tu-berlin.de> wrote:

> 2008/12/31 Martin Fick <mogulguy at yahoo.com>:
> > --- On Tue, 12/30/08, Basavanagowda Kanur <gowda at zresearch.com> wrote:
> >
> >> If server is down for transport-timeout time, then client
> >> returns all the calls with 'Transport Endpoint not connected'
> >> error.
> >
> > Yes, this is exactly what I do not want.  I want reads/writes to
> > simply block when the server is down and to complete (the blocked
> > calls) when the server returns.  I do not want my applications to
> > get an error, only a delay.  Without this it is not possible to
> > recover gracefully from a server/network failure.
> >
> > While we are at it, what is the timeout in: seconds, milliseconds?
>
> http://www.gluster.org/docs/index.php/Translators_v1.4#client
> says:
> "# option transport-timeout 30            # seconds to wait for a response
>                                          # from server for each request"
>
> Setting that to 604800 should give you a week to fix the
> server. ;-) I hope it will try to reconnect sometimes to see if the
> server is up again.

Thanks, I missed that.  Unfortunately, it doesn't work the way you are suggesting (that's why I was asking, to confirm that it was indeed seconds).  If you simply kill the server daemon, the connection fails immediately, no matter how long a timeout you set.  I suppose that is because killing the daemon also kills the TCP connection.  It appears that the glusterfs protocol simply cannot deal with resending requests; I suppose it expects TCP to do that for you?  But if a server goes down after a request has been received and acked at the TCP level, but before it has been serviced and responded to at the glusterfs protocol layer, I do not believe that glusterfs knows how to retransmit the request.  This is where the timeout comes into play, I believe, and I think it is the root cause of why blocking is not currently implemented.
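
The TCP behaviour described above can be reproduced without glusterfs at all.  The following is only a minimal sketch using plain Python sockets on the loopback interface; the function name and the fake server are made up for illustration and say nothing about how the glusterfs client is actually implemented.  It shows a client with a long receive timeout whose peer accepts a request and then dies without answering:

```python
import socket
import threading
import time


def peer_close_beats_timeout(timeout_s=30.0):
    """Return (data, elapsed) for a recv() whose peer dies before replying."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def fake_server():
        conn, _ = listener.accept()
        conn.recv(1024)  # request is received (and acked by TCP) ...
        conn.close()     # ... then the "daemon" goes away without a reply

    server = threading.Thread(target=fake_server)
    server.start()

    client = socket.create_connection(("127.0.0.1", port))
    client.settimeout(timeout_s)  # long timeout, analogous to transport-timeout
    client.sendall(b"request")

    start = time.monotonic()
    try:
        data = client.recv(1024)  # empty bytes means the peer closed
    except ConnectionResetError:
        data = b""                # a reset also means the connection is gone
    elapsed = time.monotonic() - start

    client.close()
    server.join()
    listener.close()
    return data, elapsed
```

The call comes back with no data long before the 30 second timeout expires: once the connection itself is gone, the timeout never gets a chance to matter, and the unanswered request is lost unless the layer above TCP resends it.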

This timeout is only useful when the connection still exists but the server is not responding.  For example, if you suspend glusterfsd running in the foreground with ^Z and then resume it with 'fg' before the timeout expires, the client will survive this.  I assume a downed network link would behave the same way, as long as the link is not down long enough to time out the TCP connection.  This makes the timeout useful only if you have a heavily loaded server or network that cannot respond to you and you actually want to time out.  And then what?  It is not useful for extending the recovery window.  I am not sure how timing out in this case really helps anything anyway, except perhaps when using the AFR or HA translators.
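
Until something like blocking exists in the client, the closest an application can get is to retry on its own.  The following is a rough sketch of that workaround, not anything glusterfs provides: call_blocking is a hypothetical helper, and it assumes the 'Transport endpoint is not connected' error reaches the application as errno ENOTCONN, which is the errno that message corresponds to on Linux.

```python
import errno
import time


def call_blocking(op, retry_delay=1.0, sleep=time.sleep):
    """Call op() repeatedly until it stops failing with ENOTCONN.

    Any other error is passed straight through to the caller.
    """
    while True:
        try:
            return op()
        except OSError as exc:
            if exc.errno != errno.ENOTCONN:
                raise           # a real error, not "server is down"
            sleep(retry_delay)  # server still down: wait, then try again
```

A read could then be wrapped as call_blocking(lambda: open(path, "rb").read()), where path is some file on the glusterfs mount.  This is only safe for operations that can be repeated without harm; a write that partially reached the server before it died may not be.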

More food for the wiki, I suppose. :)

-Martin