[Gluster-devel] Re: Timeout settings and self-healing ? (WAS: HA failover test unsuccessful (inaccessible mountpoint))
Guido Smit
guido at comlog.nl
Tue Apr 22 07:49:51 UTC 2008
My server configs:
http://glusterfs.pastebin.com/m3f82f264
One of the client configs:
http://glusterfs.pastebin.com/d5df7fab
My problem is that whenever one of the storage servers is unplugged, I
always get the "Transport endpoint is not connected" message.
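
For reference, here is a minimal sketch of where the 'option
transport-timeout 20' setting suggested in the quoted thread below would
sit in a protocol/client volume. The volume name, remote-host address and
remote-subvolume name are placeholders, not the values from the pastebin
configs above, and the syntax assumes the volfile format of the GlusterFS
releases current at the time of this thread:

```
# Hypothetical client-side volume; names and address are placeholders.
volume client1
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.0.1      # placeholder storage server address
  option remote-subvolume brick       # placeholder name of the exported volume
  option transport-timeout 20         # response timeout in seconds, as suggested below
end-volume
```

Per the suggestion quoted below, the same option would be repeated in every
protocol/client volume.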
Krishna Srinivas wrote:
> Guido,
>
> Can you give the setup details, conf files?
> you can use http://glusterfs.pastebin.com for pasting conf files.
>
> Thanks
> Krishna
>
> On Fri, Apr 4, 2008 at 2:40 PM, Anand Avati <avati at zresearch.com> wrote:
>
>> Daniel/Guido,
>> can you paste the logs which are relevant from the time of unplugging the
>> cable till the end of experiment?
>>
>> avati
>>
>> 2008/4/3, Daniel Maher <dma+gluster at witbe.net>:
>>
>>
>>
>> > On Thu, 3 Apr 2008 14:55:48 +0530 "Anand Avati" <avati at zresearch.com>
>> > wrote:
>> >
>> > > Daniel,
>> > > maybe it is just taking long to detect connection failure. Can you
>> > > try with 'option transport-timeout 20' (sets response timeout to 20
>> > > seconds) in all your protocol/client and see if you still face the
>> > > 'hang' ?
>> >
>> > My simple test case is as follows :
>> > 1. Unplug one of the nodes (dfsD)
>> > 2. Attempt to ls -l the /opt/ (in which gfs-mount/ - the mountpoint -
>> > is contained)
>> >
>> > I set the timeout option along with every client instance in both the
>> > client and server configs. I tested timeout settings of 10 and 20
>> > seconds (just to see). In both cases, the 'hang' releases after a while
>> > (approx 30 seconds), but the results are odd. For example :
>> >
>> > # ls -l
>> > (hang ~ 30 seconds)
>> > ls: cannot access gfs-mount: Transport endpoint is not connected
>> > total 0
>> > d????????? ? ? ? ? ? gfs-mount
>> >
>> > # ls -l
>> > (immediate)
>> > ls: cannot access gfs-mount: Transport endpoint is not connected
>> > total 0
>> > d????????? ? ? ? ? ? gfs-mount
>> >
>> > (user wait ~ 5 seconds)
>> >
>> > # ls -l
>> > total 8
>> > drwxr-xr-x 2 root root 4096 2008-04-03 09:43 gfs-mount
>> >
>> > It would appear that the "recovery" time, regardless of whether the
>> > timeout is set to 10 or 20, is around 35 to 40 seconds - though, at the
>> > very least, it recovered. Is there any reasonable way to bring this
>> > period of time down ?
>> >
>> > Thank you all so much for your feedback on this topic !
>> >
--
Kind regards,
Guido Smit
ComLog B.V.
Televisieweg 133
1322 BE Almere
T. 036 5470500
F. 036 5470481