[Gluster-users] 2.0.6

Sun Aug 23 13:21:17 UTC 2009

It seems like Stephan could get behaviour that is more like his 
expectations if he reduced the timeout from 1800 seconds to something 
smaller.  Perhaps you could share where this parameter is set?  Many of 
us are running filesystems with much smaller expected latencies and 
would benefit from reducing this.  Obviously this doesn't address the 
core problem he is having -- regardless of its source.

Anand Avati wrote:
> On Sun, Aug 23, 2009 at 3:17 AM, Stephan von
> Krawczynski<skraw at ithnet.com> wrote:
>   
>> On Sat, 22 Aug 2009 10:24:48 -0700
>> Anand Avati <avati at gluster.com> wrote:
>>
>>     
>>> [... long technical explanation ...]
>>> As you rightly summarized,
>>> Your theory: glusterfs is buggy (cause) and results in all fuse
>>> mountpoints hanging, and also results in server2's backend fs hanging
>>> (effect)
>>>
>>> My theory: your backend fs is buggy (cause) and hangs and results in
>>> all fuse mountpoints to hang (effect) which happens because of reasons
>>> explained above
>>>
>>> I maintain that my theory is right because glusterfsd just cannot
>>> cause a backend filesystem to hang, and if it indeed did, the bug is
>>> in the backend fs because glusterfsd only performs system calls to
>>> access it.
>>>       
>> Let's assume your theory is right. Then I obviously managed to create a
>> scenario where the bail-out decisions for servers are clearly bad. In fact
>> they are so bad that the whole service breaks down. This is of course a no-go
>> for an application whose sole (or primary) purpose is to keep your fileservice
>> up, no matter what servers in the backend crash or vanish. As long as there is
>> a theoretical way of performing the needed fileservice it should be up and
>> running. Even if your theory were right, glusterfs still does not handle
>> the situation as well as it could (read: as a user would expect).
>>     
>
> OK, first of all, this is now a very different issue we are trying to
> address. Correct me if I'm wrong, the new problem definition now is -
> 'when glusterfs is presented with a backend filesystem which hangs FS
> calls, the replicate module does not provide FS service' (and not any
> more, as previously described by you, 'glusterfs has not been able to
> run bonnie even for an hour on all 2.0.x releases because of lack of
> attention towards stability and concentration on featurism'). Please
> do understand that this is not at all a (regular) crash of the
> filesystem, as described, which can be reliably reproduced within an
> hour, and the dev team not caring to fix it. The problem does not
> deserve such an attack.
>
> The reason why this issue persists is - there is no reliable way to
> even detect this hang programmatically. The right way to "deal" with it
> is to translate the "disk hang" into a "subvolume down", but that is
> hard, because -- Has the server stopped responding? No, ping-pong replies
> are coming just fine. Has the backend disk started returning IO
> errors? No, the FS calls just hang exactly like a deadlock. Detecting
> hardware failures can be done with reasonable reliability. Detecting
> buggy software lockups and such deadlocks is a very hard (theoretical)
> problem.
>
> The simplest way around it is having timeouts at a higher layer. And it
> is for a reason that the current call timeouts are 1800 seconds - we
> have seen in our QA lab that a truncate() call on a multi-terabyte
> file on ext3 takes more than 20 minutes to complete, and during that
> period all other calls happening on that filesystem also freeze.
> Programmatically this situation is no different from the hang you face.
> The 1800sec timeout currently used is based on experimental
> calculations and not arbitrary. If you can come up with a better way
> of reliably detecting that the backend FS has hung itself (even
> considering the delay situations which I explained above), we are
> willing to use that technique provided it is reasonable enough (do
> consider situations where the backend fs could be an NFS which might
> have temporarily blocked for multiple minutes for the server to reboot
> etc).
>
> Avati
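
For anyone weighing a smaller timeout, here is a rough sketch of the
"timeouts at a higher layer" idea Avati describes. It is not GlusterFS
code. The names call_with_timeout and SubvolumeDown are made up for
illustration, and it assumes the blocking filesystem call can be handed
to a worker thread. It only shows why a timeout cannot tell a hung
backend from a slow one: both simply fail to return in time.

```python
import concurrent.futures

# Value quoted in the thread; everything else here is hypothetical.
CALL_TIMEOUT_SECONDS = 1800


class SubvolumeDown(Exception):
    """Raised when a backend call does not return within the timeout."""


def call_with_timeout(fs_call, timeout, executor):
    """Run a blocking call in a worker thread and give up after `timeout` seconds.

    The worker is NOT killed on timeout: a call stuck inside the backend
    filesystem stays stuck. The caller merely stops waiting for it and
    treats the subvolume as down.
    """
    future = executor.submit(fs_call)
    try:
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        raise SubvolumeDown(
            "no reply within %s seconds" % timeout
        ) from None
```

With a short timeout, a legitimate but slow operation (such as the
20-minute truncate() mentioned above) raises SubvolumeDown exactly like
a real hang does, which is the trade-off behind the 1800 second default.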