[Gluster-users] Replication not working on server hang

David Saez Padros david at ols.es
Fri Aug 28 12:28:51 UTC 2009


Hi

Well, that never happened before when using NFS with the same
computers, same disks, etc. for almost 2 years, so it's quite
possible that glusterfs is what triggers this
supposed ext3 bug. But apart from this:

a) The documentation says "All operations that do not modify the file
or directory are sent to all the subvolumes and the first successful
reply is returned to the application". Why is it blocking, then?
The reply from the non-blocked server is supposed to
come first so that nothing blocks, but clients are blocking on
a simple ls operation.

b) server1 (the non-blocked one) also has the volumes mounted like
any other client, but with option read-subvolume set to the local
volume. It also hangs, even though it is supposed to read from the
local volume, not from the hung one.

c) Doesn't glusterfs ping the servers periodically to see whether they
are available? If so, why doesn't it detect this situation?
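
To make the expectation in (a) concrete, here is a minimal Python sketch of
"first successful reply wins". It is not glusterfs code, only an illustration
of the behaviour the documentation describes. The helper name
first_successful_reply, the subvolume name data1 and the file names are
hypothetical; data2 is the name that appears in the log.

```python
import concurrent.futures as cf
import threading


def first_successful_reply(subvolumes, timeout=5.0):
    """Send a read-only request to every subvolume and return the first
    successful (name, reply) pair. Hypothetical helper, not a glusterfs API."""
    pool = cf.ThreadPoolExecutor(max_workers=len(subvolumes))
    futures = {pool.submit(call): name for name, call in subvolumes.items()}
    try:
        for done in cf.as_completed(futures, timeout=timeout):
            if done.exception() is None:
                return futures[done], done.result()
        raise RuntimeError("every subvolume returned an error")
    finally:
        # Do not wait for replies that are still outstanding.
        pool.shutdown(wait=False)


# Released at the end so the simulated hung server does not outlive the demo.
release = threading.Event()


def lookup_data1():
    # Healthy server: answers immediately (made-up directory listing).
    return ["file-a", "file-b"]


def lookup_data2():
    # Hung server: does not answer until it is "rebooted".
    release.wait()
    raise OSError("backend was hung")


winner, listing = first_successful_reply(
    {"data1": lookup_data1, "data2": lookup_data2}
)
release.set()  # "reboot" the hung server
print(winner, listing)
```

In this sketch the ls-like request returns as soon as the healthy server
answers and the hung one is simply ignored, which is what I expected from
the replicated volume.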

>> [...]
>> The glusterfs log only shows lines like these:
>>
>> [2009-08-28 09:19:28] E [client-protocol.c:292:call_bail] data2: bailing 
>> out frame LOOKUP(32) frame sent = 2009-08-28 08:49:18. frame-timeout = 1800
>> [2009-08-28 09:23:38] E [client-protocol.c:292:call_bail] data2: bailing 
>> out frame LOOKUP(32) frame sent = 2009-08-28 08:53:28. frame-timeout = 1800
>>
>> Once server2 has been rebooted, all the glusterfs file systems become
>> available again on all clients and the hung df and ls processes terminate,
>> but it is difficult to understand why a replicated share that must survive
>> the failure of one server does not.
> 
> You are suffering from the problem we talked about a few days ago on the list.
> If your local fs somehow deadlocks on one server, glusterfs is
> currently unable to cope with the situation and just _waits_ for things to
> come. This needlessly deadlocks your clients, too.
> Your experience backs my criticism of how these situations are handled.
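
For reference, the two call_bail lines quoted above can be checked with a
little arithmetic. The Python sketch below (the function name bail_delays is
hypothetical, and the wrapped log lines are joined back into single lines)
computes how long each LOOKUP had been outstanding when the client finally
bailed out, next to the frame-timeout value printed in the log.

```python
import re
from datetime import datetime

# The two log lines from the quoted message, with the mail wrapping undone.
LOG = """\
[2009-08-28 09:19:28] E [client-protocol.c:292:call_bail] data2: bailing out frame LOOKUP(32) frame sent = 2009-08-28 08:49:18. frame-timeout = 1800
[2009-08-28 09:23:38] E [client-protocol.c:292:call_bail] data2: bailing out frame LOOKUP(32) frame sent = 2009-08-28 08:53:28. frame-timeout = 1800
"""

PATTERN = re.compile(
    r"^\[(?P<logged>[\d-]+ [\d:]+)\].*frame sent = (?P<sent>[\d-]+ [\d:]+)\."
    r" frame-timeout = (?P<timeout>\d+)$"
)
FMT = "%Y-%m-%d %H:%M:%S"


def bail_delays(text):
    """Return (seconds waited, frame-timeout) for each call_bail line."""
    rows = []
    for line in text.splitlines():
        match = PATTERN.match(line)
        if match is None:
            continue
        logged = datetime.strptime(match["logged"], FMT)
        sent = datetime.strptime(match["sent"], FMT)
        rows.append((int((logged - sent).total_seconds()), int(match["timeout"])))
    return rows


delays = bail_delays(LOG)
print(delays)  # [(1810, 1800), (1810, 1800)]
```

Both requests had been pending for 1810 seconds, i.e. the client only gave
up just after the 1800-second frame-timeout (30 minutes) had expired, which
matches how long the ls and df processes appeared to hang.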

-- 
Best regards ...

----------------------------------------------------------------
    David Saez Padros                http://www.ols.es
    On-Line Services 2000 S.L.       telf    +34 902 50 29 75
----------------------------------------------------------------




