[Gluster-users] AFR: Fail lookups when quorum not met

Ravishankar N ravishankar at redhat.com
Fri Sep 22 06:57:46 UTC 2017


In AFR we currently allow lookups to pass through without taking into
account whether the lookup is served from a good or a bad brick. We
always serve from a good brick whenever possible, but if there is
none, we just serve the lookup from one of the bricks that we got a
positive reply from.
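
To make the current behavior concrete, here is a rough Python sketch of
the selection rule described above. It is not the actual AFR code: the
Reply type, its fields and pick_reply_current() are names made up for
illustration, and "good" simply stands for "the AFR xattrs do not blame
this copy".

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Reply:
    """Hypothetical per-brick lookup reply (not an AFR structure)."""
    brick: str
    ok: bool    # the brick answered the lookup successfully
    good: bool  # assumption: AFR xattrs do not mark this copy as bad


def pick_reply_current(replies: Sequence[Reply]) -> Optional[Reply]:
    """Prefer a good brick; otherwise fall back to any positive reply."""
    positive = [r for r in replies if r.ok]
    for r in positive:
        if r.good:
            return r
    # No good copy answered: today the lookup is still served from
    # whichever brick replied positively, even if that copy is bad.
    return positive[0] if positive else None
```

The fallback on the last line is the part that lets a lookup be served
from a bad copy.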

We found a bug [1] due to this behavior, where the iatt values returned
in the lookup call were bad and caused the client to hang. The proposed
fix [2] was to fail lookups when we definitely know the lookup can't be
trusted (by virtue of AFR xattrs indicating that the replies we got from
the up bricks are indeed bad).

Note that this fix is *only* for replica 3 or arbiter volumes (not
replica 2, where there is no notion of quorum). But we want to 'harden'
the fix by not allowing any lookups at all if quorum is not met, or if
it is met but there are no good copies.

Some implications of this:

- If a file ends up in data/metadata split-brain in replica 3/arbiter
(rare occurrence), we won't be able to delete it from the mount.

- Even if the only brick that is up is the good copy, we still fail the
lookup due to lack of quorum.
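
The hardened rule and the two implications above can be sketched as a
small Python predicate. This is only an illustration, not the patch
itself: lookup_allowed() and its parameters are made-up names, and
quorum is assumed to mean "more than half of the bricks are up", which
is 2 out of 3 for replica 3/arbiter.

```python
def lookup_allowed(replica_count: int, up_bricks: int, good_up_bricks: int) -> bool:
    """Hypothetical check for the proposed behavior.

    replica_count  -- number of bricks in the replica set (3 here)
    up_bricks      -- bricks that replied to the lookup
    good_up_bricks -- up bricks whose copy is not blamed by AFR xattrs
    """
    # Assumption: quorum is a strict majority of the replica set.
    quorum_met = up_bricks > replica_count // 2
    # Fail if quorum is not met, or if it is met but no good copy exists.
    return quorum_met and good_up_bricks >= 1


# Split-brain with all bricks up: quorum is met but there is no good copy.
print(lookup_allowed(3, up_bricks=3, good_up_bricks=0))
# Only the good brick is up: good copy exists but quorum is not met.
print(lookup_allowed(3, up_bricks=1, good_up_bricks=1))
# Two bricks up, at least one of them good: lookup is served.
print(lookup_allowed(3, up_bricks=2, good_up_bricks=1))
```

The first two calls correspond to the two implications listed above;
both lookups fail under the proposed rule.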

Does anyone have comments or feedback?



[1] https://bugzilla.redhat.com/show_bug.cgi?id=1467250

[2] https://review.gluster.org/#/c/17673/ (See review comments on the 
landing page if interested)

