[Gluster-users] [Gluster-devel] AFR: Fail lookups when quorum not met
Ravishankar N
ravishankar at redhat.com
Mon Oct 9 12:43:49 UTC 2017
On 09/22/2017 07:27 PM, Niels de Vos wrote:
> On Fri, Sep 22, 2017 at 12:27:46PM +0530, Ravishankar N wrote:
>> Hello,
>>
>> In AFR we currently allow lookups to pass through without taking into
>> account whether the lookup is served from a good or a bad brick. We always
>> serve from a good brick whenever possible, but if there is none, we just
>> serve the lookup from one of the bricks that gave us a positive reply.
>>
>> We found a bug [1] due to this behavior, where the iatt values returned in
>> the lookup call were bad and caused the client to hang. The proposed fix [2]
>> was to fail lookups when we definitely know the reply can't be trusted (by
>> virtue of AFR xattrs indicating that the replies we got from the up bricks
>> are indeed bad).
>>
>> Note that this fix is *only* for replica 3 or arbiter volumes (not replica
>> 2, where there is no notion of quorum). But we want to 'harden' the fix by
>> not allowing any lookups at all if quorum is not met, or if it is met but
>> there are no good copies.
>>
>> Some implications of this:
>>
>> - If a file ends up in data/metadata split-brain in replica 3/arbiter (a
>> rare occurrence), we won't be able to delete it from the mount.
>>
>> - Even if the only brick that is up holds the good copy, we still fail the
>> lookup due to lack of quorum.
>>
>> Does anyone have comments/feedback?
> I think additional improvements for correctness outweigh the two
> negative side-effects that you listed.
Thanks for the feedback, Niels. Since we haven't received any other input,
I will rework my patch to include the changes for correctness; a rough
sketch of the intended check is below.
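Roughly, what I have in mind is along the lines of the standalone sketch
below. This is only an illustration of the intended behaviour, not the
actual patch: the struct, the helper names and the choice of EIO are made
up for this mail, and the real change sits in AFR's lookup code path.

#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Illustrative stand-in for the per-lookup reply state that AFR tracks. */
struct afr_lookup_state {
        int child_count;   /* bricks in the replica set                    */
        int up_count;      /* bricks that are up and replied successfully  */
        int good_count;    /* replies not blamed by the AFR pending xattrs */
};

/* Quorum here simply means more than half of the bricks (replica 3 or
 * arbiter; replica 2 has no such notion). */
static bool have_quorum(const struct afr_lookup_state *s)
{
        return s->up_count > s->child_count / 2;
}

/*
 * Hardened behaviour: fail the lookup if quorum is not met, or if it is
 * met but none of the replies came from a good copy.
 * Returns 0 on success, -EIO on failure.
 */
static int afr_lookup_quorum_check(const struct afr_lookup_state *s)
{
        if (!have_quorum(s))
                return -EIO;
        if (s->good_count == 0)
                return -EIO;
        return 0;
}

int main(void)
{
        /* Only one brick up out of three: even if it holds the good copy,
         * the lookup is failed for lack of quorum (the second implication
         * listed above). */
        struct afr_lookup_state s = { .child_count = 3, .up_count = 1,
                                      .good_count = 1 };
        printf("lookup result: %d\n", afr_lookup_quorum_check(&s));
        return 0;
}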
>
> Possibly the second point could cause some confusion for users. "it always
> worked before" may be a reason to add a volume option for this? That is
> something you can consider, but if you deem that overkill then I'm ok
> with that too.
Yeah, a volume option is overkill IMO. Once the fix is merged in master, I am
thinking of having it only in 3.13.0 and calling it out explicitly (and not
back-porting it to a 3.12 minor release) to mitigate confusion to a certain
extent.
Regards,
Ravi
>
> Thanks,
> Niels
>
>
>> Thanks,
>>
>> Ravi
>>
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1467250
>>
>> [2] https://review.gluster.org/#/c/17673/ (See review comments on the
>> landing page if interested)
>>