[Gluster-devel] regression failures on afr/split-brain-resolution

Ravishankar N ravishankar at redhat.com
Wed Jul 25 04:26:19 UTC 2018



On 07/25/2018 09:06 AM, Raghavendra Gowdappa wrote:
>
>
> On Tue, Jul 24, 2018 at 6:54 PM, Ravishankar N <ravishankar at redhat.com> wrote:
>
>
>
>     On 07/24/2018 06:30 PM, Ravishankar N wrote:
>>
>>
>>
>>     On 07/24/2018 02:56 PM, Raghavendra Gowdappa wrote:
>>>     All,
>>>
>>>     I was trying to debug regression failures on [1] and observed
>>>     that split-brain-resolution.t was failing consistently.
>>>
>>>     =========================
>>>     TEST 45 (line 88): 0 get_pending_heal_count patchy
>>>     ./tests/basic/afr/split-brain-resolution.t .. 45/45 RESULT 45: 1
>>>     ./tests/basic/afr/split-brain-resolution.t .. Failed 17/45 subtests
>>>
>>>     Test Summary Report
>>>     -------------------
>>>     ./tests/basic/afr/split-brain-resolution.t (Wstat: 0 Tests: 45
>>>     Failed: 17)
>>>       Failed tests:  24-26, 28-36, 41-45
>>>
>>>
>>>     On probing deeper, I observed a curious fact: in most of the
>>>     failures, stat was not served from md-cache but was instead
>>>     wound down to afr, which failed the stat with EIO because the
>>>     file was in split-brain. So, I did another test:
>>>     * disabled md-cache
>>>     * mounted glusterfs with attribute-timeout 0 and entry-timeout 0
>>>
>>>     Now the test always fails. So I think the test relied on stat
>>>     requests being absorbed by either the kernel attribute cache or
>>>     md-cache. When that doesn't happen, the stats reach afr and
>>>     cause commands like getfattr to fail.
>>
>>     This indeed seems to be the case. Is there any way we can avoid
>>     the stat? When a getfattr is performed on the mount, aren't
>>     lookup + getfattr the only fops that need to hit gluster?
>
>     Or should afr allow (f)stat even for replica-2 split-brains,
>     since it allows lookup anyway (the lookup cbk contains stat
>     information from one of its children)?
>
>
> I think the question here should be what kind of access we have to
> provide for files in split-brain. Once that broader question is
> answered, we should think about which fops fall under those kinds of
> access. If setfattr/getfattr command access has to be provided, I guess
> lookup, stat, setxattr and getxattr need to work with split-brain files.

Ideally, the only fop that should be allowed is the check for whether
the file exists (i.e. lookup), subject to quorum checks. All others
should be denied. This is how it works today too, but we (afr)
overloaded setfattr and getfattr with virtual xattrs to allow examining
and resolving split-brain from the mount. That is what is now failing
in the .t, because the stat fails as you pointed out. I think we should
allow (f)stat too in the replica-2 case, even when there are no good
copies (i.e. no read_subvol), to support the mount-based split-brain
resolution method. Pranith, what do you think?
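
To make the proposal concrete, here is a rough model of the policy
described above. It is only a sketch and not afr code: the function
names, the allow_stat switch and the fop sequence are hypothetical, and
the virtual xattr names are the ones I recall from the split-brain
resolution docs, so please verify them against your version.

```python
from enum import Enum


class Fop(Enum):
    LOOKUP = "lookup"
    STAT = "stat"
    FSTAT = "fstat"
    GETXATTR = "getxattr"
    SETXATTR = "setxattr"
    READV = "readv"
    WRITEV = "writev"


# Virtual xattr names as recalled from the Gluster split-brain docs;
# treat them as an assumption and check your version.
VIRTUAL_XATTRS = frozenset({
    "replica.split-brain-status",
    "replica.split-brain-choice",
    "replica.split-brain-heal-finalize",
})


def allowed_in_split_brain(fop, *, quorum_met=True, xattr_name=None,
                           replica_count=2, allow_stat=False):
    """Hypothetical policy check for a file with no read_subvol."""
    if fop is Fop.LOOKUP:
        # Existence check only, subject to quorum.
        return quorum_met
    if fop in (Fop.GETXATTR, Fop.SETXATTR):
        # Only the virtual xattrs used for mount-based resolution pass.
        return xattr_name in VIRTUAL_XATTRS
    if fop in (Fop.STAT, Fop.FSTAT):
        # The proposed change: let (f)stat through for replica-2.
        return allow_stat and replica_count == 2
    return False


def first_denied(fops, **policy):
    """Return the first fop the policy rejects, or None if all pass."""
    for fop, xattr_name in fops:
        if not allowed_in_split_brain(fop, xattr_name=xattr_name, **policy):
            return fop
    return None


# Assumed fop sequence for a getfattr on the mount when neither md-cache
# nor the kernel attribute cache absorbs the stat.
GETFATTR_STATUS = [
    (Fop.LOOKUP, None),
    (Fop.STAT, None),
    (Fop.GETXATTR, "replica.split-brain-status"),
]

print(first_denied(GETFATTR_STATUS))                   # today: Fop.STAT
print(first_denied(GETFATTR_STATUS, allow_stat=True))  # proposal: None
```

With allow_stat left off, the sequence stops at stat, which matches the
EIO seen in the .t. With it on, the getxattr for the virtual key is
reached, while reads and writes stay denied.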

-Ravi


>
>     -Ravi
>>     -Ravi
>>
>>>     Thoughts?
>>>
>>>     [1] https://review.gluster.org/#/c/20549/
>>>
>>>
>>>     _______________________________________________
>>>     Gluster-devel mailing list
>>>     Gluster-devel at gluster.org <mailto:Gluster-devel at gluster.org>
>>>     https://lists.gluster.org/mailman/listinfo/gluster-devel
>>>     <https://lists.gluster.org/mailman/listinfo/gluster-devel>
>>
>>
>>
>
>
