[Gluster-devel] About split-brain-resolution.t

Fri Apr 10 06:58:18 UTC 2015

Please find my response inline.

----- Original Message -----
> From: "Anuradha Talur" <atalur at redhat.com>
> To: "Emmanuel Dreyfus" <manu at netbsd.org>
> Cc: gluster-devel at gluster.org
> Sent: Wednesday, April 8, 2015 12:23:34 PM
> Subject: Re: [Gluster-devel] About split-brain-resolution.t
> 
> 
> 
> ----- Original Message -----
> > From: "Emmanuel Dreyfus" <manu at netbsd.org>
> > To: "Anuradha Talur" <atalur at redhat.com>, "Pranith Kumar Karampuri"
> > <pkarampu at redhat.com>
> > Cc: gluster-devel at gluster.org
> > Sent: Tuesday, March 31, 2015 9:55:24 PM
> > Subject: Re: [Gluster-devel] About split-brain-resolution.t
> > 
> > Anuradha Talur <atalur at redhat.com> wrote:
> > 
> > > 1) I send a patch today to revert the .t and send it again along with the
> > > fix.
> > > Or...
> > > 2) Can this failure be ignored till the fix is merged in?
> > 
> > We can ignore: NetBSD regression skips the test for now.
> Hi,
> I've sent a patch to fix this issue; it is currently under review:
> http://review.gluster.org/#/c/10134/ .
> > 
> > --
> > Emmanuel Dreyfus
> > http://hcpnet.free.fr/pubz
> > manu at netbsd.org
> > 
> 
Hi all,

Here's a solution for the regression failure:

The Feature:
1) To access a file that is in split-brain, the user runs the following command on it:
   # setfattr -n replica.split-brain-choice -v "subvolX" <path-to-file>
   where subvolX is the name of an AFR child (replica subvolume), e.g. volname-client-0
   NOTE: This command does NOT proactively heal the file. It only lets the user read the file from each brick, one at a time, in order to decide which copy is the right one.
2) The chosen subvolume is stored in the inode context (inode_ctx). Reads are served from that subvolume for as long as the ctx exists.

The split-brain-choice is kept in memory rather than on disk for the following reasons:
A) Setting it on disk may itself result in split-brain.
B) This feature is provided from the mount, so as more clients perform resolution on the same file, the number of xattrs on that file would grow large.

Here is the link that explains the feature further:
https://github.com/gluster/glusterfs/blob/master/doc/features/heal-info-and-split-brain-resolution.md#resolution-of-split-brain-from-the-mount-point .

The Problem:
When replica.split-brain-choice is set by the user and the inode is then lost, evicted or forgotten from the inode table, the ctx ceases to exist, resulting in a failure.
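
To make the failure mode concrete, here is a toy Python model of it. It is not GlusterFS code: the class and method names are made up for illustration, the subvolume name follows the volname-client-0 pattern above, and it assumes that the failure shows up as an I/O error (EIO) on the next read.

```python
import errno


class ToyInodeTable:
    """Illustrative stand-in for a client-side inode table (hypothetical, not GlusterFS code)."""

    def __init__(self):
        # inode id -> chosen subvolume; lives only in memory, like the inode ctx.
        self._ctx = {}

    def set_choice(self, inode, subvol):
        """Model of: setfattr -n replica.split-brain-choice -v <subvol> <file>."""
        self._ctx[inode] = subvol

    def forget(self, inode):
        """Model of the inode being evicted: its context is dropped with it."""
        self._ctx.pop(inode, None)

    def read(self, inode):
        """Serve the read from the chosen subvolume, or fail if no choice is known."""
        subvol = self._ctx.get(inode)
        if subvol is None:
            # Assumption: access to a split-brain file without a choice fails with EIO.
            raise OSError(errno.EIO, "split-brain, no choice set")
        return "contents as seen on " + subvol


table = ToyInodeTable()
table.set_choice("inode-1", "volname-client-0")
first_read = table.read("inode-1")   # served from the chosen copy

table.forget("inode-1")              # inode evicted, ctx gone
try:
    table.read("inode-1")
    read_after_forget = "ok"
except OSError as exc:
    read_after_forget = errno.errorcode[exc.errno]
```

In this model the user did nothing wrong between the two reads; the choice simply disappeared together with the inode.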

The proposed solution:
Provide a user-configurable timeout. The inode ctx stays valid for 'timeout' seconds, so the user can access the file for that long.
The user sets the timeout using the following command:
   # setfattr -n replica.resolution-timeout -v <x-seconds> <mount>
The timeout is set once per mount and applies to all files on it.
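
As a rough sketch of the intended semantics, the Python model below keeps each choice together with an expiry time. It is not the patch under review: the class name, the injected clock and the 300-second value are assumptions for illustration, and how the inode itself is kept alive during the window is not modelled.

```python
import time


class ToyChoiceStore:
    """Hypothetical model of a per-mount resolution timeout (not GlusterFS code)."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds  # one value for the whole mount
        self._clock = clock                     # injectable so the example is deterministic
        self._choices = {}                      # inode id -> (subvol, expiry time)

    def set_choice(self, inode, subvol):
        """Record the choice and start its validity window."""
        self._choices[inode] = (subvol, self._clock() + self.timeout_seconds)

    def get_choice(self, inode):
        """Return the chosen subvolume, or None if unset or expired."""
        entry = self._choices.get(inode)
        if entry is None:
            return None
        subvol, expires_at = entry
        if self._clock() >= expires_at:
            # The window has passed: drop the choice.
            del self._choices[inode]
            return None
        return subvol


# Fake clock so that no real waiting is needed.
now = [0.0]
store = ToyChoiceStore(timeout_seconds=300, clock=lambda: now[0])

store.set_choice("inode-1", "volname-client-0")
now[0] = 299.0
within_window = store.get_choice("inode-1")   # still valid
now[0] = 300.0
after_window = store.get_choice("inode-1")    # expired
```

Because the timeout is a single per-mount value, every choice set through that mount gets the same window, regardless of which file it is for.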

Any suggestions or comments about the approach are welcome.

> --
> Thanks,
> Anuradha.

-- 
Thanks,
Anuradha.