[Gluster-devel] Automated split-brain resolution
Joe Julian
joe at julianfamily.org
Mon Aug 11 07:27:00 UTC 2014
On 08/10/2014 11:42 PM, Ravishankar N wrote:
> On 08/09/2014 01:23 AM, Joe Julian wrote:
>> Thinking about it more, I'd still rather have this functionality
>> exposed at the client through xattrs. For 5 years I've thought about
>> this, and the more I encounter split-brain, the more I think this is
>> the needed approach.
>>
>
> Joe, why do you feel resolving split-brains should be exposed to
> clients? Whatever approach is taken (either a gluster CLI command or
> an overloaded get/setfattr call), is it not better to have this done
> on the server side?
>
* It's consistent with the way other functions (rebalance, self-heal,
etc.) actually operate, in that they're really just clients.
* On the client it offers more possibilities for us admins to be able to
fix something on the fly.
* It's an API at that point. Software could be coded to perform its own
self-heal based on the rules that might apply to that particular use case.
* If multi-tenancy is ever added, it is a method by which the tenant can
repair his own files.
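
To make the API point concrete, here is a minimal sketch of application-side
rule-based repair. It assumes the xattr names from my earlier message (quoted
further down) existed, which they do not today, and that the stat xattr
returned JSON shaped like {brick: {"size": ..., "mtime": ...}}. That payload
shape, the function names, and the "newest mtime wins, ties are left alone"
rule are all illustrative assumptions, not anything AFR implements.

```python
import json
import os

# Hypothetical xattr names taken from the proposal in this thread;
# neither is implemented in GlusterFS.
STAT_XATTR = "trusted.glusterfs.stat"
PICK_XATTR = "trusted.glusterfs.sb-pick"


def newest_brick(stats):
    """Return the brick with the latest mtime, or None on a tie or empty input."""
    if not stats:
        return None
    latest = max(s["mtime"] for s in stats.values())
    winners = [b for b, s in stats.items() if s["mtime"] == latest]
    return winners[0] if len(winners) == 1 else None


def resolve(path, getx=None, setx=None):
    """Pick the newest copy of path as the heal source; return the brick or None."""
    getx = getx or os.getxattr  # Linux-only calls, looked up lazily
    setx = setx or os.setxattr
    # Assumed payload: JSON mapping brick -> {"size": int, "mtime": int}.
    stats = json.loads(getx(path, STAT_XATTR))
    brick = newest_brick(stats)
    if brick is not None:
        setx(path, PICK_XATTR, brick.encode())
    return brick
```

The getx/setx parameters exist only so the rule can be exercised without a
mounted volume; an application would swap newest_brick for whatever rule fits
its own data.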
It was late last time, and I missed one important operation: the
ability to mv one copy of the split-brain file to a new filename, in
case you choose wrongly and need it later. I've seen that with VM
images. Typically, it doesn't really matter which VM image you choose
(if your data's in a smart place instead of on the image). Pick either
one and boot it back up. Occasionally, though, the image is
irreparable. Frequently, the "other copy" is ok, so if one fails to
boot, we swap to the other.
>
>> "getfattr -n trusted.glusterfs.stat" returns
>> xml/json/some_madeup_datastructure with the results of stat from each
>> brick
>> "getfattr -n trusted.glusterfs.afr" returns the afr matrix
>> "setfattr -n trusted.glusterfs.sb-pick -v server2:/srv/brick1"
>>
>> That gives us the tools we need to choose what to do with any given
>> split-brain. For large swaths of automated repair, we can use find.
>>
>> I suppose that last bit could still be implemented through that cli
>> command.
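
The find-style sweep mentioned above could look roughly like this. It is a
sketch under assumptions: trusted.glusterfs.sb-pick is the proposed xattr and
is not implemented, and bulk_pick, its pattern argument, and the injectable
setx are names made up for illustration. How the xattr would behave on files
that are not in split-brain is undefined in the proposal, so a real sweep
would want to filter first.

```python
import fnmatch
import os

PICK_XATTR = "trusted.glusterfs.sb-pick"  # hypothetical, from the proposal above


def bulk_pick(root, brick, pattern="*", setx=None):
    """Set the pick xattr on every file under root whose name matches pattern."""
    setx = setx or os.setxattr  # Linux-only call, looked up lazily
    picked = []
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            if fnmatch.fnmatch(name, pattern):
                path = os.path.join(dirpath, name)
                setx(path, PICK_XATTR, brick.encode())
                picked.append(path)
    return picked
```

Something like bulk_pick("/mnt/vol/images", "server2:/srv/brick1", "*.img")
would then play the role of a find piped into setfattr.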
>>
>>
>> On 08/07/2014 01:35 AM, Ravishankar N wrote:
>>>
>>> Manual resolution of split-brains [1] has been a tedious task
>>> involving understanding and modifying AFR's changelog extended
>>> attributes. To simplify and to an extent automate this task, we are
>>> proposing a new CLI command with which the user can specify what
>>> the source brick/file is, and automatically heal the files in the
>>> appropriate direction.
>>>
>>> Command: gluster volume resolve-split-brain <VOLNAME>
>>> {<bigger_file> | source-brick <brick_name> [<file>] }
>>>
>>> Breaking up the command into its possible options, we have:
>>>
>>> a) gluster volume resolve-split-brain <VOLNAME> <bigger_file>
>>> When this command is executed, AFR will consider the brick having
>>> the highest file size as the source and heal it to all other bricks
>>> (including all other sources and sinks) in that replica subvolume.
>>> If the file size is the same on all the bricks, it does *not* heal
>>> the file.
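
The bigger-file rule in option a) can be sketched as follows. This is only an
illustration of the selection logic as described, not AFR code. The function
name and the brick-to-size mapping are made up, and the proposal only says
what happens when all sizes are equal; treating any tie for the largest size
as "do not heal" is my assumption.

```python
def bigger_file_source(sizes):
    """Return the brick holding the largest copy, or None when no heal should happen.

    sizes maps a brick name to the file size seen on that brick.
    """
    if not sizes:
        return None
    largest = max(sizes.values())
    holders = [b for b, s in sizes.items() if s == largest]
    # Proposal: equal size on all bricks means no heal.
    # Assumption: any tie for the largest size is also left unhealed.
    return holders[0] if len(holders) == 1 else None
```

With sizes {"b1": 10, "b2": 20} the source is b2; with {"b1": 5, "b2": 5}
nothing is healed.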
>>>
>>> b) gluster volume resolve-split-brain <VOLNAME> source-brick
>>> <brick_name> [<file>]
>>>
>>> When this command is executed, if <file> is specified, AFR heals the
>>> file from the source-brick <brick_name> to all other bricks of that
>>> replica subvolume. For resolving multiple files, the command must be
>>> run iteratively, once per file.
>>> If <file> is not specified, AFR heals all the files that have an
>>> entry in .glusterfs/indices/xattrop *and* are in split-brain. As
>>> before, heals happen from source-brick <brick_name> to all other
>>> bricks.
>>>
>>> Future work could also include extending the command to add other
>>> policies like choosing the file having the latest mtime as the
>>> source, integration with trash xlator wherein the files deleted from
>>> the sink are moved to the trash dir etc.
>>>
>>> Please give feedback on the above.
>>>
>>> Regards,
>>> Ravi
>>>
>>> [1] https://github.com/gluster/glusterfs/blob/master/doc/split-brain.md
>>>
>>>
>>> _______________________________________________
>>> Gluster-devel mailing list
>>> Gluster-devel at gluster.org
>>> http://supercolony.gluster.org/mailman/listinfo/gluster-devel