[Gluster-devel] Automated split-brain resolution

Fri Aug 8 07:39:47 UTC 2014

On Thu, Aug 7, 2014 at 1:35 AM, Ravishankar N <ravishankar at redhat.com> wrote:
>
> Manual resolution of split-brains [1] has been a tedious task involving
> understanding and modifying AFR's changelog extended attributes. To simplify
> and to an extent automate this task, we are proposing a new CLI command with
> which the user can specify what the source brick/file is, and
> automatically heal the files in the appropriate direction.
>
> Command: gluster volume resolve-split-brain <VOLNAME> {<bigger_file> |
> source-brick <brick_name> [<file>] }
>
> Breaking up the command into its possible options, we have:
>
> a) gluster volume resolve-split-brain <VOLNAME> <bigger_file>
> When this command is executed, AFR will consider the brick having the
> highest file size as the source and heal it to all other bricks (including
> all other sources and sinks) in that replica subvolume. If the file size is
> same in all the bricks, it does *not* heal the file.
>
> b) gluster volume resolve-split-brain <VOLNAME> source-brick <brick_name>
> [<file>]
>
> When this command is executed, if <file> is specified, AFR heals the file
> from the source-brick <brick_name> to all other bricks of that replica
> subvolume. For resolving multiple files, the command must be run
> iteratively, once per file.
> If <file> is not specified, AFR heals all the files that have an entry in
> .glusterfs/indices/xattrop *and* are in split-brain. As before, heals happen
> from source-brick <brick_name> to all other bricks.
>
> Future work could also include extending the command to add other policies
> like choosing the file having the latest mtime as the source, integration
> with trash xlator wherein the files deleted from the sink are moved to the
> trash dir etc.
>

I have a few queries regarding the overall design itself.

Here are the caveats:

   - It adds a new command rather than extending the existing
     'gluster volume heal' command.
   - It asks the user to input the filename, which should not be
     necessary by default since such files are already available
     through 'gluster volume heal <volname> info split-brain'.

Ideally, the following would make it seamless and much more user
friendly.

Extend the existing CLI as follows:

 - 'gluster volume heal <volname> split-brain'

Healing split-brained files is more palpable and has a rather more
convincing tone for a sys-admin IMHO.

An example version of this extension would be:

'gluster volume heal <volname> split-brain [<file>|<gfid as canonical form>]'
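
To illustrate how the optional argument could be told apart, here is a
rough Python sketch that treats anything shaped like a canonical UUID
(8-4-4-4-12 hex digits) as a gfid and everything else as a file path.
The name `classify_target` is made up for this mail and is not part of
the GlusterFS CLI.

```python
import re
import uuid

# Canonical textual form of a gfid: 8-4-4-4-12 hexadecimal digits.
_CANONICAL_GFID = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)


def classify_target(arg):
    """Return ("gfid", normalized_gfid) or ("path", arg).

    Hypothetical helper: only shows how the two argument forms
    could be distinguished, nothing more.
    """
    if _CANONICAL_GFID.match(arg):
        # uuid.UUID normalizes the value to lower case.
        return ("gfid", str(uuid.UUID(arg)))
    return ("path", arg)
```

A file whose name happens to look like a UUID would be ambiguous under
this rule; requiring paths to start with '/' would be one way around
that.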

In fact, since we already know the list of split-brained files, we can
just loop through them and ask interactive questions:

# gluster volume heal <volname> split-brain
WARNING: About to start fixing split brained files on an active
GlusterFS volume, do you wish to proceed? y

WARNING: files removed would be actively backed up in '.trash' under
your brick path for future recovery.
...
WARNING: Found 1000 files in split brain
...
File on pair 'host1:host2' is in split brain, file with latest
time-stamp found on host1 - Fix? y
File on pair 'host3:host5' is in split brain, file with biggest size
found on host5 - Fix? y
....
....
....
....
************ Fixed (1000 split brain files) ************

# gluster volume heal <volname> split-brain
INFO: no split brains present on this <volume>
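
The loop behind a session like the one above could look roughly like
the following Python sketch. Everything in it is an assumption for
illustration: the entry keys ('pair', 'reason', 'source') and the
`ask` and `heal` callables are hypothetical stand-ins, not existing
GlusterFS interfaces.

```python
def resolve_interactively(entries, ask, heal):
    """Walk split-brain entries, confirm each suggested fix, return counts.

    entries: iterable of dicts with the hypothetical keys
             'pair', 'reason' and 'source'.
    ask:     callable taking a prompt string, returning True for "yes".
    heal:    callable taking an entry and performing the heal.
    """
    fixed = skipped = 0
    for entry in entries:
        prompt = (
            "File on pair '%s' is in split brain, file with %s "
            "found on %s - Fix? "
            % (entry["pair"], entry["reason"], entry["source"])
        )
        if ask(prompt):
            heal(entry)
            fixed += 1
        else:
            # The user declined; leave the file untouched.
            skipped += 1
    return fixed, skipped
```

Passing `ask` in as a callable keeps the loop usable both
interactively and from a script that answers "yes" to everything.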

The real pain point of fixing a split brain is not taking getfattr
outputs and figuring out which file is under conflict. The real pain
point is translating the gfid to the actual file when there are
millions of files. Gathering this list takes more time than actually
fixing the split brain, and I have personally spent countless hours
doing this.
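
To show why this translation is slow when done by hand, here is a
rough Python sketch of the brute-force approach on a single brick. It
assumes the usual brick layout, where a regular file's gfid is a hard
link at .glusterfs/<first two hex chars>/<next two>/<gfid>; the
function names are made up, and directories (whose gfid entries are
symlinks) are not handled.

```python
import os


def gfid_backend_path(brick_root, gfid):
    """Path of the gfid entry under the brick's .glusterfs directory."""
    return os.path.join(brick_root, ".glusterfs", gfid[0:2], gfid[2:4], gfid)


def paths_for_gfid(brick_root, gfid):
    """Find paths on the brick that share the gfid entry's inode.

    This walks the whole brick, which is exactly the cost that hurts
    with millions of files.
    """
    target = os.stat(gfid_backend_path(brick_root, gfid))
    matches = []
    for dirpath, dirnames, filenames in os.walk(brick_root):
        # Do not descend into the internal .glusterfs tree itself.
        if dirpath == brick_root and ".glusterfs" in dirnames:
            dirnames.remove(".glusterfs")
        for name in filenames:
            candidate = os.path.join(dirpath, name)
            st = os.lstat(candidate)
            if (st.st_dev, st.st_ino) == (target.st_dev, target.st_ino):
                matches.append(os.path.relpath(candidate, brick_root))
    return matches
```

This is roughly what 'find <brick> -samefile <gfid entry>' does: one
full scan of the brick per lookup.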

Now this list is easily available to GlusterFS, and so is the gfid to
path translation. Why isn't it simple enough for us to ask the user
to confirm what we think is the right choice? We certainly know which
is the bigger file too.
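
As a rough illustration of "what we think is the right choice", the
Python sketch below suggests a source replica by size and falls back
to mtime when the sizes tie. The `Replica` tuple, the function name
and the ordering of the two policies are all assumptions of mine; the
original proposal simply does not heal when the sizes are equal.

```python
from collections import namedtuple

# Hypothetical view of one copy of a file: where it lives plus the
# two attributes the policies look at.
Replica = namedtuple("Replica", ["host", "size", "mtime"])


def suggest_source(replicas):
    """Return (replica, reason), or None when no policy has a unique winner."""
    policies = (("size", "biggest size"), ("mtime", "latest time-stamp"))
    for key, reason in policies:
        values = [getattr(r, key) for r in replicas]
        best = max(values)
        # Only suggest a source when exactly one replica wins.
        if values.count(best) == 1:
            return replicas[values.index(best)], reason
    return None
```

When this returns None, the file would still need a manual decision
from the user.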

My general contention is that when we know what the right thing to do
is under certain conditions, we should make it easier. For example,
directory metadata split brains: we just fix them automatically
today, but that certainly wasn't the case in the past. We learnt from
experience to do the right thing when it's necessary.

A better UI experience would make it really 'automated', as you
intend: we make the larger decisions ourselves and users are left
with simple choices, so that it's not confusing.

-- 
Religious confuse piety with mere ritual, the virtuous confuse
regulation with outcomes