[Gluster-devel] Automated split-brain resolution

Harshavardhana harsha at harshavardhana.net
Tue Aug 12 05:59:08 UTC 2014


> This is a standard problem where there are split-brains in distributed
> systems. For example even in git there are cases where it gives up asking
> users to fix the file i.e. merge conflicts. If the user doesn't want
> split-brains they should move to replica-3 and enable client-quorum. But if
> the user made a conscious decision to live with split-brain problems
> favouring availability/using replica-2, then split-brains do happen and it
> needs user intervention. All we are trying to do is to make this process a
> bit painless by coming up with meaningful policies.
>

Agreed, split-brains do require manual intervention; no one argues about
that. But it shouldn't be quite as tedious as GlusterFS makes it.

I do agree that it is far simpler than in some other distributed
filesystems, but any time we ask someone to write a script to fix our
internal structures, that is not a feature, it's a bug.

We all appreciate the effort, but my wish is that we incorporate some of
the pain points we have personally seen over the years and fix them
properly while we are at it.

> If the user knows his workload is append only and there are split-brains the
> only command he needs to execute is:
> 'gluster volume heal <volname> split-brain bigger-file'
> no grep, no finding file paths, nothing.
>

Adding to this: we need to provide additional sanity checks that the
split-brains were indeed fixed. Since this looks like quite a destructive
operation, are you planning to support a rollback at any point during
this process?
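
A minimal sketch of the kind of post-heal sanity check meant here, assuming
the admin already knows the path of the same file on each brick. The helper
names (sha256_of, replicas_agree) are hypothetical and not part of any
Gluster tooling, and the check only compares file data; it does not look at
the pending-matrix or any other metadata.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicas_agree(replica_paths):
    """True when every replica copy has the same content digest.

    replica_paths is assumed to hold one path per brick for the same file.
    An empty list yields False, since there is nothing to confirm.
    """
    digests = {sha256_of(path) for path in replica_paths}
    return len(digests) == 1
```

Something along these lines, run before and after the heal, would at least
tell the user whether the copies converged.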

> There were also instances where the user knows the brick he/she would like
> to be the source but he/she is worried that old brick which comes back up
> would cause split-brains so he/she had to erase the whole brick which was
> down and bring it back up.
> Instead we can suggest him/her to use 'gluster volume heal <VOLNAME>
> split-brain source-brick <brick_name>' after bringing the brick back up so
> that not all the contents needs to be healed.
> 1) gluster volume heal <volname> info split-brain should give output in some
> 'format' giving stat/pending-matrix etc for all the files in split-brain.
>   - Unfortunately we still don't have a way to provide with file paths
> without doing 'find' on the bricks.

Critical setups require fixing split-brain with a quick turnaround; no one
really has the luxury of running a 'find' on a large volume. So I still do
not understand: if a 'find' can do gfid --> inum --> path, how hard is it
for the Gluster management daemon to know this, just to provide better
tooling?
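
For reference, the gfid --> inum --> path lookup that 'find' ends up doing
can be sketched roughly as below. It assumes the brick keeps a hard link
for each regular file under .glusterfs/<first two hex chars>/<next two>/<gfid>,
and it only handles regular files. The function names are made up for
illustration; this is not how the management daemon does or would do it.

```python
import os


def gfid_backend_path(brick_root, gfid):
    """Return the assumed .glusterfs entry for a GFID string."""
    return os.path.join(brick_root, ".glusterfs", gfid[0:2], gfid[2:4], gfid)


def paths_for_gfid(brick_root, gfid):
    """Walk the brick and return paths that share the GFID entry's inode."""
    # gfid --> inum: stat the hard link kept under .glusterfs
    target = os.stat(gfid_backend_path(brick_root, gfid))
    matches = []
    # inum --> path: scan the brick for directory entries with that inode
    for dirpath, dirnames, filenames in os.walk(brick_root):
        if dirpath == brick_root and ".glusterfs" in dirnames:
            dirnames.remove(".glusterfs")  # skip the internal tree itself
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if st.st_ino == target.st_ino and st.st_dev == target.st_dev:
                matches.append(os.path.relpath(full, brick_root))
    return sorted(matches)
```

The first step is a single stat; it is the walk in the second step that
costs a full scan of the brick, which is exactly what hurts on a large
volume.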

-- Harsha

