Possible split-brain

Jeff Darcy jdarcy at redhat.com
Thu Nov 11 19:46:45 UTC 2010

On Thu, 2010-11-11 at 16:00 +0000, Aaron Roberts wrote:

> The platform is not currently running production data, and I have been testing the redundancy of the setup (pulling cables etc.).  All my servers are now logging the following messages about once a minute:
> [2010-11-11 14:18:49.636327] I [afr-common.c:672:afr_lookup_done] datastore-replicate-0: split brain detected during lookup of /.
> [2010-11-11 14:18:49.636388] I [afr-common.c:716:afr_lookup_done] datastore-replicate-0: background  meta-data data self-heal triggered. path: /
> [2010-11-11 14:18:49.636863] E [afr-self-heal-metadata.c:524:afr_sh_metadata_fix] datastore-replicate-0: Unable to self-heal permissions/ownership of '/' (possible split-brain). Please fix the file on all backend volumes

Can you run "getfattr -d -e hex -m trusted.afr $path" on each server,
with $path set to that brick's root directory?  There seem to be a few
different ways for the split-brain flag to be set, all having to do with
the contents of these xattrs.  The solution might be to clear them, but
it would be good to see what the values are and have someone closer to
the AFR code than I am determine exactly which case we're in.
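
For comparing the output across bricks, something like the following
might help.  It is only a rough sketch, and it assumes each trusted.afr.*
value is 12 bytes holding three big-endian 32-bit pending counters (data,
metadata, entry), which is my understanding of the AFR changelog format.
The xattr names and values in SAMPLE are made up for illustration and are
not taken from the affected servers; decode_afr_xattrs is a hypothetical
helper, not part of GlusterFS.

```python
import re
import struct

# Matches lines such as: trusted.afr.<name>=0x<24 hex digits>
AFR_LINE = re.compile(r"^(trusted\.afr\.[^=]+)=0x([0-9a-fA-F]{24})$")


def decode_afr_xattrs(getfattr_output):
    """Parse `getfattr -d -e hex` text into {name: (data, metadata, entry)}.

    Assumes each value is three big-endian 32-bit counters; lines that do
    not look like a 12-byte trusted.afr.* value are skipped.
    """
    decoded = {}
    for line in getfattr_output.splitlines():
        match = AFR_LINE.match(line.strip())
        if match is None:
            continue  # comment line, blank line, or an unrelated xattr
        name, hex_value = match.groups()
        decoded[name] = struct.unpack(">III", bytes.fromhex(hex_value))
    return decoded


# Illustrative input only: hypothetical names and values.
SAMPLE = """\
# file: export/brick
trusted.afr.datastore-client-0=0x000000000000000000000000
trusted.afr.datastore-client-1=0x000000000000000100000000
"""

if __name__ == "__main__":
    for name, (data, metadata, entry) in decode_afr_xattrs(SAMPLE).items():
        print(f"{name}: data={data} metadata={metadata} entry={entry}")
```

Under that assumption, a nonzero counter on one brick would mean it
believes operations are still pending on the other.  If both bricks show
nonzero metadata counters against each other, that would probably match
the "possible split-brain" message, but someone closer to the AFR code
should confirm.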

