[Gluster-users] frequent split-brain detected, aborting selfheal; background meta-data self-heal failed

Brian Candler B.Candler at pobox.com
Wed Jan 9 12:28:06 UTC 2013


On Tue, Jan 08, 2013 at 03:02:19PM -0500, Jeff Darcy wrote:
> [1] http://hekafs.org/index.php/2011/04/glusterfs-extended-attributes/
> [2] http://hekafs.org/index.php/2012/03/glusterfs-algorithms-replication-present/

These are helpful articles, thank you.

It seems to me that the main risk of interleaving data and xattr (inode)
updates like this, to maintain state about what does or does not still
need to be replicated, is that when you pull the plug, the data updates
may have been committed to disk but not the inode updates, or vice
versa.  In the absence of explicit fsync/fdatasync calls, the OS may
make its own decisions about when to flush dirty data blocks and/or
inodes; and in any case the drive may re-order queued write operations.
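
To make that window concrete, here is a minimal sketch of the
pre-op / write / post-op pattern with an explicit fsync() between the
steps.  This is my own illustration, not actual AFR code: the real
changelog lives in trusted.afr.* counter xattrs, and "user.pending"
is just a stand-in.  Drop the fsync() calls and the kernel and the
drive are free to commit the xattr and the data block in either
order:

    #include <sys/types.h>
    #include <sys/xattr.h>
    #include <unistd.h>

    int replicated_write(int fd, const void *buf, size_t len, off_t off)
    {
        unsigned char pending = 1;

        /* pre-op: mark the write as pending in an xattr */
        if (fsetxattr(fd, "user.pending", &pending, sizeof pending, 0) < 0)
            return -1;
        if (fsync(fd) < 0)          /* push the xattr to disk first */
            return -1;

        /* the data update itself */
        if (pwrite(fd, buf, len, off) < 0)
            return -1;
        if (fsync(fd) < 0)          /* push the data to disk */
            return -1;

        /* post-op: clear the flag only once the data is durable */
        pending = 0;
        return fsetxattr(fd, "user.pending", &pending, sizeof pending, 0);
    }

Of course, an fsync() per operation is exactly the kind of cost a
translator like AFR is presumably trying to avoid, which is why the
window exists in the first place.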

It may be possible to get a strong ordering guarantee if you use a
journalling filesystem and configure it to journal data writes as well
as inode writes.  That is not a normal filesystem configuration though,
nor does the gluster documentation suggest you use it (AFAIK).
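
For what it's worth, ext3/ext4 do have such a mode: mounting with
data=journal journals file data as well as metadata.  A sketch using
mount(2), with the device and brick paths invented for illustration:

    #include <sys/mount.h>

    /* Equivalent to: mount -o data=journal /dev/sdb1 /export/brick1 */
    int mount_brick(void)
    {
        return mount("/dev/sdb1", "/export/brick1", "ext4",
                     0, "data=journal");
    }

Whether the resulting ordering guarantees would actually be strong
enough for the AFR changelog, and at what performance cost, I don't
know.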

I should add that this is all without knowing the full details of what
AFR and DHT are doing (but even you admit to not knowing the full
details of self-healing :-).  It just seems quite possible to me that
the system could get out of step in this way.

Regards,

Brian.
