[Gluster-users] GlusterFS 3.3.1 split-brain rsync question

Pete Smith pete at realisestudio.com
Wed Apr 10 15:41:57 UTC 2013

Hi Dan

I've come up against this recently whilst trying to delete large numbers of
files from our cluster.

I'm resolving it with the method from [link missing from the archive].

With Fabric as a helping hand, it's not too tedious.
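
Assuming the method in question is the usual 3.3-era fix (delete the bad
copy and its GFID hard link from the offending brick, then let self-heal
re-create the file from the good replica), here's a minimal Fabric 1.x
sketch. The host, brick path and task name are placeholders; the
.glusterfs layout is the standard one:

    from fabric.api import env, run

    env.hosts = ['gluster-node2']      # node holding the "bad" brick
    BRICK = '/export/brick0'           # hypothetical brick root

    def heal_split_brain(rel_path):
        """Remove the bad copy plus its GFID hard link so glustershd
        can re-create the file from the healthy replica."""
        path = '%s/%s' % (BRICK, rel_path)
        # trusted.gfid holds the 16-byte GFID; read it out as hex
        out = run("getfattr -n trusted.gfid -e hex '%s'" % path)
        g = out.split('0x')[-1].strip()
        # the hard link lives at .glusterfs/aa/bb/<gfid-as-uuid>
        uuid = '-'.join((g[0:8], g[8:12], g[12:16], g[16:20], g[20:32]))
        run("rm -f '%s' '%s/.glusterfs/%s/%s/%s'" %
            (path, BRICK, g[0:2], g[2:4], uuid))

Run it as e.g. "fab heal_split_brain:projects/shot42.mov", then stat the
file through a client mount to kick off the re-heal.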

Not sure about the level of glustershd compatibility, but it's working for us.



On 10 April 2013 11:44, Daniel Mons <daemons at kanuka.com.au> wrote:

> Our production GlusterFS 3.3.1GA setup is a 3x2 distribute-replicate,
> with 100TB usable for staff.  This is one of 4 identical GlusterFS
> clusters we're running.
>
> Very early in the life of our production Gluster rollout, we ran
> Netatalk 2.X to share files with MacOSX clients (due to slow negative
> lookup on CIFS/Samba for those pesky resource fork files in MacOSX's
> Finder).  Netatalk 2.X wrote its CNID_DB files back to Gluster, which
> caused enormous IO, locking up many nodes at a time (lots of "hung
> task" errors in dmesg/syslog).
>
> We've since moved to Netatalk 3.X which puts its CNID_DB files
> elsewhere (we put them on local SSD RAID), and the lockups have
> vanished.  However, our split-brain files number in the tens of
> thousands due to those previous lockups, and aren't always predictable
> (i.e.: it's not always the case where brick0 is "good" and brick1 is
> "bad").  Manually fixing the files is far too time-consuming.
>
> I've written a rudimentary script that trawls
> /var/log/glusterfs/glustershd.log for split-brain GFIDs, tracks them
> down on the matching pair of bricks, and figures out via a few rules
> (size tends to be a good indicator for us, as bigger files tend to be
> more recent ones) which is the "good" file.  This works for about 80%
> of files, which will dramatically reduce the amount of data we have to
> manually check.
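
A rough sketch of that kind of triage script, for reference: the
glustershd.log pattern is an assumption (check it against your own
logs), the host/brick pairs are placeholders, and it only prints its
guess, since bigger-is-newer is a heuristic rather than a rule:

    import re
    import subprocess

    LOG = '/var/log/glusterfs/glustershd.log'
    BRICKS = [('node1', '/export/brick0'),   # the replica pair
              ('node2', '/export/brick0')]

    # e.g. "Unable to self-heal contents of '<gfid:...>'
    # (possible split-brain)" -- adjust to your log format
    gfids = set(re.findall(r'<gfid:([0-9a-f-]{36})>', open(LOG).read()))

    def size_on(host, brick, gfid):
        # stat the GFID hard link on the brick; -1 if it's missing
        path = '%s/.glusterfs/%s/%s/%s' % (brick, gfid[:2], gfid[2:4], gfid)
        p = subprocess.run(['ssh', host, 'stat', '-c', '%s', path],
                           capture_output=True, text=True)
        return int(p.stdout) if p.returncode == 0 else -1

    for gfid in sorted(gfids):
        sizes = [(size_on(h, b, gfid), h) for h, b in BRICKS]
        print(gfid, '-> probable good copy on', max(sizes)[1])
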
> My question is: what should I do from here?  Options are:
> Option 1) Delete the file from the "bad" brick
> Option 2) rsync the file from the "good" brick to the "bad" brick
> with -aX flag (preserve everything, including trusted.afr.$server and
> trusted.gfid xattrs)
> Option 3) rsync the file from "good" to "bad", and then setfattr -x
> trusted.* on the bad brick.
> Which of these is considered the better (more glustershd compatible)
> option?  Or alternatively, is there something else that's preferred?
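
For concreteness, the three options boil down to roughly the commands
below (a sketch, with placeholder hosts and paths; the three blocks are
alternatives, not a sequence). Note that setfattr -x takes explicit
attribute names, so trusted.* has to be spelled out, and the
trusted.afr.* names here are illustrative:

    import subprocess

    def sh(host, args):
        # run a command on a brick server over ssh
        subprocess.run(['ssh', host] + args, check=True)

    good, bad = 'node1', 'node2'                 # replica pair
    f = '/export/brick0/projects/shot42.mov'     # hypothetical path

    # Option 1: delete from the bad brick (plus its .glusterfs GFID
    # link) and let self-heal re-create the file from the good copy.
    sh(bad, ['rm', '-f', f])

    # Option 2: rsync from good to bad with xattrs preserved (-X),
    # so trusted.afr.* and trusted.gfid come across verbatim.
    sh(good, ['rsync', '-aX', f, '%s:%s' % (bad, f)])

    # Option 3: the same rsync, then strip the trusted.* xattrs on
    # the bad brick so Gluster regenerates them.
    for x in ('trusted.gfid', 'trusted.afr.vol-client-0',
              'trusted.afr.vol-client-1'):
        sh(bad, ['setfattr', '-x', x, f])
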
> Normally I'd just test this on our backup gluster; however, as it was
> never running Netatalk, it has no split-brain problems, so I can't
> test the functionality.
> Thanks for any insight provided,
> -Dan
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users

Pete Smith
DevOp/System Administrator
Realise Studio
12/13 Poland Street, London W1F 8QB
T. +44 (0)20 7165 9644
