[Gluster-devel] AFR conservative merge portability
Emmanuel Dreyfus
manu at netbsd.org
Sat Dec 13 14:38:03 UTC 2014
Hello
On NetBSD, tests/basic/afr/entry-self-heal.t always fails in this
scenario:
mkdir spb_heal
kill_brick brick0
touch spb_heal/0
glusterfs volume start force
kill_brick brick1
touch spb_heal/1
glusterfs volume start force
At that point, conservative merge kicks in and copies spb_heal/0 and
spb_heal/1 to each brick where they are missing. That works, but on
NetBSD we are left with AFR xattrs on the spb_heal directory in which
each brick accuses the other for metadata. This is a metadata split
brain that will not self-heal.
This happens because after adding an entry, the parent directory
(spb_heal here) mtime/ctime must be updated. On Linux, it seems the
filesystem is responsible for that. On NetBSD, the kernel's
filesystem-independent code takes care of it and sends a SETATTR to
update ctime/mtime on the parent directory.
So when we touch spb_heal/0 and spb_heal/1, the NetBSD kernel sends a
SETATTR for spb_heal ctime/mtime each time, and since the other brick
is down in each case, we get our metadata split brain.
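
To make the mutual accusation concrete, here is a toy model of the
scenario. It is only a sketch: the Brick class, the pending counters and
the parent_setattr flag are made up for illustration and are not the
real AFR code or its xattr layout. It only assumes that a live brick
records a pending operation against a brick that was down.

```python
# Toy model of AFR-style pending counters; NOT the real GlusterFS code.
from collections import defaultdict


class Brick:
    def __init__(self, name):
        self.name = name
        self.up = True
        # pending[other_brick][kind] = operations the other brick missed
        self.pending = defaultdict(lambda: {"entry": 0, "metadata": 0})


def apply_op(bricks, kind):
    """Apply one operation: each live brick blames every down brick."""
    for brick in bricks:
        if not brick.up:
            continue
        for other in bricks:
            if other is not brick and not other.up:
                brick.pending[other.name][kind] += 1


def touch(bricks, parent_setattr):
    """Create a file in the directory."""
    apply_op(bricks, "entry")  # the new directory entry
    if parent_setattr:
        # Assumed NetBSD-like behaviour: explicit SETATTR on the parent
        apply_op(bricks, "metadata")


def split_brain(b0, b1, kind):
    """True when both bricks accuse each other for this kind."""
    return b0.pending[b1.name][kind] > 0 and b1.pending[b0.name][kind] > 0


def run_scenario(parent_setattr):
    b0, b1 = Brick("brick0"), Brick("brick1")
    bricks = [b0, b1]
    b0.up = False              # kill brick0
    touch(bricks, parent_setattr)
    b0.up = True               # volume start force
    b1.up = False              # kill brick1
    touch(bricks, parent_setattr)
    b1.up = True               # volume start force
    return {kind: split_brain(b0, b1, kind) for kind in ("entry", "metadata")}
```

With parent_setattr=True the model ends with both an entry and a
metadata split brain; with parent_setattr=False only the entry split
brain remains, which is the one conservative merge resolves.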
In http://review.gluster.org/9267, Krutika Dhananjay fixes the test by
clearing the AFR xattrs to remove the split brain state, but while it
lets the test pass, it does not address the real-world problem: we are
still left with a metadata split brain that does not self-heal.
Here is a proposal: we know that at the end of conservative merge, we
should end up with the directory ctime/mtime being the ctime of the
most recently added child. And fortunately, as conservative merge
happens, the parent directory ctime/mtime is updated on each child
addition, so we finish in the desired state.
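
The "updated on each child addition" behaviour can be observed on a
local filesystem with a small sketch like the one below. It runs in a
temporary directory, not on a Gluster brick, and the helper name
parent_mtime_after_adds is made up for this example.

```python
import os
import tempfile


def parent_mtime_after_adds(names):
    """Create files in a fresh directory; return (old, new) mtime in ns."""
    with tempfile.TemporaryDirectory() as top:
        parent = os.path.join(top, "spb_heal")
        os.mkdir(parent)
        # Push the directory timestamps into the past so that the update
        # is visible whatever the timestamp granularity is.
        os.utime(parent, (1_000_000_000, 1_000_000_000))
        old = os.stat(parent).st_mtime_ns
        for name in names:
            # Adding a child is what bumps the parent's mtime/ctime.
            with open(os.path.join(parent, name), "w"):
                pass
        new = os.stat(parent).st_mtime_ns
        return old, new
```

After adding children "0" and "1", the returned new mtime should be
later than the old one, whoever (filesystem or generic kernel code) is
responsible for the update.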
In other words, after conservative merge, a parent directory metadata
split brain that involves only ctime/mtime can just be cleared by AFR
without any harm.
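
As a rough sketch of the check this would need, assuming AFR can compare
the directory attributes seen on both bricks: the DirMeta type, its
field list and the function names below are hypothetical and only
illustrate the "only ctime/mtime differ" rule, not actual AFR
structures.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DirMeta:
    """Hypothetical view of a directory's metadata on one brick."""
    mode: int
    uid: int
    gid: int
    mtime: float
    ctime: float
    xattrs: tuple = ()  # user-visible xattrs, AFR's own ones excluded


TIME_FIELDS = {"mtime", "ctime"}


def differing_fields(a, b):
    """Names of the metadata fields that differ between two bricks."""
    return {f.name for f in fields(DirMeta)
            if getattr(a, f.name) != getattr(b, f.name)}


def can_clear_metadata_split_brain(a, b):
    """True when the bricks disagree on nothing but ctime/mtime."""
    return differing_fields(a, b) <= TIME_FIELDS
```

A directory whose two copies differ in mode or ownership would still be
reported as a real metadata split brain; only the timestamp-only case
would be cleared after conservative merge.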
Does it look reasonable? Any opinions?
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu at netbsd.org