[Gluster-users] Avoid Split-brain and other stuff

Martin Emrich martin.emrich at empolis.com
Wed Nov 14 14:29:28 UTC 2012


I just gave GlusterFS a try and experienced two problems. First some background:

- I want to set up a file server with synchronous replication between branch offices, similar to Windows DFS-Replication. The goal is _not_ high availability or cluster scale-out, but just having all files locally available at each branch office.

- To test GlusterFS, I installed two virtual machines in different locations, both running Ubuntu 12.04 with the GlusterFS 3.3 packages from the PPA.

- Both machines shall be server and client, and export the GlusterFS volume via Samba.

- I set up a file system in replica mode according to the quick start guide (except that I used ext4 instead of XFS for the brick, because I had bad experiences with XFS).

- I mounted the filesystem on both machines as localhost:/gv0, and shared the mount via Samba.
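
For reference, the setup described above corresponds roughly to the commands generated by the sketch below. The hostnames (`office-a`, `office-b`), the brick path `/export/brick1` and the mount point `/mnt/gv0` are placeholders, not the actual values; only the volume name gv0 comes from my setup. The script just prints the commands, it does not run them.

```python
import shlex

# Placeholder values (assumptions): substitute real hostnames and paths.
HOSTS = ["office-a", "office-b"]
BRICK = "/export/brick1"
VOLUME = "gv0"
MOUNTPOINT = "/mnt/gv0"


def setup_commands(hosts, brick, volume, mountpoint):
    """Return the commands for a replicated volume, in execution order."""
    bricks = [f"{h}:{brick}" for h in hosts]
    # Run on the first host: add every other host to the trusted pool.
    cmds = [["gluster", "peer", "probe", h] for h in hosts[1:]]
    # One brick per host, replica count equal to the number of hosts.
    cmds.append(["gluster", "volume", "create", volume,
                 "replica", str(len(hosts)), *bricks])
    cmds.append(["gluster", "volume", "start", volume])
    # Run on every host: mount the volume through the local server.
    cmds.append(["mount", "-t", "glusterfs",
                 f"localhost:/{volume}", mountpoint])
    return cmds


if __name__ == "__main__":
    for cmd in setup_commands(HOSTS, BRICK, VOLUME, MOUNTPOINT):
        print(shlex.join(cmd))
```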

At first it seemed to work fine (copying files from/to the share, files appearing instantly on the other host), until I did some robustness tests:

I severed the connection between the two hosts to provoke a split-brain scenario, just to see what happens. I expected both hosts to work, but on one of them the GlusterFS volume froze. After restarting the glusterfs-server service, it came back.

Then I intentionally created a conflicting file on each host.
After reconnecting the hosts, I got "Input/Output error" on both the conflicting file and the volume root inode. I found this article, http://blog.oneiroi.co.uk/linux/gluster-resolving-a-split-brain-in-a-replicated-setup/ , which fixed it for me, but having to manually fix the filesystem whenever a branch office link goes down does not feel very trustworthy. Is there some automatic conflict resolution feature (last one wins, or renaming conflicting files)?
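
To make concrete what I mean by "last one wins": given the conflicting copies of one file, keep the one that was modified most recently. The sketch below is only an illustration of that policy, not a GlusterFS feature or API. The function name `pick_last_writer` is made up, and the approach assumes the hosts' clocks are reasonably synchronized. It only selects a winner and does not touch any file.

```python
import os


def pick_last_writer(paths):
    """Return the path whose modification time is newest.

    On a tie, the path that appears first in `paths` wins.
    """
    if not paths:
        raise ValueError("need at least one candidate path")
    # max() returns the first maximal element, which gives the tie rule.
    return max(paths, key=lambda p: os.stat(p).st_mtime)
```

Renaming the losing copy instead of discarding it would be the other variant I mentioned.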

Then I took a look at the performance and copied an ISO image (~700 MB) to the filesystem. That worked fine, until I tried to md5sum it from both hosts. While one node took a few seconds (as I expected), the other one took several minutes. Then I found out that it was reading the file over the WAN link from the distant host instead of from its own local copy. It should have had enough time (one hour) to replicate the file across both hosts...
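
The comparison can be repeated with something like the sketch below, which hashes a file in chunks and reports how long the read took. The function name `timed_md5` and the 1 MiB chunk size are arbitrary choices for illustration; the timing includes both reading and hashing, and a second run on the same host may be served from the page cache.

```python
import hashlib
import time


def timed_md5(path, chunk_size=1024 * 1024):
    """Return (hex digest, elapsed seconds) for reading and hashing `path`."""
    digest = hashlib.md5()
    start = time.monotonic()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so a large ISO does not fill memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest(), time.monotonic() - start
```

Running it against the same path on the mounted volume from each host should give identical digests; a large difference in elapsed time points at where the data is being read from.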

(By the way, I also wanted to try geo-replication, which might suffice for my needs with a tight enough schedule, but I was not able to create a volume with only one brick.)

So I wonder: What did I do wrong?


