[Gluster-users] Avoiding Split Brains

Fri Oct 30 12:58:01 UTC 2015

Yes, you need to take steps to avoid split brain on a two-node replica 2
setup. You can add a third node with no bricks, which serves as the
arbiter, and set quorum to 51%.

If you set quorum to 51% and have only 2 nodes, then when one goes down
the surviving node is just 50% of the cluster, which is below quorum, so
all your gluster mounts become unavailable (or is it just read-only?).
If you run VMs on top of this, you usually end up with paused/frozen VMs
until the volume becomes available again.
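
To make the arithmetic concrete, here is a rough sketch in Python. It is
my own simplification, not Gluster's actual code: the function name
quorum_met is made up for illustration, and it assumes quorum is met when
the number of reachable nodes is at least the quorum percentage of the
total, rounded up.

```python
def quorum_met(nodes_up: int, nodes_total: int, ratio_percent: int = 51) -> bool:
    """Return True if enough nodes are reachable to satisfy the quorum ratio.

    Simplified model (an assumption, not Gluster's implementation):
    required = ceil(nodes_total * ratio_percent / 100), in integer math.
    """
    required = (nodes_total * ratio_percent + 99) // 100
    return nodes_up >= required


if __name__ == "__main__":
    # Compare a plain two-node cluster with one that has a third,
    # brick-less node added purely to break ties.
    for total in (2, 3):
        for up in range(total, 0, -1):
            state = "quorum met" if quorum_met(up, total) else "quorum LOST"
            print(f"{up} of {total} nodes up: {state}")
```

Under that model, losing one node out of two drops you below 51%, while
losing one out of three does not, which is why the third node helps.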

These docs are Red Hat specific, but may help:

https://access.redhat.com/documentation/en-US/Red_Hat_Storage/2.0/html/Administration_Guide/sect-User_Guide-Managing_Volumes-Quorum.html

https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Administration_Guide/sect-Managing_Split-brain.html

The first time I hit split brain in testing, I found this blog post very useful:

https://joejulian.name/blog/fixing-split-brain-with-glusterfs-33/

HTH,

Diego

On Fri, Oct 30, 2015 at 8:46 AM, Iain Milne <glusterfs at noognet.org> wrote:
> Anyone?
>
>> -----Original Message-----
>> From: gluster-users-bounces at gluster.org
>> [mailto:gluster-users-bounces at gluster.org] On Behalf Of Iain Milne
>> Sent: 21 October 2015 09:23
>> To: gluster-users at gluster.org
>> Subject: [Gluster-users] Avoiding Split Brains
>>
>> Hi all,
>>
>> We've been running a distributed setup for 3 years with no issues.
>> Recently we switched to a 2-server, replicated setup (soon to be 4
>> servers) and keep encountering what I assume are split-brain situations,
>> e.g.:
>>
>>     Brick server1:/brick
>>     <gfid:85893940-63a8-4fa3-bf83-9e894fe852c7>
>>     <gfid:8b325ef9-a8d2-4088-a8ae-c73f4b9390fc>
>>     <gfid:ed815f9b-9a97-4c21-86a1-da203b023cda>
>>     <gfid:7fdbd6da-b09d-4eaf-a99b-2fbe889d2c5f>
>>     ...
>>     Number of entries: 217
>>
>>     Brick server2:/brick
>>     Number of entries: 0
>>
>> a) What does this mean?
>> b) How do I go about fixing it?
>>
>> And perhaps more importantly, how do I avoid this happening in the future?
>> Not once since moving to replication has either of the two servers been
>> offline or unavailable (to my knowledge).
>>
>> Is some sort of server/client quorum needed (that I admit I don't fully
>> understand)? While high-availability would be nice to have, it's not
>> essential - robustness of the data is.
>>
>> Thanks
>>
>> Iain
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users