[Gluster-users] Recovering from Arb/Quorum Write Locks

Ravishankar N ravishankar at redhat.com
Mon May 29 04:24:46 UTC 2017


On 05/29/2017 03:36 AM, W Kern wrote:
> So I have a testbed composed of a simple 2+1 replica 3 with ARB setup.
>
> gluster1, gluster2 and gluster-arb (with shards)
>
> My testing involves some libvirt VMs running continuous write fops on 
> a localhost fuse mount on gluster1
>
> Works great when all the pieces are there. Once I figured out the 
> shard tuning, I was really happy with the speed, even with the older 
> kit I was using for the testbed. Sharding is a huge win.
>
> So for Failure testing I found the following:
>
> If you take down the ARB, the VMs continue to run perfectly and when 
> the ARB returns it catches up.
>
> However, if you take down Gluster2 (with the ARB still being up) you 
> often (but not always) get a write lock on one or more of the VMs, 
> until Gluster2 recovers and heals.
>
> Per the Docs, this Write Lock is evidently EXPECTED behavior with an 
> Arbiter to avoid a Split-Brain.
This happens only if gluster2 had previously witnessed some writes that 
gluster1 hadn't.
>
> As I understand it, if the Arb thinks that it knows about (and agrees 
> with) data that exists on Gluster2 (now down) that should be written 
> to Gluster1, it will write lock the volume because the ARB itself 
> doesn't have that data and going forward is problematic until 
> Gluster2's data  is back in the cluster and can bring the volume back 
> into proper sync.

Just to elaborate further: if all nodes were up to begin with, there 
were zero self-heals pending, and you brought down only gluster2, 
writes should still be allowed. I suspect that in your case there were 
some heals pending from gluster2 to gluster1 before you brought 
gluster2 down, perhaps caused by a network disconnect between the fuse 
mount and gluster1.
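
A quick way to confirm this (a sketch only; VOL is a placeholder for 
your volume name and the brick paths below are made up) is to look at 
the pending-heal queue for each brick:

# gluster volume heal VOL info
Brick gluster1:/bricks/VOL
Number of entries: 0

Brick gluster2:/bricks/VOL
<gfid or file path of each entry gluster2 still needs to heal to the others>
Number of entries: 1

Brick gluster-arb:/bricks/VOL
Number of entries: 0

If gluster2's brick lists entries while gluster1 and the arbiter show 
zero, gluster2 holds writes the other copies are missing, and taking 
gluster2 down at that point is exactly the situation where the 
remaining quorum refuses further writes.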

>
> OK, that is the reality of using a Rep2 + ARB versus a true Rep3 
> environment. You get Split-Brain protection but not much increase in 
> HA over old school Replica 2.
>
> So I have some questions:
>
> a) In the event that gluster2 had died and we have entered this write 
> lock phase, how does one go forward if the Gluster2 outage can't be 
> immediately (or remotely) resolved?
>
> At that point I have some hung VMs and annoyed users.
>
> The current quorum settings are:
>
> # gluster volume get VOL all | grep 'quorum'
> cluster.quorum-type                     auto
> cluster.quorum-count                    2
> cluster.server-quorum-type              server
> cluster.server-quorum-ratio             0
> cluster.quorum-reads                    no
>
> Do I simply kill the quorum, and the VMs will continue where they 
> left off?
>
> gluster volume set VOL cluster.server-quorum-type none
> gluster volume set VOL cluster.quorum-type none
>
> If I do so, should I also kill the ARB (before or after), or leave it up?
>
> Or should I switch to quorum-type fixed with a quorum count of 1?
>

All of this is not recommended because you would risk getting the files 
into split-brain.
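
If you did drop quorum and the copies diverged, the damage would show 
up later as split-brain entries (again VOL is a placeholder; the 
command is the standard one, but treat the workflow as a sketch):

# gluster volume heal VOL info split-brain

Each file listed there has to be resolved by hand or with the 
'gluster volume heal VOL split-brain ...' policy commands (bigger-file, 
latest-mtime, source-brick), which for VM images is usually far more 
painful than waiting for gluster2 to come back and heal.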

> b) If I WANT to take down Gluster2 for maintenance, how do I prevent 
> the quorum write-lock from occurring?
>
> I suppose I could fiddle with the quorum settings as above, but I'd 
> like to be able to PAUSE/FLUSH/FSYNC the Volume before taking down 
> Gluster2, then unpause and let the volume continue with Gluster1 and 
> the ARB providing some sort of protection and to help when Gluster2 is 
> returned to the cluster.
>

I think you should check whether there were self-heals pending to 
gluster1 before you brought gluster2 down; otherwise the VMs should not 
have paused.
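
For planned maintenance on gluster2, a conservative pre-check (a 
sketch; VOL is a placeholder) is to make sure nothing is pending before 
you stop it:

# gluster volume heal VOL statistics heal-count
# gluster volume heal VOL info

Both should report zero entries for every brick. If they do, and the 
fuse client still has healthy connections to gluster1 and the arbiter, 
stopping gluster2 should leave gluster1 plus the arbiter able to 
satisfy quorum for new writes, and gluster2 will simply heal when it 
returns.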

> c) Does any of the above behaviour change when I switch to GFAPI?
It shouldn't.

Thanks,
Ravi
>
> Sincerely
>
> -bill
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users



