[Gluster-users] Recovering from Arb/Quorum Write Locks
Ravishankar N
ravishankar at redhat.com
Mon May 29 04:24:46 UTC 2017
On 05/29/2017 03:36 AM, W Kern wrote:
> So I have a testbed composed of a simple 2+1 setup (replica 3 with ARB).
>
> gluster1, gluster2 and gluster-arb (with shards)
>
> My testing involves some libvirt VMs running continuous write fops on
> a localhost fuse mount on gluster1
>
> Works great when all the pieces are there. Once I figured out the
> shard tuning, I was really happy with the speed, even with the older
> kit I was using for the testbed. Sharding is a huge win.
>
> So for Failure testing I found the following:
>
> If you take down the ARB, the VMs continue to run perfectly and when
> the ARB returns it catches up.
>
> However, if you take down Gluster2 (with the ARB still being up) you
> often (but not always) get a write lock on one or more of the VMs,
> until Gluster2 recovers and heals.
>
> Per the Docs, this Write Lock is evidently EXPECTED behavior with an
> Arbiter to avoid a Split-Brain.
This happens only if gluster2 had previously witnessed some writes that
gluster1 hadn't.
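If you want to see which brick is being blamed, one rough way (the
brick path and shard name below are only placeholders for your own) is
to look at the AFR changelog xattrs directly on a brick:

# getfattr -d -m . -e hex /bricks/brick1/VOL/.shard/<gfid>.1

Non-zero trusted.afr.VOL-client-* values on a brick mean that brick has
recorded writes which still need to be healed onto the peer brick those
entries point at.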
>
> As I understand it, if the Arb thinks it knows about (and agrees
> with) data that exists on Gluster2 (now down) and still needs to be
> written to Gluster1, it will write-lock the volume: the ARB itself
> doesn't have that data, and going forward is problematic until
> Gluster2's data is back in the cluster and the volume can be brought
> back into proper sync.
Just to elaborate further: if all nodes were up to begin with and there
were zero self-heals pending, and you then brought down only gluster2,
writes should still be allowed. I guess that in your case there were
already some heals pending from gluster2 to gluster1 before you brought
gluster2 down, caused by a network disconnect between the fuse mount
and gluster1.
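Something like this (a sketch, assuming the volume is named VOL as in
the output you pasted below) would tell you whether anything is pending:

# gluster volume heal VOL info
# gluster volume heal VOL statistics heal-count

If all three bricks report zero entries while all nodes are up, bringing
gluster2 down by itself should not pause the VMs.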
>
> OK, that is the reality of using a Rep2 + ARB versus a true Rep3
> environment. You get split-brain protection but not much increase in
> HA over old-school Replica 2.
>
> So I have some questions:
>
> a) In the event that gluster2 had died and we have entered this write
> lock phase, how does one go forward if the Gluster2 outage can't be
> immediately (or remotely) resolved?
>
> At that point I have some hung VMs and annoyed users.
>
> The current quorum settings are:
>
> # gluster volume get VOL all | grep 'quorum'
> cluster.quorum-type auto
> cluster.quorum-count 2
> cluster.server-quorum-type server
> cluster.server-quorum-ratio 0
> cluster.quorum-reads no
>
> Do I simply kill the quorum, and the VMs will continue where they
> left off?
>
> gluster volume set VOL cluster.server-quorum-type none
> gluster volume set VOL cluster.quorum-type none
>
> If I do so, should I also kill the ARB (before or after)? or leave it up
>
> Or should I switch to quorum-type fixed with a quorum count of 1?
>
None of these options is recommended, because you would risk getting
the files into split-brain.
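If you have already changed those settings while experimenting, you can
put the defaults back with 'volume reset' and check whether any files
actually ended up in split-brain (VOL is again a placeholder):

# gluster volume reset VOL cluster.quorum-type
# gluster volume reset VOL cluster.server-quorum-type
# gluster volume heal VOL info split-brain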
> b) If I WANT to take down Gluster2 for maintenance, how do I prevent
> the quorum write-lock from occurring?
>
> I suppose I could fiddle with the quorum settings as above, but I'd
> like to be able to PAUSE/FLUSH/FSYNC the Volume before taking down
> Gluster2, then unpause and let the volume continue with Gluster1 and
> the ARB providing some sort of protection and to help when Gluster2 is
> returned to the cluster.
>
I think you should try to find out whether there were self-heals
pending to gluster1 before you brought gluster2 down; otherwise the VMs
should not have paused.
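One way to spot that kind of silent disconnect before a planned
shutdown (again only a sketch, with VOL as a placeholder) is to compare
the client lists that each brick reports:

# gluster volume status VOL clients

If the fuse mount on gluster1 shows up against the gluster2 and arbiter
bricks but not against gluster1's own brick (or any similar mismatch),
writes are reaching only two of the three bricks and heals will be
accumulating.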
> c) Does any of the above behaviour change when I switch to GFAPI?
It shouldn't.
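If you want to sanity-check gfapi access from the hypervisor side,
something like this works when qemu is built with gluster support (the
image path is a placeholder):

# qemu-img info gluster://gluster1/VOL/images/vm1.qcow2

The same replication and quorum logic is loaded in the gfapi client
stack as in the FUSE client, so the pause/heal behaviour is independent
of the access method.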
Thanks,
Ravi
>
> Sincerely
>
> -bill
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users