[Gluster-users] Is it required for a node to meet quorum over all the nodes in storage pool?

Jeevan Patnaik g1patnaik at gmail.com
Fri Jan 25 08:10:50 UTC 2019


Hi,

I'm going through the concepts of quorum and split-brain in clusters in
general, and trying again to understand GlusterFS quorums, which I
previously found difficult to understand accurately.

When we talk about server quorum, what I understand is that the concept is
similar to STONITH in a cluster, i.e., we shoot the node that probably has
issues, or bring its bricks down to prevent any access. But I don't get how
it calculates quorum.

My understanding:
In a distributed replicated volume, all bricks in a replica set should
receive the same data writes, and hence a quorum of at least 51% must be
met within each replica set. Now consider the following 3x replica
configuration:
ServerA,B,C,D,E,F -> brickA,B,C,D,E,F respectively, and serverG without any
brick in the storage pool.

Scenario:
ServerA,B,F form a partition, i.e., they are isolated from the other nodes
in the storage pool.

But the bricks on serverA,B,C belong to the same sub-volume. Hence, if we
consider quorum over sub-volumes, A and B meet quorum for their only
participating sub-volume and can serve the corresponding bricks, and the
corresponding brick on C should go down.

But when we consider quorum over the storage pool, C,D,E,G meet quorum
whereas A,B,F do not. Hence, the bricks on A,B,F should fail. And for C,
quorum will still not be met for its sub-volume, so it will go into
read-only mode. The sub-volume on D and E should work normally.
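
To make the arithmetic in the scenario above concrete, here is a minimal
Python sketch. It assumes a plain strict-majority rule (more than half of
the members reachable) for both the storage pool and each sub-volume. All
names in it are made up for illustration; it is not GlusterFS code and may
not match how GlusterFS actually evaluates quorum.

```python
# Hypothetical model of the scenario; none of these names come from GlusterFS.
POOL = {"A", "B", "C", "D", "E", "F", "G"}  # G is in the pool but has no brick
SUBVOLUMES = {
    "subvol-1": {"A", "B", "C"},
    "subvol-2": {"D", "E", "F"},
}
PARTITIONS = [{"A", "B", "F"}, {"C", "D", "E", "G"}]


def has_majority(reachable, members):
    """True if strictly more than half of `members` are in `reachable`."""
    return 2 * len(reachable & members) > len(members)


def pool_quorum(partition):
    """Quorum counted over every server in the storage pool."""
    return has_majority(partition, POOL)


def subvolume_quorum(partition):
    """Quorum counted separately inside each replica set."""
    return {
        name: has_majority(partition, bricks)
        for name, bricks in SUBVOLUMES.items()
    }


for partition in PARTITIONS:
    print(sorted(partition),
          "pool quorum:", pool_quorum(partition),
          "sub-volume quorum:", subvolume_quorum(partition))
```

Under these assumptions, the A,B,F side holds a majority of the first
sub-volume but not of the pool, while the C,D,E,G side holds a majority of
the pool and of the second sub-volume only.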

So, assuming that only sub-volume quorum is considered, we don't have any
downtime on the sub-volumes, but we have two partitions, and if clients can
access both, they can still read and write on both partitions separately
and without data conflicts. The split-brain problem arises when some
clients can access only one partition and other clients only the other.

If quorum is considered over the entire storage pool, then this split-brain
will not occur, because the problem nodes will be dead.

So why is it not mandatory to enable server quorum to avoid this
split-brain issue?

I also assume that the quorum percentage should be greater than 50%.
There is an option to set a custom percentage. Why is it required?
If all that is needed is to kill the problem node partition (group) by
identifying whether it has the largest possible share (i.e., greater than
50%), does the percentage really matter?
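
One way to explore the percentage question is a small Python sketch that
checks which sides of a split would pass for a given threshold. It assumes
a side passes when its share of the pool is at least the configured
percentage; that comparison rule and the function name are assumptions for
illustration, not taken from GlusterFS.

```python
def partitions_meeting_quorum(partition_sizes, pool_size, ratio_percent):
    """Return the sizes of the partitions whose share of the pool is at
    least `ratio_percent` (assumed rule: share >= ratio)."""
    return [
        size for size in partition_sizes
        if 100 * size >= ratio_percent * pool_size
    ]


# A 7-server pool split into groups of 3 and 4, as in the scenario above.
for ratio in (40, 51, 75):
    print(ratio, partitions_meeting_quorum([3, 4], 7, ratio))
```

In this model, a threshold of 50% or less can let both sides pass at once,
any threshold above 50% lets at most one side pass, and a high threshold
such as 75% can leave no side passing at all.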

Thanks in advance!

Regards,
Jeevan.