[Gluster-users] GlusterFS as virtual machine storage

WK wkmail at bneit.com
Fri Aug 25 19:48:31 UTC 2017



On 8/25/2017 12:56 AM, Gionatan Danti wrote:
>
>
>> WK wrote:
>> 2 node plus Arbiter. You NEED the arbiter or a third node. Do NOT try 2
>> node with a VM
>
> This is true even if I manage locking at application level (via 
> virlock or sanlock)?


We ran Rep2 for years on 3.4.  It does work if you are really, really
careful, but in a crash on one side you might have lost some bits that
were in flight. The VM image would then try to heal.
Without sharding, big VMs take a while to heal because the WHOLE VM
file has to be copied over. Then you might get split-brain and have to
stop the VM, pick the good copy, make sure it is healed on both sides,
and then restart the VM.
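
For anyone following along, the inspection goes roughly like the
sketch below on a newer release (volume, brick, and file names here
are just placeholders; the CLI split-brain resolution only showed up
around 3.7, so on 3.4 we had to compare the trusted.afr xattrs and
remove the bad copy by hand):

    # list the files Gluster considers split-brained
    gluster volume heal gv0 info split-brain

    # on each brick, dump the AFR changelog xattrs to judge the copies
    getfattr -d -m . -e hex /bricks/gv0/images/vm1.img

    # tell Gluster which brick holds the winner (3.7+)
    gluster volume heal gv0 split-brain source-brick \
        node1:/bricks/gv0 /images/vm1.img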

Arbiter/Replica 3 prevents that. Sharding helps a lot as well by
making the heals really quick, though in a Replica 2 with sharding you
no longer have a nice big .img file sitting on each brick in plain
view, and picking a split-brain winner is now WAY more complicated:
you would have to reassemble the file from its shards.
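
If you are setting this up fresh, the arbiter-plus-sharding layout is
one create command plus a couple of options; again, all the names
below are made up:

    gluster volume create gv0 replica 3 arbiter 1 \
        node1:/bricks/gv0 node2:/bricks/gv0 arb1:/bricks/gv0
    gluster volume set gv0 features.shard on
    # 64MB shards are a common choice for VM images
    gluster volume set gv0 features.shard-block-size 64MB
    gluster volume start gv0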

We were quite good at fixing broken Gluster 3.4 nodes, but we are
*much* happier with the arbiter node and sharding. It is a huge
difference. We could go to Rep3, but we like the extra speed and we
are comfortable with the arbiter's limitations (we also have excellent
off-cluster backups <grin>).


> Also, on a two-node setup it is *guaranteed* for updates to one node 
> to put offline the whole volume?

If you still have quorum enforcement turned on, then yes. One side
goes down and you are down with it.
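
The knobs in question, in case anyone wants to check their own volume
(same placeholder volume name as above):

    # client-side quorum; "auto" needs more than half the replica set,
    # and in replica 2 that effectively means the first brick must be up
    gluster volume set gv0 cluster.quorum-type auto

    # server-side quorum; bricks get killed if the peer count drops
    # below half of the trusted pool
    gluster volume set gv0 cluster.server-quorum-type server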

> On the other hand, a 3-way setup (or 2+arbiter) if free from all these 
> problems?
>

Yes, you can lose one of the three nodes and, after a brief pause,
everything just continues. If you have a second failure before you can
recover, then you have lost quorum.

If that second failure is the other full replica, then you could get
into a situation where the arbiter isn't happy with either copy when
you come back up, and of course the arbiter doesn't hold file data
itself, only metadata. Pavel alluded to something like that when
describing his problem.
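
After any node failure, the first thing I check is what still needs
healing, so I know whether a second failure would hurt (placeholder
volume name again):

    gluster peer status                       # which peers are still in
    gluster volume heal gv0 info              # entries pending heal
    gluster volume heal gv0 info split-brain  # should stay empty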

That is where replica 3 helps. In theory, with replica 3, you could
lose 2 nodes and still have a reasonable copy of your VM, though
you've lost quorum and are still down. At that point, *I* would kill
the two bad nodes (STONITH) to prevent them from coming back AND turn
off quorum. You could then run on the single node until you can
save/copy those VM images, preferably by migrating off that volume
completely. Create a remote pool using SSHFS if you have nothing else
available. THEN I would go back and fix the Gluster cluster and
migrate back into it.
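
Roughly, the rescue sequence I have in mind (hostnames and paths are
invented, and dropping quorum like this is strictly a last resort):

    # let the lone survivor accept writes again
    gluster volume set gv0 cluster.quorum-type none
    gluster volume set gv0 cluster.server-quorum-type none

    # improvise a landing area on any box reachable over SSH
    sshfs backup@rescuebox:/srv/vm-rescue /mnt/rescue

    # copy the images off, compacting them along the way
    qemu-img convert -p -O qcow2 /mnt/gv0/images/vm1.img \
        /mnt/rescue/vm1.qcow2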

Replica 2 vs. Replica 3 does not matter if you lose your Gluster
network switch, but again the arbiter or Rep3 setup makes it easier to
recover. I suppose the only advantage of Replica 2 is that you can use
a crossover cable and not worry about losing the switch, but
bonding/teaming works well and there are bonding modes that don't
require the same switch for the bond slaves. So you can build in some
redundancy there as well.



