[Gluster-users] GlusterFS as virtual machine storage
Gionatan Danti
g.danti at assyoma.it
Fri Aug 25 21:08:14 UTC 2017
On 25-08-2017 21:48, WK wrote:
> On 8/25/2017 12:56 AM, Gionatan Danti wrote:
>
>
> We ran Rep2 for years on 3.4. It does work if you are really, really
> careful, but in a crash on one side you might lose some bits that
> were in flight. The VM image would then have to heal.
> Without sharding, big VMs take a while to heal because the WHOLE VM
> file has to be copied over. Then you might get split-brain and have
> to stop the VM, pick the good copy, make sure it is healed on both
> sides and then restart the VM.
OK, so sharding needs to be enabled for VM disk storage; otherwise,
heal time skyrockets.
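
For the record, turning it on looks like a one-liner (the volume name
below is a placeholder, and the option only affects files created
after it is set):

  # enable sharding on the volume (existing files stay unsharded)
  gluster volume set myvol features.shard on
  # optional: shard size; 64MB is the default, oVirt guides often use 512MB
  gluster volume set myvol features.shard-block-size 64MB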
> Arbiter/Replica 3 prevents that. Sharding helps a lot as well by
> making the heals really quick, though in a Replica 2 with sharding
> you no longer have a nice big .img file sitting on each brick in
> plain view, and picking a split-brain winner is now WAY more
> complicated. You would have to re-assemble things.
This concerns me, and it is the reason I would like to avoid sharding.
How can I recover from such a situation? How can I decide which
(reconstructed) file to keep and which to delete?
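
From what I understand of the shard layout, re-assembly would go
something like this (an untested sketch; volume, brick path and image
name are made up):

  # find the GFID of the base file via the virtual xattr on a FUSE mount
  getfattr -n glusterfs.gfid.string /mnt/myvol/vm1.img
  # on a brick, the base file holds the first shard-block-size bytes;
  # the rest live under .shard/ as <gfid>.1, <gfid>.2, ...
  ls /bricks/brick1/.shard | grep <gfid>
  # concatenate the base file and the shards, in order, into one image
  cat /bricks/brick1/images/vm1.img \
      /bricks/brick1/.shard/<gfid>.1 \
      /bricks/brick1/.shard/<gfid>.2 > /tmp/vm1-reassembled.img

And sparse regions may have no shard file at all, which I suppose is
part of why this is painful.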
>
> We were quite good at fixing broken Gluster 3.4 nodes, but we are
> *much* happier with the Arbiter node and sharding. It is a huge
> difference.
> We could go to Rep3, but we like the extra speed and we are
> comfortable with the Arb limitations (we also have excellent
> off-cluster backups <grin>).
>
>
>> Also, on a two-node setup, is it *guaranteed* that updating one
>> node takes the whole volume offline?
>
> If you still have quorum turned on, then yes. One side goes and you are
> down.
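
For reference, I guess the relevant knobs can be checked with
something like this (volume name is a placeholder):

  gluster volume get myvol cluster.quorum-type
  gluster volume get myvol cluster.server-quorum-type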
>
>> On the other hand, a 3-way setup (or 2+arbiter) is free from all
>> these problems?
>>
>
> Yes, you can lose one of the three nodes and after the pause,
> everything just continues. If you have a second failure before you can
> recover, then you have lost quorum.
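
Good to know. I assume that during the pause and the following heal
one can watch progress with the usual commands (volume name is an
example):

  gluster volume heal myvol info
  gluster volume heal myvol info split-brain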
>
> If that second failure is the other actual replica, then you could get
> into a situation where the arbiter isn't happy with either copy when
> you come back up and of course the arbiter doesn't have a good copy
> itself. Pavel alluded to something like that when describing his
> problem.
>
> That is where replica 3 helps. In theory, with replica 3, you could
> lose 2 nodes and still have a reasonable copy of your VM, though
> you've lost quorum and are still down. At that point, *I* would kill
> the two bad nodes (STONITH) to prevent them from coming back AND turn
> off quorum. You could then run on the single node until you can
> save/copy those VM images, preferably by migrating off that volume
> completely. Create a remote pool using SSHFS if you have nothing else
> available. THEN I would go back and fix the Gluster cluster and
> migrate back into it.
>
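If I read that right, the last-resort procedure would translate to
something like this (volume, host and paths are placeholders, and
dropping quorum is obviously a deliberate risk):

  # on the surviving node: drop quorum so the lone brick keeps serving
  gluster volume set myvol cluster.quorum-type none
  gluster volume set myvol cluster.server-quorum-type none
  # mount a remote rescue area over SSH and copy the images off
  sshfs backup@rescuehost:/srv/rescue /mnt/rescue
  cp --sparse=always /mnt/myvol/vm1.img /mnt/rescue/
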
> Replica2/Replica3 does not matter if you lose your Gluster network
> switch, but again the Arb or Rep3 setup makes it easier to recover. I
> suppose the only advantage of Replica2 is that you can use a
> crossover cable and not worry about losing the switch, but
> bonding/teaming works well and there are bonding modes that don't
> require the same switch for the bond slaves. So you can build in some
> redundancy there as well.
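Right, an active-backup bond does not need both legs on the same
switch. With NetworkManager it would be something like this (interface
names are examples):

  nmcli con add type bond ifname bond0 con-name bond0 \
        bond.options "mode=active-backup,miimon=100"
  nmcli con add type bond-slave ifname eth1 master bond0
  nmcli con add type bond-slave ifname eth2 master bond0
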
Thank you for the very valuable information.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti at assyoma.it - info at assyoma.it
GPG public key ID: FF5F32A8