[Gluster-users] Failure of one brick leads to VM crashes
Dominique Roux
dominique.roux at ungleich.ch
Tue Feb 16 15:17:27 UTC 2016
Hi guys
Thanks a lot for your help.
I will now update our servers to glusterfs 3.7.8 and then add the 3rd
server as an arbiter.
I will update you after that.
Thanks a lot
Dominique
Become part of modern working in Glarnerland at www.digitalglarus.ch!
Read the news on Twitter: www.twitter.com/DigitalGlarus
Join the discussion on Facebook: www.facebook.com/digitalglarus
On 02/11/2016 06:03 PM, Krutika Dhananjay wrote:
> Hi Dominique,
>
> I saw the attached logs. At some point all bricks seem to have gone
> down as I see
> [2016-01-31 16:17:20.907680] E [MSGID: 108006]
> [afr-common.c:3999:afr_notify] 0-cluster1-replicate-0: All subvolumes
> are down. Going offline until atleast one of them comes back up.
> in the client logs.
>
> This *may* have been the reason for the VMs going offline.
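>
> (A quick way to confirm this on the servers when it happens is to check
> the brick status, e.g.
> # gluster volume status cluster1
> which reports whether each brick process is online.)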
>
> Also, Steve's inputs are correct with regard to the distinction between
> server quorum and client quorum. It is usually recommended that you do
> the following things when using Gluster for the VM store use case:
>
> i) Use a replica 3 (as opposed to replica 2) volume. In your case the
> third node should also host a brick of the volume.
> You can use the arbiter feature if you want to minimise the cost of
> investing in three machines.
> Check this out:
> https://gluster.readthedocs.org/en/release-3.7.0/Features/afr-arbiter-volumes/
>
> Also, if you plan to use the arbiter feature, it is recommended that you
> do so with glusterfs-3.7.8, as it contains some critical bug fixes.
>
> ii) Once you're done with (i), enable the 'virt' group of options on the
> volume:
> # gluster volume set <VOLNAME> group virt
> This initialises the volume configuration specifically meant for the VM
> store use case (including the right quorum options) in one step; a
> combined command sketch for (i)-(iii) follows after this list.
>
> iii) Have you tried sharding yet? If not, you could give that a try too.
> It has been found to be useful for the VM store workload.
> Check this out:
> http://blog.gluster.org/2015/12/introducing-shard-translator/
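>
> To make (i)-(iii) concrete, here is a minimal command sketch. cluster1 is
> your existing volume; <NEWVOL>, <brick> and srv03 are placeholders for a
> new volume, its brick paths and your third server. Option names should
> be double-checked against your release with `gluster volume set help`:
>
> # gluster volume create <NEWVOL> replica 3 arbiter 1 \
>       srv01:/<brick> srv02:/<brick> srv03:/<brick>
>   (arbiter syntax when creating a fresh replica 3 volume; the arbiter
>   brick stores only metadata, so it needs far less space)
> # gluster volume set cluster1 group virt
>   (applies the predefined virt option group in one step)
> # gluster volume set cluster1 features.shard on
> # gluster volume set cluster1 features.shard-block-size 512MB
>   (512MB is a commonly suggested shard block size for VM images)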
>
> Let me know if this works for you.
>
> -Krutika
>
>
> ------------------------------------------------------------------------
>
> From: "Steve Dainard" <sdainard at spd1.com>
> To: "Dominique Roux" <dominique.roux at ungleich.ch>
> Cc: "gluster-users at gluster.org List" <gluster-users at gluster.org>
> Sent: Thursday, February 11, 2016 3:52:18 AM
> Subject: Re: [Gluster-users] Failure of one brick leads to VM crashes
>
> For what it's worth, I've never been able to lose a brick in a 2 brick
> replica volume and still be able to write data.
>
> I've also found the documentation confusing as to what 'Option:
> cluster.server-quorum-type' actually means.
> Default Value: (null)
> Description: This feature is on the server-side i.e. in glusterd.
> Whenever the glusterd on a machine observes that the quorum is not
> met, it brings down the bricks to prevent data split-brains. When the
> network connections are brought back up and the quorum is restored the
> bricks in the volume are brought back up.
>
> It seems to be implying a brick quorum, but I think it actually means
> a glusterd quorum. In other words, if 2 of 3 glusterd processes fail,
> the bricks are taken offline. This would seem to make sense in your
> configuration.
>
> But
>
> There are also two other quorum settings which seem to be more focused
> on brick count/ratio to form quorum:
>
> Option: cluster.quorum-type
> Default Value: none
> Description: If value is "fixed" only allow writes if quorum-count
> bricks are present. If value is "auto" only allow writes if more than
> half of bricks, or exactly half including the first, are present.
>
> Option: cluster.quorum-count
> Default Value: (null)
> Description: If quorum-type is "fixed" only allow writes if this many
> bricks are present. Other quorum types will OVERWRITE this value.
>
> So you might be able to set the type to 'fixed' and the count to '1',
> and with cluster.server-quorum-type: server already enabled, get what
> you want.
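>
> For reference, a minimal sketch of those settings on your volume (purely
> illustrative, not a recommendation):
>
> # gluster volume set cluster1 cluster.quorum-type fixed
> # gluster volume set cluster1 cluster.quorum-count 1
>
> With quorum-count set to 1, clients keep writing as long as a single
> brick is reachable, which is also exactly how the two copies can end up
> diverging.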
>
> But again, I've never had this work properly, and I always ended up
> with split-brains, which are difficult to resolve when you're storing
> VM images rather than files.
>
> Your other options are: use your 3rd server as another brick and do
> replica 3 (which I've had good success with); see the sketch below.
>
> Or, seeing as you're using 3.7, you could look into arbiter nodes, if
> they're stable in the current version.
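>
> A minimal sketch of the first option, assuming the third server is named
> srv03 and uses the same brick path as the existing nodes:
>
> # gluster volume add-brick cluster1 replica 3 srv03:/home/gluster
> # gluster volume heal cluster1 full
>
> The add-brick turns the replica 2 volume into replica 3, and the full
> heal populates the new brick with the existing data.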
>
>
> On Mon, Feb 8, 2016 at 6:20 AM, Dominique Roux
> <dominique.roux at ungleich.ch> wrote:
> > Hi guys,
> >
> > I faced a problem a week ago.
> > In our environment we have three servers in a quorum. The gluster
> > volume is spread over two bricks and is of type replicate.
> >
> > To simulate the failure of one brick, we isolated one of the two
> > bricks with iptables, so that communication with the other two peers
> > was no longer possible (roughly as sketched below).
> > After that, VMs (OpenNebula) which had I/O going on at the time
> > crashed.
> > We stopped glusterfsd hard (kill -9) and restarted it, which made
> > things work again (of course we also had to restart the failed VMs).
> > But I think this shouldn't happen, since quorum was not lost (2/3 of
> > the hosts were still up and connected).
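> >
> > (For illustration, rules of roughly this form, run on the node being
> > isolated, produce that situation; <peer2> and <peer3> stand for the
> > other two servers:
> > # iptables -A INPUT -s <peer2> -j DROP
> > # iptables -A OUTPUT -d <peer2> -j DROP
> > # iptables -A INPUT -s <peer3> -j DROP
> > # iptables -A OUTPUT -d <peer3> -j DROP
> > )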
> >
> > Here is some info about our system:
> > OS: CentOS Linux release 7.1.1503
> > Glusterfs version: glusterfs 3.7.3
> >
> > gluster volume info:
> >
> > Volume Name: cluster1
> > Type: Replicate
> > Volume ID:
> > Status: Started
> > Number of Bricks: 1 x 2 = 2
> > Transport-type: tcp
> > Bricks:
> > Brick1: srv01:/home/gluster
> > Brick2: srv02:/home/gluster
> > Options Reconfigured:
> > cluster.self-heal-daemon: enable
> > cluster.server-quorum-type: server
> > network.remote-dio: enable
> > cluster.eager-lock: enable
> > performance.stat-prefetch: on
> > performance.io-cache: off
> > performance.read-ahead: off
> > performance.quick-read: off
> > server.allow-insecure: on
> > nfs.disable: 1
> >
> > Hope you can help us.
> >
> > Thanks a lot.
> >
> > Best regards
> > Dominique