[Gluster-users] Two VMS as arbiter...

Gilberto Nunes gilberto.nunes32 at gmail.com
Thu Aug 6 18:15:33 UTC 2020


The options that worked best in my tests to avoid split-brain were as
follows:

gluster vol set VMS cluster.heal-timeout 20
gluster volume heal VMS enable
gluster vol set VMS cluster.quorum-reads false
gluster vol set VMS cluster.quorum-count 1
gluster vol set VMS network.ping-timeout 2
gluster volume set VMS cluster.favorite-child-policy mtime
gluster volume heal VMS granular-entry-heal enable
gluster volume set VMS cluster.data-self-heal-algorithm full
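
To double-check that all of the above actually took effect, the standard
query commands can be used, e.g. (the grep filter is just for readability):

gluster volume info VMS
gluster volume get VMS all | grep -E 'quorum|favorite-child|heal|ping-timeout'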

Here:
gluster volume set VMS cluster.favorite-child-policy mtime
I had used "size" before, but I read in several places that mtime is better...

I ran several exhaustive tests: powering hosts off, migrating VMs,
creating folders and files inside the VMs, activating HA, etc.
After the "crash", i.e. after the host that was restarted/shut down comes
back, the volume looks like this:
Brick pve02:/DATA/brick
/images/100/vm-100-disk-0.qcow2 - Possibly undergoing heal
Status: Connected
Number of entries: 1

This indicates that healing is taking place...
After a few minutes or hours, depending on hardware speed, the "possibly
undergoing heal" message disappears...

But at no time was there data loss...
While the heal was still in progress I migrated the VM from one host to the
other, also without problems...

In the tests I performed here, healing a 10 GB VM disk with 4 GB in use
took 30 minutes...
Keep in mind that I'm running this in VirtualBox, with 2 VMs of 2 GB of RAM
each, and each VM is a Proxmox host.
In a real environment this time is much shorter, and it also depends on the
size of the VM's disk!

Cheers

---
Gilberto Nunes Ferreira


On Thu, Aug 6, 2020 at 14:14, Strahil Nikolov <hunter86_bg at yahoo.com>
wrote:

> The settings I have in my group are:
> [root at ovirt1 ~]# cat /var/lib/glusterd/groups/virt
> performance.quick-read=off
> performance.read-ahead=off
> performance.io-cache=off
> performance.low-prio-threads=32
> network.remote-dio=enable
> cluster.eager-lock=enable
> cluster.quorum-type=auto
> cluster.server-quorum-type=server
> cluster.data-self-heal-algorithm=full
> cluster.locking-scheme=granular
> cluster.shd-max-threads=8
> cluster.shd-wait-qlength=10000
> features.shard=on
> user.cifs=off
> cluster.choose-local=off
> client.event-threads=4
> server.event-threads=4
> performance.client-io-threads=on
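>
> To apply the whole group in one go (assuming the group file above is
> present under /var/lib/glusterd/groups/), the 'group' keyword should do it:
>
> gluster volume set VMS group virt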
>
> I'm not sure whether sharded files are treated as big files or not. If your
> brick disks are faster than your network bandwidth, you can enable
> 'cluster.choose-local'.
>
> Keep in mind that some users report issues with sparse qcow2 images during
> intensive writes (the suspicion is that the shard xlator cannot create the
> shards fast enough -> the default shard size (64MB) is way smaller than
> Red Hat's supported size, which is 512MB), so I would recommend using
> preallocated qcow2 disks as much as possible, or bumping the shard size.
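>
> Bumping the shard size would look something like the following (note that
> it only affects newly created files; existing shards keep their size):
>
> gluster volume set VMS features.shard-block-size 512MB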
>
> Sharding was  developed especially for Virt usage.
>
> Consider using another cluster.favorite-child-policy, as all shards
> have the same size.
>
> Best Regards,
> Strahil Nikolov
>
>
>
> On August 6, 2020 at 16:37:07 GMT+03:00, Gilberto Nunes <
> gilberto.nunes32 at gmail.com> wrote:
> >Oh I see... I was confused by the terms... Now that I've read these,
> >everything becomes clear...
> >
> >
> https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Features/shard/
> >
> >
> https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html/configuring_red_hat_virtualization_with_red_hat_gluster_storage/chap-hosting_virtual_machine_images_on_red_hat_storage_volumes
> >
> >
> >Should I enable cluster.granular-entry-heal too, since I am working
> >with big files?
> >
> >Thanks
> >
> >---
> >Gilberto Nunes Ferreira
> >
> >(47) 3025-5907
> >(47) 99676-7530 - Whatsapp / Telegram
> >
> >Skype: gilberto.nunes36
> >
> >
> >
> >
> >
> >On Thu, Aug 6, 2020 at 09:32, Gilberto Nunes <
> >gilberto.nunes32 at gmail.com> wrote:
> >
> >> What do you mean by "sharding"? Do you mean sharing folders between two
> >> servers to host qcow2 or raw VM images?
> >> Here I am using Proxmox, which uses QEMU but not virsh.
> >>
> >> Thanks
> >> ---
> >> Gilberto Nunes Ferreira
> >>
> >> (47) 3025-5907
> >> (47) 99676-7530 - Whatsapp / Telegram
> >>
> >> Skype: gilberto.nunes36
> >>
> >>
> >>
> >>
> >>
> >> On Thu, Aug 6, 2020 at 01:09, Strahil Nikolov <
> >> hunter86_bg at yahoo.com> wrote:
> >>
> >>> As you mentioned qcow2 files, check the virt group
> >>> (/var/lib/glusterd/groups/virt). It has optimal settings
> >>> for VMs and is used by oVirt.
> >>>
> >>> WARNING: If you decide to enable the group, which will also enable
> >>> sharding, NEVER EVER DISABLE SHARDING -> ONCE ENABLED, IT STAYS ENABLED!!!
> >>> Sharding helps reduce locking during replica heals.
> >>>
> >>> WARNING2: As the virt group uses sharding (which splits files into
> >>> fixed shard-sized pieces), you should consider setting
> >>> cluster.favorite-child-policy to ctime or mtime.
> >>>
> >>> Best Regards,
> >>> Strahil Nikolov
> >>>
> >>> On August 6, 2020 at 1:56:58 GMT+03:00, Gilberto Nunes <
> >>> gilberto.nunes32 at gmail.com> wrote:
> >>> >OK... Thanks a lot, Strahil.
> >>> >
> >>> >This "gluster volume set VMS cluster.favorite-child-policy size" did
> >>> >the trick for me here!
> >>> >
> >>> >Cheers
> >>> >---
> >>> >Gilberto Nunes Ferreira
> >>> >
> >>> >(47) 3025-5907
> >>> >(47) 99676-7530 - Whatsapp / Telegram
> >>> >
> >>> >Skype: gilberto.nunes36
> >>> >
> >>> >
> >>> >
> >>> >
> >>> >
> >>> >On Wed, Aug 5, 2020 at 18:15, Strahil Nikolov <hunter86_bg at yahoo.com>
> >>> >wrote:
> >>> >
> >>> >> This could happen if you have pending heals. Did you reboot that
> >>> >> node recently?
> >>> >> Did you set up automatic split-brain resolution?
> >>> >>
> >>> >> Check for pending heals and for files in split-brain.
> >>> >>
> >>> >> If not, you can check
> >>> >> https://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/
> >>> >> (look at point 5).
> >>> >>
> >>> >> Best Regards,
> >>> >> Strahil Nikolov
> >>> >>
> >>> >> On August 5, 2020 at 23:41:57 GMT+03:00, Gilberto Nunes <
> >>> >> gilberto.nunes32 at gmail.com> wrote:
> >>> >> >I'm in trouble here.
> >>> >> >When I shut down the pve01 server, the shared folder over glusterfs
> >>> >> >is EMPTY!
> >>> >> >There is supposed to be a qcow2 file inside it.
> >>> >> >The content shows up correctly again just after I power pve01 back on...
> >>> >> >
> >>> >> >Some advice?
> >>> >> >
> >>> >> >
> >>> >> >Thanks
> >>> >> >
> >>> >> >---
> >>> >> >Gilberto Nunes Ferreira
> >>> >> >
> >>> >> >(47) 3025-5907
> >>> >> >(47) 99676-7530 - Whatsapp / Telegram
> >>> >> >
> >>> >> >Skype: gilberto.nunes36
> >>> >> >
> >>> >> >
> >>> >> >
> >>> >> >
> >>> >> >
> >>> >> >On Wed, Aug 5, 2020 at 11:07, Gilberto Nunes <
> >>> >> >gilberto.nunes32 at gmail.com> wrote:
> >>> >> >
> >>> >> >> Well...
> >>> >> >> I did the following:
> >>> >> >>
> >>> >> >> gluster vol create VMS replica 3 arbiter 1 pve01:/DATA/brick1
> >>> >> >> pve02:/DATA/brick1.5 pve01:/DATA/arbiter1.5 pve02:/DATA/brick2
> >>> >> >> pve01:/DATA/brick2.5 pve02:/DATA/arbiter2.5 force
> >>> >> >>
> >>> >> >> And now I have:
> >>> >> >> gluster vol info
> >>> >> >>
> >>> >> >> Volume Name: VMS
> >>> >> >> Type: Distributed-Replicate
> >>> >> >> Volume ID: 1bd712f5-ccb9-4322-8275-abe363d1ffdd
> >>> >> >> Status: Started
> >>> >> >> Snapshot Count: 0
> >>> >> >> Number of Bricks: 2 x (2 + 1) = 6
> >>> >> >> Transport-type: tcp
> >>> >> >> Bricks:
> >>> >> >> Brick1: pve01:/DATA/brick1
> >>> >> >> Brick2: pve02:/DATA/brick1.5
> >>> >> >> Brick3: pve01:/DATA/arbiter1.5 (arbiter)
> >>> >> >> Brick4: pve02:/DATA/brick2
> >>> >> >> Brick5: pve01:/DATA/brick2.5
> >>> >> >> Brick6: pve02:/DATA/arbiter2.5 (arbiter)
> >>> >> >> Options Reconfigured:
> >>> >> >> cluster.quorum-count: 1
> >>> >> >> cluster.quorum-reads: false
> >>> >> >> cluster.self-heal-daemon: enable
> >>> >> >> cluster.heal-timeout: 10
> >>> >> >> storage.fips-mode-rchecksum: on
> >>> >> >> transport.address-family: inet
> >>> >> >> nfs.disable: on
> >>> >> >> performance.client-io-threads: off
> >>> >> >>
> >>> >> >> These values I set myself, to see if I could improve the time it
> >>> >> >> takes for the volume to become available when pve01 goes down (via
> >>> >> >> ifupdown):
> >>> >> >> cluster.quorum-count: 1
> >>> >> >> cluster.quorum-reads: false
> >>> >> >> cluster.self-heal-daemon: enable
> >>> >> >> cluster.heal-timeout: 10
> >>> >> >>
> >>> >> >> Nevertheless, it took more than 1 minute for the VMS volume to
> >>> >> >> become available on the other host (pve02).
> >>> >> >> Is there any trick to reduce this time?
> >>> >> >>
> >>> >> >> Thanks
> >>> >> >>
> >>> >> >> ---
> >>> >> >> Gilberto Nunes Ferreira
> >>> >> >>
> >>> >> >>
> >>> >> >>
> >>> >> >>
> >>> >> >>
> >>> >> >>
> >>> >> >> On Wed, Aug 5, 2020 at 08:57, Gilberto Nunes <
> >>> >> >> gilberto.nunes32 at gmail.com> wrote:
> >>> >> >>
> >>> >> >>> Hmm, I see... like this:
> >>> >> >>> [image: image.png]
> >>> >> >>> ---
> >>> >> >>> Gilberto Nunes Ferreira
> >>> >> >>>
> >>> >> >>> (47) 3025-5907
> >>> >> >>> (47) 99676-7530 - Whatsapp / Telegram
> >>> >> >>>
> >>> >> >>> Skype: gilberto.nunes36
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> On Wed, Aug 5, 2020 at 02:14, Computerisms Corporation <
> >>> >> >>> bob at computerisms.ca> wrote:
> >>> >> >>>
> >>> >> >>>> check the example of the chained configuration on this page:
> >>> >> >>>>
> >>> >> >>>> https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html/administration_guide/creating_arbitrated_replicated_volumes
> >>> >> >>>>
> >>> >> >>>> and apply it to two servers...
> >>> >> >>>>
> >>> >> >>>> On 2020-08-04 8:25 p.m., Gilberto Nunes wrote:
> >>> >> >>>> > Hi Bob!
> >>> >> >>>> >
> >>> >> >>>> > Could you please send me more details about this configuration?
> >>> >> >>>> > I would appreciate that!
> >>> >> >>>> >
> >>> >> >>>> > Thank you
> >>> >> >>>> > ---
> >>> >> >>>> > Gilberto Nunes Ferreira
> >>> >> >>>> >
> >>> >> >>>> > (47) 3025-5907
> >>> >> >>>> > (47) 99676-7530 - Whatsapp / Telegram
> >>> >> >>>> >
> >>> >> >>>> > Skype: gilberto.nunes36
> >>> >> >>>> >
> >>> >> >>>> >
> >>> >> >>>> >
> >>> >> >>>> >
> >>> >> >>>> >
> >>> >> >>>> > On Tue, Aug 4, 2020 at 23:47, Computerisms Corporation
> >>> >> >>>> > <bob at computerisms.ca> wrote:
> >>> >> >>>> >
> >>> >> >>>> >     Hi Gilberto,
> >>> >> >>>> >
> >>> >> >>>> >     My understanding is there can only be one arbiter per
> >>> >> >>>> >     replicated set.  I don't have a lot of practice with
> >>> >> >>>> >     gluster, so this could be bad advice, but the way I dealt
> >>> >> >>>> >     with it on my two servers was to use 6 bricks as
> >>> >> >>>> >     distributed-replicated (this is also relatively easy to
> >>> >> >>>> >     migrate to 3 servers if that happens for you in the future):
> >>> >> >>>> >
> >>> >> >>>> >     Server1     Server2
> >>> >> >>>> >     brick1      brick1.5
> >>> >> >>>> >     arbiter1.5  brick2
> >>> >> >>>> >     brick2.5    arbiter2.5
> >>> >> >>>> >
> >>> >> >>>> >     On 2020-08-04 7:00 p.m., Gilberto Nunes wrote:
> >>> >> >>>> >      > Hi there.
> >>> >> >>>> >      > I have two physical servers deployed as replica 2 and,
> >>> >> >>>> >      > obviously, I got a split-brain.
> >>> >> >>>> >      > So I am thinking of using two virtual machines, each one
> >>> >> >>>> >      > on a physical server...
> >>> >> >>>> >      > Then these two VMs would act as arbiters of the gluster
> >>> >> >>>> >      > set...
> >>> >> >>>> >      >
> >>> >> >>>> >      > Is this doable?
> >>> >> >>>> >      >
> >>> >> >>>> >      > Thanks
> >>> >> >>>> >      >
> >>> >> >>>>
> >>> >> >>>
> >>> >>
> >>>
> >>
>