[Gluster-users] New GlusterFS deployment, doubts on 1 brick per host vs 1 brick per drive.

Thu Sep 10 20:26:00 UTC 2020

Hi,
thanks to you both for the replies.

On Thu, 10 Sep 2020 at 16:08, Darrell Budic <budic at onholyground.com> wrote:

> I run ZFS on my servers (with additional RAM for that reason) in my
> replica-3 production cluster. I choose size and ZFS striping of HDDs, with
> easier compression and ZFS controlled caching using SSDs for my workload
> (mainly VMs). It performs as expected, but I don’t have the resources to
> head-to-head it against a bunch of individual bricks for testing. One
> drawback to large bricks is that it can take longer to heal, in my
> experience. I also run some smaller volumes on SSDs for VMs with databases
> and other high-IOPS workloads, and for those I use tuned XFS volumes
> because I didn’t want compression and did want faster healing.
>
> With the options you’ve presented, I’d use XFS on single bricks, there’s
> not much need for the overhead unless you REALLY want ZFS compression, and
> ZFS if you wanted higher performing volumes, mirrors, or had some cache to
> take advantage of. Or you knew your workload could take advantage of the
> things ZFS is good at, like setting specific record sizes tuned to your
> workload on sub-volumes. But that depends on how you’re planning to
> consume your storage, as file shares or as disk images. The ultimate way to
> find out, of course, is to test each configuration and see which gives you
> the most of what you want :)

Yes, ZFS (or btrfs) was for compression, but also for the added robustness
provided by checksums. I didn't mention btrfs, but I'm comfortable with it
for simple volumes with compression. I imagine, though, that there isn't a
large user base of GlusterFS + btrfs.

This is a mostly cold dataset with lots of uncompressed training data for
ML.
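
Whether compression pays off depends on how compressible the data really is, and that can be estimated on a sample before choosing a filesystem. The following is a minimal sketch under stated assumptions: `zlib` is only a stand-in (ZFS typically uses lz4 or gzip, btrfs offers zlib, lzo or zstd, so real ratios will differ), and the two samples and the function name `estimate_compression_ratio` are made up for illustration.

```python
import os
import zlib


def estimate_compression_ratio(data: bytes, level: int = 6) -> float:
    """Return original_size / compressed_size for an in-memory sample.

    zlib is a rough proxy only; filesystem-level compression works per
    record/extent with a different algorithm, so treat this as a hint.
    """
    if not data:
        raise ValueError("sample is empty")
    compressed = zlib.compress(data, level)
    return len(data) / len(compressed)


# Hypothetical samples: repetitive CSV-like text vs. incompressible bytes
# (the latter behaves like already-compressed images or archives).
text_like = b"label,feature_a,feature_b\n" * 4096
random_like = os.urandom(64 * 1024)

print("text-like ratio:  ", round(estimate_compression_ratio(text_like), 1))
print("random-like ratio:", round(estimate_compression_ratio(random_like), 2))
```

A ratio close to 1 on a representative sample of the real dataset would suggest that compression adds overhead without saving much space.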

There is one argument for a big, fat, internally redundant (ZFS) brick:
there is more widespread knowledge of how to manage failed drives on ZFS.
Given my inexperience with GlusterFS, this management side is one of the
inputs I was seeking. I didn't see in the docs how to add spare drives, or
what happens when a brick dies: what type of healing exists, and what
happens if, for example, there isn't a replacement drive.
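
For reference, the brick-replacement flow on a replicated volume can be sketched roughly as below. This is a minimal sketch, not a tested procedure: the volume name `mlvol`, the host `server1` and the brick paths are hypothetical, and the code only builds the command lines instead of running them. It assumes the standard `gluster volume replace-brick ... commit force` and `gluster volume heal` subcommands.

```python
import shlex
from typing import List


def replace_brick_cmd(volume: str, old_brick: str, new_brick: str) -> List[str]:
    """Command that swaps a dead brick for a new, empty one."""
    return ["gluster", "volume", "replace-brick", volume,
            old_brick, new_brick, "commit", "force"]


def heal_info_cmd(volume: str) -> List[str]:
    """Command that lists the entries still waiting to be healed."""
    return ["gluster", "volume", "heal", volume, "info"]


def full_heal_cmd(volume: str) -> List[str]:
    """Command that asks the self-heal daemon for a full crawl."""
    return ["gluster", "volume", "heal", volume, "full"]


# Hypothetical names, for illustration only.
steps = [
    replace_brick_cmd("mlvol",
                      "server1:/data/drive3/mlvol",
                      "server1:/data/drive9/mlvol"),
    full_heal_cmd("mlvol"),
    heal_info_cmd("mlvol"),
]
for step in steps:
    print(shlex.join(step))
```

After the replacement, the new brick is repopulated from the surviving replicas, which is why larger bricks can take longer to heal.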

>
> And definitely get a 3rd server in there with at least enough storage to
> be an arbiter. At the level you’re talking, I’d try and deck it out
> properly and have 3 active hosts off the bat so you can have a proper
> redundancy scheme. Split brain more than sucks.

Agreed, I'm aware of split brain. I will add additional nodes ASAP; it is
already planned.
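
To make the arbiter suggestion concrete, here is a minimal sketch of how the brick list for a "1 brick per drive" distributed-replicated volume with `replica 3 arbiter 1` could be generated. The host names, brick paths, volume name and drive count are hypothetical. It relies on the Gluster convention that bricks are grouped in the order given and that, with `arbiter 1`, every third brick in a group is the arbiter.

```python
import shlex
from typing import List, Tuple


def arbiter_create_cmd(volume: str,
                       data_hosts: Tuple[str, str],
                       arbiter_host: str,
                       drives_per_host: int) -> List[str]:
    """Build a 'gluster volume create' command with one replica set per drive.

    Each replica set is (data brick, data brick, arbiter brick), so the two
    data copies always land on different hosts.
    """
    host_a, host_b = data_hosts
    bricks: List[str] = []
    for drive in range(1, drives_per_host + 1):
        bricks.append(f"{host_a}:/data/drive{drive}/{volume}")
        bricks.append(f"{host_b}:/data/drive{drive}/{volume}")
        bricks.append(f"{arbiter_host}:/data/arbiter{drive}/{volume}")
    return ["gluster", "volume", "create", volume,
            "replica", "3", "arbiter", "1", *bricks]


# Hypothetical layout: two data hosts with 2 drives each, plus one arbiter.
cmd = arbiter_create_cmd("mlvol", ("server1", "server2"), "server3", 2)
print(shlex.join(cmd))
```

The arbiter bricks hold only metadata, so the third host needs far less storage than the two data hosts.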

>
>
>  -Darrell
>
> > On Sep 10, 2020, at 1:33 AM, Diego Zuccato <diego.zuccato at unibo.it>
> wrote:
> >
> > On 09/09/20 15:30, Miguel Mascarenhas Filipe wrote:
> >
> > I'm a noob, but IIUC this is the option giving the best performance:
> >
> >> 2. 1 brick per drive, Gluster "distributed replicated" volumes, no
> >> internal redundancy
> >
> > Clients can write to both servers in parallel and read scattered (read
> > performance using multiple files ~ 16x vs 2x with a single disk per
> > host). Moreover it's easier to extend.
> > But why ZFS instead of XFS ? In my experience it's heavier.
> >
> > PS: add a third host ASAP, at least for arbiter volumes (replica 3
> > arbiter 1). Split brain can be a real pain to fix!
> >
> > --
> > Diego Zuccato
> > DIFA - Dip. di Fisica e Astronomia
> > Servizi Informatici
> > Alma Mater Studiorum - Università di Bologna
> > V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
> > tel.: +39 051 20 95786
>
> --
Miguel Mascarenhas Filipe