[Gluster-users] any one uses all flash GlusterFS setup?

Tue Mar 30 04:25:43 UTC 2021

MTU -> the largest value that passes without fragmentation (you can check with 'ping -M do -s <size-28>').
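The 28 bytes subtracted from the MTU are the 20-byte IPv4 header (without options) plus the 8-byte ICMP echo header. A minimal sketch of that arithmetic and the resulting ping invocation follows; the function names and the hostname "gluster-peer" are placeholders for illustration only.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def ping_payload_size(mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def ping_command(mtu: int, host: str) -> list:
    # -M do sets the Don't Fragment bit; -s sets the ICMP payload size.
    size = str(ping_payload_size(mtu))
    return ["ping", "-M", "do", "-c", "3", "-s", size, host]


if __name__ == "__main__":
    # "gluster-peer" is a hypothetical hostname.
    for mtu in (1500, 9000):
        print(mtu, " ".join(ping_command(mtu, "gluster-peer")))
```

If the ping with the computed size fails while a smaller size succeeds, some hop on the path is using a smaller MTU than the one configured on the host.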

LACP is also good. In my lab I use layer 3+4 hashing (IP+port) to spread the load over multiple links.
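To illustrate why IP+port hashing spreads load, here is a simplified sketch of a layer 3+4 style hash. It is not the kernel bonding driver's exact algorithm, and the function name, addresses and port numbers are made up for the example. The point is only that each flow maps deterministically to one link, while flows that differ in port can land on different links.

```python
import ipaddress


def pick_link(src_ip, dst_ip, src_port, dst_port, n_links):
    """Simplified layer 3+4 style hash (illustrative, not the kernel's code)."""
    h = src_port ^ dst_port
    h ^= int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    h ^= h >> 16
    h ^= h >> 8
    return h % n_links


if __name__ == "__main__":
    # Same pair of hosts, several client-side ports, two bonded links.
    for src_port in range(1000, 1008):
        link = pick_link("10.0.0.1", "10.0.0.2", src_port, 49152, 2)
        print(f"src_port={src_port} -> link {link}")
```

A single TCP connection always stays on one link, so one stream never exceeds the speed of a single member interface. The gain comes from having many connections between the hosts.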
If you use HW raid, don't forget to align LVM & XFS as per https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.1/html/administration_guide/brick_configuration
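As a rough sketch of what the alignment means for a parity RAID (RAID 5/6): the full stripe is the stripe unit multiplied by the number of data disks, and that value is passed to pvcreate, while mkfs.xfs gets the stripe unit and data-disk count. The device paths, the 128 KiB stripe unit and the 12-disk RAID 6 layout below are assumptions for illustration; take the real values from your controller and check the linked guide for the full option set.

```python
def raid_alignment(stripe_unit_kib: int, total_disks: int, parity_disks: int):
    """Return (data_disks, full_stripe_kib) for a parity RAID set."""
    data_disks = total_disks - parity_disks
    return data_disks, stripe_unit_kib * data_disks


def alignment_commands(device, lv_path, stripe_unit_kib, total_disks, parity_disks):
    data_disks, full_stripe_kib = raid_alignment(
        stripe_unit_kib, total_disks, parity_disks
    )
    return [
        # Align the start of the LVM data area to a full RAID stripe.
        f"pvcreate --dataalignment {full_stripe_kib}K {device}",
        # Tell XFS the stripe unit (su) and the number of data disks (sw).
        f"mkfs.xfs -d su={stripe_unit_kib}k,sw={data_disks} {lv_path}",
    ]


if __name__ == "__main__":
    # Hypothetical layout: 12-disk RAID 6, 128 KiB stripe unit.
    for cmd in alignment_commands("/dev/sdb", "/dev/vg_bricks/lv_brick1", 128, 12, 2):
        print(cmd)
```

The sketch only prints the commands so they can be reviewed before anything is run against a real device.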
Also consider disabling transparent huge pages and tuning C-states on the hosts (check the relevant vendor articles).
P.S.: With the new RH developer program, you can access all RH solutions.
Best Regards,
Strahil Nikolov

On Mon, Mar 29, 2021 at 22:38, Arman Khalatyan <arm2arm at gmail.com> wrote:

Great hints, thanks a lot, going to test all this tomorrow. I have an appointment with the IT department to enable the LACP bonds on the 10gig dual-port interfaces, so next I will test the latency and SSD+XFS+GlusterFS. We did not touch the default MTU. I did some tests with CentOS 7.3 on IB with a 65k MTU in connected-mode IPoIB; it was OK but not stable under large workloads. Maybe the situation has changed by now. Are there any suggestions on LACP bonds and MTU sizes?

Strahil Nikolov <hunter86_bg at yahoo.com> wrote on Mon, 29 March 2021, 19:02:

I guess there is no need to mention that latency is the real killer of Gluster. What is the DC-to-DC latency and MTU (Ethernet)?
Also, if you use SSDs, consider using the noop/none I/O schedulers.
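To see which scheduler the SSDs currently use, the kernel exposes it under /sys/block/<device>/queue/scheduler, with the active one shown in square brackets. The sketch below reads those files for non-rotational devices; the function names are my own, and it only reports, it does not change anything. On multi-queue (blk-mq) kernels the value to look for is "none", on older single-queue kernels it is "noop".

```python
from pathlib import Path


def active_scheduler(text: str) -> str:
    """Return the scheduler shown in [brackets], e.g. 'mq-deadline [none]'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # Fallback: no brackets found, return the raw content.
    return text.strip()


def report_ssd_schedulers(sys_block="/sys/block"):
    """Map each non-rotational block device to its active scheduler."""
    result = {}
    for dev in sorted(Path(sys_block).glob("*")):
        rotational = dev / "queue" / "rotational"
        scheduler = dev / "queue" / "scheduler"
        if not (rotational.is_file() and scheduler.is_file()):
            continue
        if rotational.read_text().strip() == "0":
            result[dev.name] = active_scheduler(scheduler.read_text())
    return result


if __name__ == "__main__":
    for name, sched in report_ssd_schedulers().items():
        print(f"{name}: {sched}")
```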
Also, you can obtain the tuned profiles used in Red Hat Gluster Storage via this source rpm:
http://ftp.redhat.com/redhat/linux/enterprise/7Server/en/RHS/SRPMS/redhat-storage-server-3.5.0.0-7.el7rhgs.src.rpm
You can combine the settings from the tuned profile for hypervisors with the Gluster random I/O tuned profile.
Also worth mentioning: RHGS uses a 512M shard size, while the default in upstream Gluster is just 64M. Some oVirt users have reported issues, and the suspected cause is Gluster's inability to create enough shards.
WARNING: ONCE SHARDING IS ENABLED, NEVER EVER DISABLE IT.
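To give a feeling for the difference between the two shard sizes, here is a small calculation of how many pieces a fully written disk image is split into. The 1 TiB image size is just an example, and the number is an upper bound, since a sparse image only gets shards for the regions that have been written.

```python
MIB = 1024 ** 2
TIB = 1024 ** 4


def shard_count(file_size_bytes: int, shard_size_bytes: int) -> int:
    """Number of pieces needed to hold a fully written file (rounded up)."""
    return -(-file_size_bytes // shard_size_bytes)


if __name__ == "__main__":
    image = 1 * TIB  # hypothetical VM disk size
    for shard_mib in (64, 512):
        pieces = shard_count(image, shard_mib * MIB)
        print(f"{shard_mib}M shards: {pieces} pieces")
```

With 64M shards the same image needs eight times as many pieces as with 512M shards, so far more files have to be created while the disk is being filled.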
Best Regards,
Strahil Nikolov

On Mon, Mar 29, 2021 at 11:03, Arman Khalatyan <arm2arm at gmail.com> wrote:

Thanks Strahil, good point on choose-local, we will definitely try it. The connection is 10Gbit, plus FDR InfiniBand (IPoIB will be used). We are still experimenting with 2 buildings and 8 oVirt nodes, and changing the number of bricks on GlusterFS.

Strahil Nikolov <hunter86_bg at yahoo.com> wrote on Sun, 28 March 2021, 00:35:

It's worth mentioning that if your network bandwidth is smaller than the RAID bandwidth, you can consider enabling cluster.choose-local (which oVirt's optimizations disable) for faster reads.
Some people would also consider going with JBOD (replica 3) mode. I guess you can test both prior to moving to production.
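The choose-local rule of thumb above can be sketched as a simple comparison. The bandwidth figures and the volume name "vmstore" are hypothetical; measure your own network and RAID throughput before deciding. The sketch only builds the gluster command line so it can be reviewed, it does not run it.

```python
def should_enable_choose_local(network_gbit_s: float, raid_gbyte_s: float) -> bool:
    """True when the network is slower than the local RAID set."""
    network_gbyte_s = network_gbit_s / 8  # convert Gbit/s to GB/s
    return network_gbyte_s < raid_gbyte_s


def choose_local_command(volume: str, enable: bool) -> list:
    value = "on" if enable else "off"
    return ["gluster", "volume", "set", volume, "cluster.choose-local", value]


if __name__ == "__main__":
    # Hypothetical numbers: 10 Gbit/s network, 3 GB/s SSD RAID.
    enable = should_enable_choose_local(10, 3.0)
    print(" ".join(choose_local_command("vmstore", enable)))
```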

P.S.: Don't forget to align the LVM/FS layer to the hardware RAID.
Best Regards,
Strahil Nikolov
