[Gluster-users] Gluster Performance - 12 Gbps SSDs and 10 Gbps NIC

Gilberto Ferreira gilberto.nunes32 at gmail.com
Tue Dec 12 20:12:11 UTC 2023


Ah, that's nice.
Does anybody know if this can be achieved with two servers?

---
Gilberto Nunes Ferreira
(47) 99676-7530 - Whatsapp / Telegram






On Tue, Dec 12, 2023 at 5:08 PM Danny <dbray925+gluster at gmail.com>
wrote:

> Wow, HUGE improvement with NFS-Ganesha!
>
> sudo dnf -y install glusterfs-ganesha
> sudo vim /etc/ganesha/ganesha.conf
>
> NFS_CORE_PARAM {
>     mount_path_pseudo = true;
>     Protocols = 3,4;
> }
> EXPORT_DEFAULTS {
>     Access_Type = RW;
> }
>
> LOG {
>     Default_Log_Level = WARN;
> }
>
> EXPORT{
>     Export_Id = 1 ;     # Export ID unique to each export
>     Path = "/data";     # Path of the volume to be exported
>
>     FSAL {
>         name = GLUSTER;
>         hostname = "localhost"; # IP of one of the nodes in the trusted pool
>         volume = "data";        # Volume name. Eg: "test_volume"
>     }
>
>     Access_type = RW;           # Access permissions
>     Squash = No_root_squash;    # To enable/disable root squashing
>     Disable_ACL = TRUE;         # To enable/disable ACL
>     Pseudo = "/data";           # NFSv4 pseudo path for this export
>     Protocols = "3","4" ;       # NFS protocols supported
>     Transports = "UDP","TCP" ;  # Transport protocols supported
>     SecType = "sys";            # Security flavors supported
> }
>
>
> sudo systemctl enable --now nfs-ganesha
> sudo vim /etc/fstab
>
> localhost:/data   /data   nfs   defaults,_netdev   0 0
>
> sudo systemctl daemon-reload
> sudo mount -a
>
> fio --name=test --filename=/data/wow --size=1G --readwrite=write
>
> Run status group 0 (all jobs):
>   WRITE: bw=2246MiB/s (2355MB/s), 2246MiB/s-2246MiB/s (2355MB/s-2355MB/s),
> io=1024MiB (1074MB), run=456-456msec
>
> Yeah, 2355MB/s is much better than the original 115MB/s.
>
> So in the end, I guess FUSE isn't the best choice.
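>
> For completeness, a direct I/O variant along the lines Ramon suggested
> earlier should take the page cache out of the picture (the file name and
> size here are just placeholders):
>
> fio --name=test-direct --filename=/data/wow2 --size=4G --bs=1M \
>     --rw=write --direct=1 --end_fsync=1 --ioengine=libaio --iodepth=32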
>
> On Tue, Dec 12, 2023 at 3:00 PM Gilberto Ferreira <
> gilberto.nunes32 at gmail.com> wrote:
>
>> FUSE adds some overhead.
>> Take a look at libgfapi:
>>
>> https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Features/libgfapi/
>>
>> I know this doc is somewhat out of date, but it could be a hint.
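>>
>> If you want a quick feel for what libgfapi can do without writing any code,
>> fio ships a gfapi ioengine (assuming your fio build includes it), e.g.:
>>
>> fio --name=gfapi-test --ioengine=gfapi --volume=data --brick=localhost \
>>     --size=1G --bs=1M --rw=write
>>
>> That bypasses the FUSE layer entirely, so comparing it against the FUSE
>> mount numbers gives a rough idea of how much overhead FUSE adds.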
>>
>>
>> ---
>> Gilberto Nunes Ferreira
>> (47) 99676-7530 - Whatsapp / Telegram
>>
>>
>>
>>
>>
>>
>> On Tue, Dec 12, 2023 at 4:29 PM Danny <dbray925+gluster at gmail.com>
>> wrote:
>>
>>> Nope, not a caching thing. I've tried multiple different types of fio
>>> tests, and all produce the same results: Gbps when hitting the disks
>>> locally, slow MB/s when hitting the Gluster FUSE mount.
>>>
>>> I've been reading up on NFS-Ganesha and will give that a try.
>>>
>>> On Tue, Dec 12, 2023 at 1:58 PM Ramon Selga <ramon.selga at gmail.com>
>>> wrote:
>>>
>>>> Disregard my first question: you have 12 Gbps SAS SSDs. Sorry!
>>>>
>>>> On 12/12/23 at 19:52, Ramon Selga wrote:
>>>>
>>>> May I ask which kind of disks you have in this setup? Rotational, SAS/SATA
>>>> SSD, NVMe?
>>>>
>>>> Is there a RAID controller with writeback caching?
>>>>
>>>> It seems to me your fio test on the local brick has an unclear result due
>>>> to some caching.
>>>>
>>>> Try something like the following (consider increasing the test file size
>>>> depending on how much cache memory you have):
>>>>
>>>> fio --size=16G --name=test --filename=/gluster/data/brick/wow --bs=1M
>>>> --nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write --refill_buffers
>>>> --end_fsync=1 --iodepth=200 --ioengine=libaio
>>>>
>>>> Also remember that a replica 3 arbiter 1 volume writes synchronously to
>>>> two data bricks, halving the throughput of your network backend.
>>>>
>>>> Try a similar fio run on the Gluster mount, but I rarely see more than
>>>> 300MB/s writing sequentially on a single FUSE mount, even with an NVMe
>>>> backend. On the other hand, with 4 to 6 clients you can easily reach
>>>> 1.5GB/s of aggregate throughput.
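>>>>
>>>> If you want to measure that aggregate number, one simple way (hostnames
>>>> and paths below are only placeholders) is to start the same fio job on
>>>> several client machines at once, each writing its own file on the mount,
>>>> and sum the per-client results:
>>>>
>>>> # run simultaneously on client1..client4, each against its own file
>>>> fio --name=agg --filename=/data/wow.$(hostname) --size=4G --bs=1M \
>>>>     --rw=write --direct=1 --end_fsync=1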
>>>>
>>>> To start, I think it is better to try with the default parameters for
>>>> your replica volume.
>>>>
>>>> Best regards!
>>>>
>>>> Ramon
>>>>
>>>>
>>>> On 12/12/23 at 19:10, Danny wrote:
>>>>
>>>> Sorry, I noticed that too after I posted, so I instantly upgraded to
>>>> 10. The issue remains.
>>>>
>>>> On Tue, Dec 12, 2023 at 1:09 PM Gilberto Ferreira <
>>>> gilberto.nunes32 at gmail.com> wrote:
>>>>
>>>>> I strongly suggest you update to version 10 or higher.
>>>>> It comes with significant performance improvements.
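>>>>>
>>>>> A minimal upgrade path on CentOS Stream 8, assuming the CentOS Storage
>>>>> SIG repo package is available for your release, would be something like:
>>>>>
>>>>> sudo dnf install centos-release-gluster10
>>>>> sudo dnf update 'glusterfs*'
>>>>> sudo systemctl restart glusterd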
>>>>> ---
>>>>> Gilberto Nunes Ferreira
>>>>> (47) 99676-7530 - Whatsapp / Telegram
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Dec 12, 2023 at 1:03 PM Danny <
>>>>> dbray925+gluster at gmail.com> wrote:
>>>>>
>>>>>> MTU is already 9000, and as you can see from the IPERF results, I've
>>>>>> got a nice, fast connection between the nodes.
>>>>>>
>>>>>> On Tue, Dec 12, 2023 at 9:49 AM Strahil Nikolov <
>>>>>> hunter86_bg at yahoo.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Let’s try the simple things:
>>>>>>>
>>>>>>> Check if you can use MTU 9000 and, if possible, set it on the
>>>>>>> bond slaves and the bond devices:
>>>>>>>  ping GLUSTER_PEER -c 10 -M do -s 8972
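>>>>>>>
>>>>>>> If jumbo frames aren't enabled yet, a minimal (non-persistent) way to
>>>>>>> set them, assuming the bond is named bond0 with slaves eno1 and eno2, is:
>>>>>>>
>>>>>>> ip link set dev eno1 mtu 9000
>>>>>>> ip link set dev eno2 mtu 9000
>>>>>>> ip link set dev bond0 mtu 9000
>>>>>>>
>>>>>>> Then make it persistent in your network configuration (nmcli/ifcfg).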
>>>>>>>
>>>>>>> Then try to follow the recommendations from
>>>>>>> https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/html/administration_guide/chap-configuring_red_hat_storage_for_enhancing_performance
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Best Regards,
>>>>>>> Strahil Nikolov
>>>>>>>
>>>>>>> On Monday, December 11, 2023, 3:32 PM, Danny <
>>>>>>> dbray925+gluster at gmail.com> wrote:
>>>>>>>
>>>>>>> Hello list, I'm hoping someone can let me know what setting I missed.
>>>>>>>
>>>>>>> Hardware:
>>>>>>> Dell R650 servers, Dual 24 Core Xeon 2.8 GHz, 1 TB RAM
>>>>>>> 8x SSDs, negotiated speed 12 Gbps
>>>>>>> PERC H755 Controller - RAID 6
>>>>>>> Created virtual "data" disk from the above 8 SSD drives, for a ~20
>>>>>>> TB /dev/sdb
>>>>>>>
>>>>>>> OS:
>>>>>>> CentOS Stream
>>>>>>> kernel-4.18.0-526.el8.x86_64
>>>>>>> glusterfs-7.9-1.el8.x86_64
>>>>>>>
>>>>>>> IPERF Test between nodes:
>>>>>>> [ ID] Interval           Transfer     Bitrate         Retr
>>>>>>> [  5]   0.00-10.00  sec  11.5 GBytes  9.90 Gbits/sec    0
>>>>>>>   sender
>>>>>>> [  5]   0.00-10.04  sec  11.5 GBytes  9.86 Gbits/sec
>>>>>>>  receiver
>>>>>>>
>>>>>>> All good there. ~10 Gbps, as expected.
>>>>>>>
>>>>>>> LVM Install:
>>>>>>> export DISK="/dev/sdb"
>>>>>>> sudo parted --script $DISK "mklabel gpt"
>>>>>>> sudo parted --script $DISK "mkpart primary 0% 100%"
>>>>>>> sudo parted --script $DISK "set 1 lvm on"
>>>>>>> sudo pvcreate --dataalignment 128K /dev/sdb1
>>>>>>> sudo vgcreate --physicalextentsize 128K gfs_vg /dev/sdb1
>>>>>>> sudo lvcreate -L 16G -n gfs_pool_meta gfs_vg
>>>>>>> sudo lvcreate -l 95%FREE -n gfs_pool gfs_vg
>>>>>>> sudo lvconvert --chunksize 1280K --thinpool gfs_vg/gfs_pool --poolmetadata gfs_vg/gfs_pool_meta
>>>>>>> sudo lvchange --zero n gfs_vg/gfs_pool
>>>>>>> sudo lvcreate -V 19.5TiB --thinpool gfs_vg/gfs_pool -n gfs_lv
>>>>>>> sudo mkfs.xfs -f -i size=512 -n size=8192 -d su=128k,sw=10 /dev/mapper/gfs_vg-gfs_lv
>>>>>>> sudo vim /etc/fstab
>>>>>>> /dev/mapper/gfs_vg-gfs_lv   /gluster/data/brick   xfs   rw,inode64,noatime,nouuid 0 0
>>>>>>>
>>>>>>> sudo systemctl daemon-reload && sudo mount -a
>>>>>>> fio --name=test --filename=/gluster/data/brick/wow --size=1G --readwrite=write
>>>>>>>
>>>>>>> Run status group 0 (all jobs):
>>>>>>>   WRITE: bw=2081MiB/s (2182MB/s), 2081MiB/s-2081MiB/s
>>>>>>> (2182MB/s-2182MB/s), io=1024MiB (1074MB), run=492-492msec
>>>>>>>
>>>>>>> All good there. 2182MB/s =~ 17.5 Gbps. Nice!
>>>>>>>
>>>>>>>
>>>>>>> Gluster install:
>>>>>>> export NODE1='10.54.95.123'
>>>>>>> export NODE2='10.54.95.124'
>>>>>>> export NODE3='10.54.95.125'
>>>>>>> sudo gluster peer probe $NODE2
>>>>>>> sudo gluster peer probe $NODE3
>>>>>>> sudo gluster volume create data replica 3 arbiter 1
>>>>>>> $NODE1:/gluster/data/brick $NODE2:/gluster/data/brick
>>>>>>> $NODE3:/gluster/data/brick force
>>>>>>> sudo gluster volume set data network.ping-timeout 5
>>>>>>> sudo gluster volume set data performance.client-io-threads on
>>>>>>> sudo gluster volume set data group metadata-cache
>>>>>>> sudo gluster volume start data
>>>>>>> sudo gluster volume info all
>>>>>>>
>>>>>>> Volume Name: data
>>>>>>> Type: Replicate
>>>>>>> Volume ID: b52b5212-82c8-4b1a-8db3-52468bc0226e
>>>>>>> Status: Started
>>>>>>> Snapshot Count: 0
>>>>>>> Number of Bricks: 1 x (2 + 1) = 3
>>>>>>> Transport-type: tcp
>>>>>>> Bricks:
>>>>>>> Brick1: 10.54.95.123:/gluster/data/brick
>>>>>>> Brick2: 10.54.95.124:/gluster/data/brick
>>>>>>> Brick3: 10.54.95.125:/gluster/data/brick (arbiter)
>>>>>>> Options Reconfigured:
>>>>>>> network.inode-lru-limit: 200000
>>>>>>> performance.md-cache-timeout: 600
>>>>>>> performance.cache-invalidation: on
>>>>>>> performance.stat-prefetch: on
>>>>>>> features.cache-invalidation-timeout: 600
>>>>>>> features.cache-invalidation: on
>>>>>>> network.ping-timeout: 5
>>>>>>> transport.address-family: inet
>>>>>>> storage.fips-mode-rchecksum: on
>>>>>>> nfs.disable: on
>>>>>>> performance.client-io-threads: on
>>>>>>>
>>>>>>> sudo vim /etc/fstab
>>>>>>> localhost:/data   /data   glusterfs   defaults,_netdev   0 0
>>>>>>>
>>>>>>> sudo systemctl daemon-reload && sudo mount -a
>>>>>>> fio --name=test --filename=/data/wow --size=1G --readwrite=write
>>>>>>>
>>>>>>> Run status group 0 (all jobs):
>>>>>>>   WRITE: bw=109MiB/s (115MB/s), 109MiB/s-109MiB/s (115MB/s-115MB/s),
>>>>>>> io=1024MiB (1074MB), run=9366-9366msec
>>>>>>>
>>>>>>> Oh no, what's wrong? From 2182MB/s down to only 115MB/s? What am I
>>>>>>> missing? I'm not expecting the above ~17 Gbps, but I'm thinking it should
>>>>>>> at least be close(r) to ~10 Gbps.
>>>>>>>
>>>>>>> Any suggestions?