[Gluster-users] Gluster Performance - 12 Gbps SSDs and 10 Gbps NIC
Aravinda
aravinda at kadalu.tech
Wed Dec 13 05:36:34 UTC 2023
Only Replica 2 or Distributed Gluster volumes can be created with two servers. A Replica 2 volume has a much higher chance of split brain than a Replica 3 volume.
For NFS-Ganesha, there is no issue exporting the volume even if only one server is available. Run NFS-Ganesha servers on the Gluster server nodes, and NFS clients on the network can connect to any NFS-Ganesha server.
You can use HAProxy + Keepalived (or any other load balancer) if high availability is required for the NFS-Ganesha connections (e.g. if a server node goes down, the NFS client can connect to another NFS-Ganesha server node).
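As a minimal sketch of the Keepalived side (the interface name and VIP are assumptions, adjust to your network), each Ganesha node could carry something like this in /etc/keepalived/keepalived.conf:

vrrp_instance NFS_VIP {
    state BACKUP
    interface eth0              # assumption: your storage-network NIC
    virtual_router_id 51
    priority 100                # give one node a higher priority to prefer it
    advert_int 1
    virtual_ipaddress {
        10.54.95.200/24         # assumption: a free VIP on the storage network
    }
}

Clients then mount the VIP instead of an individual node (e.g. mount -t nfs 10.54.95.200:/data /mnt), so a node failure only causes a VIP failover rather than a stale mount.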
--
Aravinda
Kadalu Technologies
---- On Wed, 13 Dec 2023 01:42:11 +0530 Gilberto Ferreira <gilberto.nunes32 at gmail.com> wrote ---
Ah, that's nice. Does somebody know whether this can be achieved with two servers?
---
Gilberto Nunes Ferreira
(47) 99676-7530 - Whatsapp / Telegram
On Tue, Dec 12, 2023 at 5:08 PM, Danny <dbray925+gluster at gmail.com> wrote:
Wow, HUGE improvement with NFS-Ganesha!
sudo dnf -y install glusterfs-ganesha
sudo vim /etc/ganesha/ganesha.conf
NFS_CORE_PARAM {
mount_path_pseudo = true;
Protocols = 3,4;
}
EXPORT_DEFAULTS {
Access_Type = RW;
}
LOG {
Default_Log_Level = WARN;
}
EXPORT{
Export_Id = 1 ; # Export ID unique to each export
Path = "/data"; # Path of the volume to be exported
FSAL {
name = GLUSTER;
hostname = "localhost"; # IP of one of the nodes in the trusted pool
volume = "data"; # Volume name. Eg: "test_volume"
}
Access_type = RW; # Access permissions
Squash = No_root_squash; # To enable/disable root squashing
Disable_ACL = TRUE; # To enable/disable ACL
Pseudo = "/data"; # NFSv4 pseudo path for this export
Protocols = "3","4" ; # NFS protocols supported
Transports = "UDP","TCP" ; # Transport protocols supported
SecType = "sys"; # Security flavors supported
}
sudo systemctl enable --now nfs-ganesha
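To sanity-check the export before touching fstab, something like this should work (showmount relies on the NFSv3 side being enabled, which it is with Protocols = 3,4 above):
sudo systemctl status nfs-ganesha
showmount -e localhost   # should list /data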
sudo vim /etc/fstab
localhost:/data /data nfs defaults,_netdev 0 0
sudo systemctl daemon-reload
sudo mount -a
fio --name=test --filename=/data/wow --size=1G --readwrite=write
Run status group 0 (all jobs):
WRITE: bw=2246MiB/s (2355MB/s), 2246MiB/s-2246MiB/s (2355MB/s-2355MB/s), io=1024MiB (1074MB), run=456-456msec
Yeah, 2355MB/s is much better than the original 115MB/s.
So in the end, I guess FUSE isn't the best choice.
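For a stricter apples-to-apples number it may be worth rerunning the Ganesha-mount test with direct I/O and an end fsync, along the lines Ramon suggested for the brick test below (a sketch; the 16G size is borrowed from his example to get past any caching):
fio --name=test --filename=/data/wow --size=16G --bs=1M --rw=write --direct=1 --end_fsync=1 --ioengine=libaio --iodepth=200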
On Tue, Dec 12, 2023 at 3:00 PM Gilberto Ferreira <gilberto.nunes32 at gmail.com> wrote:
FUSE has some overhead. Take a look at libgfapi:
https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Features/libgfapi/
I know this doc is somewhat out of date, but it could be a hint.
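If you want to benchmark the libgfapi path directly, fio ships a gfapi ioengine; a rough sketch (this assumes your fio build includes Gluster gfapi support, which not every distro package does):
fio --name=gfapi-test --ioengine=gfapi --volume=data --brick=localhost --size=1G --bs=1M --rw=write --end_fsync=1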
---
Gilberto Nunes Ferreira
(47) 99676-7530 - Whatsapp / Telegram
On Tue, Dec 12, 2023 at 4:29 PM, Danny <dbray925+gluster at gmail.com> wrote:
Nope, not a caching thing. I've tried multiple different types of fio tests, and all produce the same results: Gbps when hitting the disks locally, slow MB/s when hitting the Gluster FUSE mount.
I've been reading up on NFS-Ganesha, and will give that a try.
On Tue, Dec 12, 2023 at 1:58 PM Ramon Selga <ramon.selga at gmail.com> wrote:
Disregard my first question: you have SAS 12 Gbps SSDs. Sorry!
On 12/12/23 at 19:52, Ramon Selga wrote:
May I ask which kind of disks you have in this setup? Rotational, SSD SAS/SATA, NVMe?
Is there a RAID controller with write-back caching?
It seems to me your fio test on the local brick gives an unclear result due to some caching.
Try something like this (consider increasing the test file size depending on your cache memory):
fio --size=16G --name=test --filename=/gluster/data/brick/wow --bs=1M --nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write --refill_buffers --end_fsync=1 --iodepth=200 --ioengine=libaio
Also remember that a replica 3 arbiter 1 volume writes synchronously to two data bricks, halving the throughput of your network backend.
Try a similar fio run on the Gluster mount, but I rarely see more than 300MB/s writing sequentially to a single FUSE mount, even with an NVMe backend. On the other hand, with 4 to 6 clients you can easily reach 1.5GB/s of aggregate throughput.
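To put rough numbers on that (a back-of-the-envelope estimate, not a measurement from this thread): 10 Gbps is about 1.25 GB/s on the wire, and since the FUSE client sends each write to both data bricks, a single client tops out around 1.25 / 2 ≈ 0.6 GB/s before any protocol or latency overhead.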
To start, I think it is better to try with the default parameters for your replica volume.
Best regards!
Ramon
On 12/12/23 at 19:10, Danny wrote:
Sorry, I noticed that too after I posted, so I instantly upgraded to 10. Issue remains.
On Tue, Dec 12, 2023 at 1:09 PM Gilberto Ferreira <gilberto.nunes32 at gmail.com> wrote:
I strongly suggest you update to version 10 or higher.
It comes with significant improvements regarding performance.
---
Gilberto Nunes Ferreira
(47) 99676-7530 - Whatsapp / Telegram
On Tue, Dec 12, 2023 at 1:03 PM, Danny <dbray925+gluster at gmail.com> wrote:
MTU is already 9000, and as you can see from the iperf results, I've got a nice, fast connection between the nodes.
On Tue, Dec 12, 2023 at 9:49 AM Strahil Nikolov <hunter86_bg at yahoo.com> wrote:
Hi,
Let’s try the simple things:
Check if you can use MTU 9000 and, if possible, set it on the bond slaves and the bond devices:
ping GLUSTER_PEER -c 10 -M do -s 8972
Then try to follow the recommendations from https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/html/administration_guide/chap-configuring_red_hat_storage_for_enhancing_performance
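If jumbo frames do work, setting them persistently with nmcli might look like this (the connection names bond0/eno1/eno2 are assumptions; substitute your own):
sudo nmcli connection modify eno1 802-3-ethernet.mtu 9000
sudo nmcli connection modify eno2 802-3-ethernet.mtu 9000
sudo nmcli connection modify bond0 802-3-ethernet.mtu 9000
sudo nmcli connection up bond0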
Best Regards,
Strahil Nikolov
On Monday, December 11, 2023, 3:32 PM, Danny <dbray925+gluster at gmail.com> wrote:
Hello list, I'm hoping someone can let me know what setting I missed.
Hardware:
Dell R650 servers, dual 24-core Xeon 2.8 GHz, 1 TB RAM
8x SSDs, negotiated speed 12 Gbps
PERC H755 controller - RAID 6
Created a virtual "data" disk from the above 8 SSD drives, for a ~20 TB /dev/sdb
OS:
CentOS Stream
kernel-4.18.0-526.el8.x86_64
glusterfs-7.9-1.el8.x86_64
iperf test between nodes:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  11.5 GBytes  9.90 Gbits/sec    0             sender
[  5]   0.00-10.04  sec  11.5 GBytes  9.86 Gbits/sec                  receiver
All good there. ~10 Gbps, as expected.
LVM install:
export DISK="/dev/sdb"
sudo parted --script $DISK "mklabel gpt"
sudo parted --script $DISK "mkpart primary 0% 100%"
sudo parted --script $DISK "set 1 lvm on"
sudo pvcreate --dataalignment 128K /dev/sdb1
sudo vgcreate --physicalextentsize 128K gfs_vg /dev/sdb1
sudo lvcreate -L 16G -n gfs_pool_meta gfs_vg
sudo lvcreate -l 95%FREE -n gfs_pool gfs_vg
sudo lvconvert --chunksize 1280K --thinpool gfs_vg/gfs_pool --poolmetadata gfs_vg/gfs_pool_meta
sudo lvchange --zero n gfs_vg/gfs_pool
sudo lvcreate -V 19.5TiB --thinpool gfs_vg/gfs_pool -n gfs_lv
sudo mkfs.xfs -f -i size=512 -n size=8192 -d su=128k,sw=10 /dev/mapper/gfs_vg-gfs_lv
sudo vim /etc/fstab
/dev/mapper/gfs_vg-gfs_lv /gluster/data/brick xfs rw,inode64,noatime,nouuid 0 0
sudo systemctl daemon-reload && sudo mount -a
fio --name=test --filename=/gluster/data/brick/wow --size=1G --readwrite=write
Run status group 0 (all jobs):
  WRITE: bw=2081MiB/s (2182MB/s), 2081MiB/s-2081MiB/s (2182MB/s-2182MB/s), io=1024MiB (1074MB), run=492-492msec
All good there. 2182MB/s =~ 17.5 Gbps. Nice!
Gluster install:
export NODE1='10.54.95.123'
export NODE2='10.54.95.124'
export NODE3='10.54.95.125'
sudo gluster peer probe $NODE2
sudo gluster peer probe $NODE3
sudo gluster volume create data replica 3 arbiter 1 $NODE1:/gluster/data/brick $NODE2:/gluster/data/brick $NODE3:/gluster/data/brick force
sudo gluster volume set data network.ping-timeout 5
sudo gluster volume set data performance.client-io-threads on
sudo gluster volume set data group metadata-cache
sudo gluster volume start data
sudo gluster volume info all
Volume Name: data
Type: Replicate
Volume ID: b52b5212-82c8-4b1a-8db3-52468bc0226e
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.54.95.123:/gluster/data/brick
Brick2: 10.54.95.124:/gluster/data/brick
Brick3: 10.54.95.125:/gluster/data/brick (arbiter)
Options Reconfigured:
network.inode-lru-limit: 200000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
network.ping-timeout: 5
transport.address-family: inet
storage.fips-mode-rchecksum: on
nfs.disable: on
performance.client-io-threads: on
sudo vim /etc/fstab
localhost:/data /data glusterfs defaults,_netdev 0 0
sudo systemctl daemon-reload && sudo mount -a
fio --name=test --filename=/data/wow --size=1G --readwrite=write
Run status group 0 (all jobs):
  WRITE: bw=109MiB/s (115MB/s), 109MiB/s-109MiB/s (115MB/s-115MB/s), io=1024MiB (1074MB), run=9366-9366msec
Oh no, what's wrong? From 2182MB/s down to only 115MB/s? What am I missing?
I'm not expecting the above ~17 Gbps, but I'm thinking it should at least be close(r) to ~10 Gbps.
Any suggestions?
________
Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users at gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users