[Gluster-users] Poor performance on a server-class system vs. desktop
dmantipov at yandex.ru
Wed Nov 25 16:08:20 UTC 2020
I'm trying to investigate poor I/O performance observed on a server-class
system vs. a desktop-class one.
The desktop is an 8-core notebook with an NVMe disk. According to
fio --name=test --filename=XXX --bs=4k --rw=randwrite --ioengine=libaio --direct=1 \
    --iodepth=128 --numjobs=1 --runtime=60 --time_based=1
this disk can sustain 4K random writes at ~100K IOPS. But when I create a
replica-3 glusterfs volume using the same disk as the backing store:
Volume Name: test1
Volume ID: 87bad2a9-7a4a-43fc-94d2-de72965b63d6
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
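The volume was created roughly like this (the hostname and brick paths below
are placeholders; all three bricks live on the same NVMe disk, hence 'force'):

# replica 3 with all bricks on one host/disk; gluster refuses this
# layout without 'force' because it is not an optimal setup
gluster volume create test1 replica 3 \
    myhost:/data/brick1/test1 \
    myhost:/data/brick2/test1 \
    myhost:/data/brick3/test1 \
    force
gluster volume start test1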
I'm seeing only ~10K IOPS. So adding an extra layer (glusterfs :-) between the
I/O client (fio in this case) and the NVMe disk introduces ~10x overhead. Maybe
worse than expected, but things get even worse when I switch to the server.
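For reference, the ~10K figure comes from the same fio job pointed at a file
on a FUSE mount of the volume, roughly (mount point and file name are
illustrative):

# mount the volume via the glusterfs FUSE client, then run the same job
mount -t glusterfs localhost:/test1 /mnt/test1
fio --name=test --filename=/mnt/test1/testfile --bs=4k --rw=randwrite --ioengine=libaio --direct=1 \
    --iodepth=128 --numjobs=1 --runtime=60 --time_based=1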
The server is a 32-core machine with an NVMe disk capable of serving the same
I/O pattern at ~200K IOPS. I expected something close to linear scaling, i.e.
~20K IOPS when running the same fio workload on a gluster volume, but
surprisingly got almost exactly the same ~10K IOPS as on the desktop-class
machine. So here the overhead is ~20x vs. ~10x on the desktop.
The OSes are different (Fedora 33 on the notebook and a relatively old Debian 9
on the server), but both systems run fairly recent 5.9.x kernels (without any
heavy tuning via sysctl or similar) and glusterfs 8.2, with XFS as the
filesystem under the bricks.
I would greatly appreciate any ideas on debugging this.
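If it helps, I can also collect per-brick statistics with gluster's built-in
profiling while the fio job is running, e.g.:

gluster volume profile test1 start
# run the fio workload, then:
gluster volume profile test1 info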