[Gluster-users] Poor performance on a server-class system vs. desktop

Wed Nov 25 16:08:20 UTC 2020

I'm investigating poor I/O performance observed on a server-class
system compared to a desktop-class one.

The desktop-class system is an 8-core notebook with an NVMe disk. According to

fio --name test --filename=XXX --bs=4k --rw=randwrite --ioengine=libaio --direct=1 \
     --iodepth=128 --numjobs=1 --runtime=60 --time_based=1

this disk can perform 4K random writes at ~100K IOPS. When I create a
glusterfs volume using the same disk as the backing store:

Volume Name: test1
Type: Replicate
Volume ID: 87bad2a9-7a4a-43fc-94d2-de72965b63d6
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.1.112:/glusterfs/test1-000
Brick2: 192.168.1.112:/glusterfs/test1-001
Brick3: 192.168.1.112:/glusterfs/test1-002
Options Reconfigured:
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

and run:

[global]
name=ref-write
filename=testfile
ioengine=gfapi_async
volume=test1
brick=localhost
create_on_open=1
rw=randwrite
direct=1
numjobs=1
time_based=1
runtime=60

[test-4-kbytes]
bs=4k
size=1G
iodepth=128

I see ~10K IOPS. So adding an extra layer (glusterfs :-) between the I/O client
(fio in this case) and the NVMe disk introduces ~10x overhead. That may be worse
than expected, but things get even worse when I switch to the server.

The server is a 32-core machine with an NVMe disk capable of serving the same I/O
pattern at ~200K IOPS. I expected something close to linear scalability, i.e. ~20K
IOPS when running the same fio workload on a gluster volume. But, surprisingly, I
got something very close to the same ~10K IOPS seen on the desktop-class machine.
So that is ~20x overhead vs. ~10x on the desktop.
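
For reference, the arithmetic behind those overhead figures can be sketched in a
few lines of Python. This is only an illustration built from the approximate
numbers quoted above; the function names are made up for the example. It also
derives the mean per-request latency implied by each result via Little's law
(latency = requests in flight / throughput), which assumes the queue really stays
full at iodepth=128 for the whole run.

```python
# Approximate 4K random-write IOPS figures quoted in this post.
RAW_IOPS = {"notebook": 100_000, "server": 200_000}
GLUSTER_IOPS = {"notebook": 10_000, "server": 10_000}
IODEPTH = 128  # iodepth used in both fio runs


def overhead_factor(raw_iops, gluster_iops):
    """How many times slower the gluster volume is than the raw disk."""
    return raw_iops / gluster_iops


def mean_latency_ms(iops, iodepth):
    """Little's law: mean latency = requests in flight / throughput.

    Assumption: the queue stays full at `iodepth` during the whole run.
    """
    return iodepth / iops * 1000.0


def summarize():
    """Collect overhead and implied latencies for each machine."""
    rows = {}
    for host in RAW_IOPS:
        rows[host] = {
            "overhead": overhead_factor(RAW_IOPS[host], GLUSTER_IOPS[host]),
            "raw_latency_ms": mean_latency_ms(RAW_IOPS[host], IODEPTH),
            "gluster_latency_ms": mean_latency_ms(GLUSTER_IOPS[host], IODEPTH),
        }
    return rows


if __name__ == "__main__":
    for host, row in summarize().items():
        print(host, row)
```

If that assumption holds, both gluster runs imply roughly the same mean latency
per request regardless of how fast the underlying disk is, which may hint that
the limit is somewhere in the gluster path rather than in the NVMe device.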

The OSes are different (Fedora 33 on the notebook and the relatively old Debian 9
on the server), but both systems run fairly recent 5.9.x kernels (without any heavy
tuning via sysctl or similar methods) and glusterfs 8.2, with XFS as the filesystem
under the bricks.

I would greatly appreciate any ideas on debugging this.

Dmitry