[Gluster-users] Understanding gluster performance
Gionatan Danti
g.danti at assyoma.it
Sat Jan 25 11:21:35 UTC 2020
On 22-01-2020 at 14:02, Barak Sason Rofman wrote:
> Hello Gionatan,
>
> Some time ago we've looked into the effect of logging in gluster on
> performance.
>
> If it's possible for you, you could try doing a performance test by
> disabling logging.
>
> In order to fully disable logging, some code needs to be modified.
>
> You can locate the method called 'skip_logging' in 'logging.c' and
> modify it so that it always returns '_gf_true' (return _gf_true;).
>
> Recompiling and installing is required as well.
>
> I would be interested in knowing to what extent (if any) this had an
> effect.
>
> You can contact me if you have any doubts.
Hi Barak, thanks for your reply and sorry for the delay. As I am using
the binary RPMs from the gluster repository, I cannot immediately change
the source code. I think it should be possible by rebuilding from the
SRPM, right?
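Just to fix ideas, I suppose the rebuild would go more or less like this
(a rough sketch based on the standard RPM tooling, which I have not run
yet; the spec file name and source paths may differ):

[root at singularity ~]# yumdownloader --source glusterfs     # fetch the SRPM from the repo
[root at singularity ~]# rpm -ivh glusterfs-*.src.rpm          # unpack it into ~/rpmbuild
# edit logging.c (under libglusterfs/src/, if I am not mistaken) so that
# skip_logging() always returns _gf_true, carry the change as a patch in
# the spec file, then rebuild the binary RPMs:
[root at singularity ~]# rpmbuild -ba ~/rpmbuild/SPECS/glusterfs.spec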
That said, I collected some new stats which, to me, seem to exclude any
logging problem; rather, it seems that a specific call (FXATTROP) takes a
very long time.
Here you can find my test volume tv0 configuration:
[root at singularity gluster]# gluster volume status tv0 detail
Status of volume: tv0
------------------------------------------------------------------------------
Brick : Brick 172.31.255.1:/dev/shm/brick1
TCP Port : 49153
RDMA Port : 0
Online : Y
Pid : 9243
File System : tmpfs
Device : tmpfs
Mount Options : rw,seclabel,nosuid,nodev
Inode Size : N/A
Disk Space Free : 1.4GB
Total Disk Space : 3.8GB
Inode Count : 998691
Free Inodes : 998520
------------------------------------------------------------------------------
Brick : Brick 172.31.255.1:/dev/shm/brick2
TCP Port : 49154
RDMA Port : 0
Online : Y
Pid : 9263
File System : tmpfs
Device : tmpfs
Mount Options : rw,seclabel,nosuid,nodev
Inode Size : N/A
Disk Space Free : 1.4GB
Total Disk Space : 3.8GB
Inode Count : 998691
Free Inodes : 998520
As you can see, it is a very simple replica 2 volume backed by two
different /dev/shm (memory/tmpfs) directories. I am using /dev/shm to
take any physical fsync latency out of the picture. I mounted the volume
via the gluster native (FUSE) client.
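For completeness, the setup boils down to something like this (exact
options reconstructed from memory, so take them with a grain of salt):

[root at singularity ~]# mkdir -p /dev/shm/brick1 /dev/shm/brick2
[root at singularity ~]# gluster volume create tv0 replica 2 \
    172.31.255.1:/dev/shm/brick1 172.31.255.1:/dev/shm/brick2 force
[root at singularity ~]# gluster volume start tv0
[root at singularity ~]# mount -t glusterfs 172.31.255.1:/tv0 /mnt/fuse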
Running fio produces the following output:
[root at singularity tmp]# fio --name=test --filename=/mnt/fuse/test.img
--size=256M --rw=randwrite --bs=4k --fsync=1
test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
4096B-4096B, ioengine=psync, iodepth=1
fio-3.7
Starting 1 process
Jobs: 1 (f=1): [w(1)][100.0%][r=0KiB/s,w=2164KiB/s][r=0,w=541 IOPS][eta
00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=10612: Sat Jan 25 12:03:14 2020
write: IOPS=540, BW=2163KiB/s (2215kB/s)(256MiB/121181msec)
So, about 550 IOPS. For an all-in-memory volume, this seems very slow.
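For anyone wanting to reproduce, stats like the ones below can be
collected with gluster's io-stats facilities, roughly as follows (the
client-side dump file name is just an example):

[root at singularity ~]# gluster volume profile tv0 start
[root at singularity ~]# gluster volume profile tv0 info       # brick-side stats (shown further down)
[root at singularity ~]# setfattr -n trusted.io-stats-dump \
    -v /tmp/client_profile.txt /mnt/fuse                     # client-side stats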
Now the interesting part. First, the gluster client profile:
=== Interval 4 stats ===
Duration : 19 secs
BytesRead : 0
BytesWritten : 41852928
Block Size : 4096B+
Read Count : 0
Write Count : 10218
Fop Call Count Avg-Latency Min-Latency Max-Latency
--- ---------- ----------- ----------- -----------
WRITE 9234 26.97 us 17.27 us 153.22 us
STATFS 1 306.35 us 306.35 us 306.35 us
FSYNC 9233 1638.45 us 810.85 us 3343.30 us
SETXATTR 1 559.61 us 559.61 us 559.61 us
LOOKUP 1 445.29 us 445.29 us 445.29 us
As you can see, the client sees many slow FSYNCs: at ~1638 us average per
FSYNC plus ~27 us per WRITE, fsync latency alone roughly accounts for the
~550 IOPS measured by fio (1 s / 1665 us ≈ 600 ops/s). Now, the server
side (please note that the client and server profiles were taken at
different points in time, but while running exactly the same fio workload
at the same performance):
Brick: 172.31.255.1:/dev/shm/brick1
-----------------------------------
Cumulative Stats:
Block Size: 4096b+
No. of Reads: 0
No. of Writes: 102916
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us              2     RELEASE
      0.00       0.00 us       0.00 us       0.00 us              3  RELEASEDIR
     10.94      46.20 us      23.20 us     222.63 us           6969       FSYNC
     17.35      36.66 us      15.67 us     221.59 us          13937    FINODELK
     18.51      78.22 us      44.56 us     284.91 us           6968       WRITE
     53.20     112.38 us      70.46 us     355.08 us          13937    FXATTROP
Duration: 246 seconds
Data Read: 0 bytes
Data Written: 421543936 bytes
Interval 1 Stats:
Block Size: 4096b+
No. of Reads: 0
No. of Writes: 3497
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
     11.01      46.42 us      23.20 us     222.63 us           3498       FSYNC
     17.30      36.49 us      15.94 us     195.65 us           6995    FINODELK
     18.42      77.70 us      44.59 us     257.26 us           3497       WRITE
     53.28     112.37 us      71.49 us     355.08 us           6995    FXATTROP
Duration: 6 seconds
Data Read: 0 bytes
Data Written: 14323712 bytes
Brick: 172.31.255.1:/dev/shm/brick2
-----------------------------------
Cumulative Stats:
Block Size: 4096b+
No. of Reads: 0
No. of Writes: 102917
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us              2     RELEASE
      0.00       0.00 us       0.00 us       0.00 us              3  RELEASEDIR
     11.05      44.83 us      23.04 us     212.17 us           6968       FSYNC
     17.39      35.28 us      15.63 us     200.36 us          13938    FINODELK
     18.04      73.17 us      42.60 us     261.36 us           6969       WRITE
     53.52     108.54 us      70.47 us     353.13 us          13938    FXATTROP
Duration: 246 seconds
Data Read: 0 bytes
Data Written: 421548032 bytes
Interval 1 Stats:
Block Size: 4096b+
No. of Reads: 0
No. of Writes: 3498
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
     11.24      46.52 us      23.38 us     212.17 us           3497       FSYNC
     17.58      36.37 us      16.13 us     194.07 us           6996    FINODELK
     18.07      74.77 us      42.72 us     237.45 us           3498       WRITE
     53.11     109.90 us      70.87 us     353.13 us           6995    FXATTROP
Duration: 6 seconds
Data Read: 0 bytes
Data Written: 14327808 bytes
Now, the slowest operation is FXATTROP, which seems to be executed at a
2x ratio with respect to FSYNC (13937 FXATTROP calls vs 6969 FSYNC calls
on brick1).
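As an aside, the extended attributes that FXATTROP keeps updating can be
inspected directly on the brick copy of the file, for example (the
trusted.afr.* keys one would expect to see depend on the volume layout):

[root at singularity ~]# getfattr -d -m . -e hex /dev/shm/brick1/test.img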
So, some questions:
- as my fio 4k sync write job runs at ~550 IOPS, why do I see so many
FSYNC and FXATTROP calls?
- are my stats broken/wrong?
- is the log-disabled test still necessary?
Thanks.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti at assyoma.it - info at assyoma.it
GPG public key ID: FF5F32A8