[Gluster-users] Performance Questions - not only small files
Schlick Rupert
Rupert.Schlick at ait.ac.at
Fri May 14 19:07:48 UTC 2021
Dear list,
our replicated Gluster volumes are much slower than the underlying brick disks, and I wonder whether this is a configuration issue, a conceptual problem with our setup, or simply how slow Gluster is.
The setup:
* Three servers in a ring, connected via IP over InfiniBand, 100 Gb/s on each link
* 2 x 2 TB U.2 SSDs as RAID1 on a MegaRAID controller, connected via NVMe links, on each node
* A 3x replicated volume on an LVM thin pool on the SSD RAIDs (a sketch of the layout follows below the list)
* 2 x 26 cores, lots of RAM
* The volumes are mounted locally and used as shared disks for computation jobs (CPU and GPU) on the nodes.
* The LVM thin pool is shared with other volumes and with a cache pool for the hard disks.
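For reference, the layout corresponds roughly to the commands below; the hostnames, volume name, and brick paths are made-up placeholders, not our actual ones:

    # 3-way replica, one brick per node on the thin-pool LV
    gluster volume create gv0 replica 3 \
        node1:/bricks/thinlv/gv0 node2:/bricks/thinlv/gv0 node3:/bricks/thinlv/gv0
    gluster volume start gv0
    # mounted locally on each node via the FUSE client
    mount -t glusterfs localhost:/gv0 /mnt/gv0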
The measurement:
* iozone in automated mode, once on the Gluster volume (set1) and once on the mounted brick disk (baseline); see the command sketch below
* Compared with iozone_results_comparator.py
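The runs look roughly like this (paths are placeholders, and the exact iozone options may have differed):

    # iozone automated mode over its default range of file and record sizes
    iozone -a -f /mnt/brick/testfile > baseline.txt    # directly on the brick FS
    iozone -a -f /mnt/gv0/testfile   > set1.txt        # on the Gluster mount
    # compare the two reports
    ./iozone_results_comparator.py --baseline baseline.txt --set1 set1.txt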
The issue:
* Smaller files are around a factor of 2 slower than larger files on the SSD, but a factor of 6-10 slower on the Gluster volume (somewhat expected).
* Larger files are, except for certain read accesses, still a factor of 3-10 slower on the Gluster volume than on the SSD RAID directly, depending on the operation.
* Operations involving many, admittedly smaller, files (checkouts, copying, rsync, unpacking) can stretch into hours where they take tens of seconds to a few minutes on the disk directly (a reproducer is sketched below the list).
* atop sometimes shows 9x% busy on some of the LVM block devices, and sometimes the avio value jumps from 4-26 ms to 3000-5000 ms. Otherwise nothing notable shows up in the system load.
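A simple reproducer for the small-file case is unpacking the same source tree on both mounts and comparing wall-clock time (paths and tarball are placeholders):

    time tar -xf linux-5.12.tar.xz -C /mnt/brick/scratch   # brick directly
    time tar -xf linux-5.12.tar.xz -C /mnt/gv0/scratch     # Gluster volume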
This does not seem to be network-bound. Accessing the volume from a remote VM via a glusterfs mount is not much slower, despite connecting over 10 GbE instead of 100 Gb/s. Nethogs shows sent traffic on the InfiniBand interface going up to 30000 kB/s for large records - that is 240 Mb/s over a 100 Gb/s interface ...
The machine itself is mostly idle - no iowaits, a load average of 6 on 104 logical CPUs. glusterfsd sometimes pushes up to 10% CPU. I just do not see what the bottleneck should be; see the profiling sketch below.
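To narrow down where the time goes, Gluster's built-in profiling can break latency down per file operation; a sketch, assuming the volume is named gv0:

    gluster volume profile gv0 start
    # ... run one of the slow workloads ...
    gluster volume profile gv0 info    # per-brick FOP call counts and latencies
    gluster volume profile gv0 stop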
Below is a summary from the compared iozone reports.
Any hints on whether it is worth trying to optimize this at all, and where to start looking? I believe I have checked all the "standard" hints, but might have missed something.
The alternative would be a single storage node exported via NFS, sacrificing the redundancy/high availability of the replicated filesystem, which we probably do not need.
Thanks a lot for any comments in advance,
Best
Rupert
[summary of the compared iozone reports; throughput values in MB/s]

Baseline (directly on the mounted brick disk):

Operation        1st qu.   median    3rd qu.   min      max       mean     std.dev.  ci-90 lo  ci-90 hi  geo.mean
Write            1260.16   1474.31   1595.17   194.64   2041.55   1408.31   306.93   1363.0    1453.62   1362.58
Re-write         1894.11   2145.7    2318.0    960.92   2819.84   2070.62   366.43   2016.52   2124.71   2034.2
Read             3171.44   3637.6    3992.78   1785.65  5628.61   3663.03   652.73   3566.67   3759.39   3604.92
Re-read          3002.55   3383.24   3840.0    2152.15  7342.68   3505.74   816.6    3385.19   3626.3    3422.21
Random read      3679.12   4167.16   4803.77   1840.86  7142.61   4231.31  1048.38   4076.54   4386.08   4094.96
Random write     2214.07   2570.24   3013.13   947.03   4776.16   2633.07   781.84   2517.65   2748.5    2511.93
Backwards read   2987.15   3435.44   3864.87   1491.1   6091.17   3444.43   796.13   3326.9    3561.96   3346.74
Record rewrite   2701.4    3751.41   4420.33   1135.78  15908.02  4102.42  2277.33   3766.21   4438.62   3637.05
Strided read     3361.62   3967.55   4673.13   1732.88  6922.3    4029.73  1041.12   3876.03   4183.44   3892.76
Fwrite           1783.45   2059.25   2354.6    999.01   3540.22   2087.25   511.69   2011.71   2162.79   2024.11
Frewrite         1739.6    1992.64   2258.15   933.54   3787.35   2027.23   464.33   1958.68   2095.78   1975.46
Fread            3231.97   3611.78   4036.73   1745.15  5933.0    3641.8    687.27   3540.34   3743.27   3576.11
Freread          3405.8    3936.96   4570.02   2083.72  7510.19   4113.02  1012.13   3963.6    4262.44   3996.51
ALL              2152.28   3125.35   3915.54   194.64   15908.02  3150.61  1335.36   3096.31   3204.91   2883.44

Set1 (on the Gluster volume):

Operation        1st qu.   median    3rd qu.   min      max       mean     std.dev.  ci-90 lo  ci-90 hi  geo.mean
Write             173.4     312.05    414.07    44.11    606.95    308.37   154.66    285.54    331.2     260.27
Re-write          174.43    293.47    410.15    45.6     610.49    309.17   157.37    285.93    332.4     260.16
Read              208.58    267.53    401.42    59.07   4457.8     544.27   892.64    412.49    676.05    318.88
Re-read          2813.71   3134.22   3446.69   2089.07  6248.75   3181.18   635.03   3087.43   3274.93   3127.01
Random read      3498.77   4005.72   4429.1    2004.07  6153.85   3964.96   825.98   3843.02   4086.9    3872.62
Random write      139.63    332.09    436.45     9.39    666.55    305.76   183.62    278.65    332.87    220.54
Backwards read     51.51    110.38    178.94     8.67    374.16    123.49    87.03    110.65    136.34     86.71
Record rewrite     91.49    302.02    429.6      2.77    696.91    282.89   191.43    254.63    311.15    174.2
Strided read      124.34    216.41    397.15    32.58   2593.43    402.01   493.51    329.15    474.86    243.43
Fwrite            201.55    359.67    463.62    38.63    647.26    337.11   160.15    313.47    360.76    285.24
Frewrite          199.05    355.33    480.29    41.09    700.02    342.11   173.1     316.56    367.67    286.02
Fread             187.72    234.42    317.98    26.0     451.39    248.76   100.31    233.95    263.57    223.44
Freread          2509.21   2742.35   3092.73   1359.63  4353.18   2780.37   484.33   2708.87   2851.87   2737.93
ALL               189.36    357.31    686.35     2.77   6248.75   1010.03  1358.34    954.8    1065.27    414.26

Comparison (set1 vs. baseline):

Operation        lin. regr. slope 90%  ttest equality  difference  ttest p-value
Write            0.17 - 0.28           DIFF            -78.1 %     0.0
Re-write         0.1 - 0.21            DIFF            -85.07 %    0.0
Read             -0.05 - 0.35          DIFF            -85.14 %    0.0
Re-read          0.83 - 0.97           DIFF            -9.26 %     0.0005
Random read      0.85 - 1.0            DIFF            -6.29 %     0.026
Random write     0.08 - 0.15           DIFF            -88.39 %    0.0
Backwards read   0.02 - 0.05           DIFF            -96.41 %    0.0
Record rewrite   0.04 - 0.07           DIFF            -93.1 %     0.0
Strided read     0.04 - 0.17           DIFF            -90.02 %    0.0
Fwrite           0.12 - 0.2            DIFF            -83.85 %    0.0
Frewrite         0.13 - 0.22           DIFF            -83.12 %    0.0
Fread            0.05 - 0.09           DIFF            -93.17 %    0.0
Freread          0.6 - 0.72            DIFF            -32.4 %     0.0
ALL              0.28 - 0.35           DIFF            -67.94 %    0.0