[Bugs] [Bug 1524048] gluster volume set is very slow

bugzilla at redhat.com bugzilla at redhat.com
Sat Dec 9 16:38:22 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1524048



--- Comment #1 from nh2 <nh2-redhatbugzilla at deditus.de> ---
Here's some more info, showing exactly the fsync syscalls that are happening:

With barrier (normal):

[root at host:~]# mount -o remount,barrier /
[root at host:~]# time strace -fp "$(pidof glusterfsd)" -e fsync -tttT
[pid 27543] 1512837193.344489 fsync(33) = 0 <0.098734>
[pid 27543] 1512837193.443479 fsync(34) = 0 <0.060922>
[pid 27543] 1512837193.507358 fsync(33) = 0 <0.061724>
[pid 27543] 1512837193.569290 fsync(37) = 0 <0.083958>
[pid 27543] 1512837193.653654 fsync(34) = 0 <0.100039>
[pid 27543] 1512837193.767362 fsync(33) = 0 <0.099843>
[pid 27543] 1512837193.867350 fsync(35) = 0 <0.059362>
[pid 27543] 1512837193.926851 fsync(33) = 0 <0.037742>
[pid 27543] 1512837193.964734 fsync(32) = 0 <0.077420>
[pid 27543] 1512837194.042367 fsync(33) = 0 <0.054849>
[pid 27543] 1512837194.097981 fsync(32) = 0 <0.100883>
[pid 27543] 1512837194.199456 fsync(33) = 0 <0.059231>
[pid 27675] 1512837194.285334 fsync(32 <unfinished ...>
 <unfinished ...>
[pid 27675] 1512837194.373005 <... fsync resumed> ) = 0 <0.087665>
[pid 27762] 1512837194.373027 <... fsync resumed> ) = 0 <0.082764>
[pid 27675] 1512837194.373138 fsync(33 <unfinished ...>
[pid 27762] 1512837194.373155 fsync(42) = 0 <0.049708>
[pid 27675] 1512837194.422877 <... fsync resumed> ) = 0 <0.049734>
[pid 27675] 1512837194.425988 fsync(32 <unfinished ...>
[pid 27762] 1512837194.428923 fsync(41 <unfinished ...>
[pid 27675] 1512837194.525484 <... fsync resumed> ) = 0 <0.099484>
[pid 27762] 1512837194.525496 <... fsync resumed> ) = 0 <0.096568>
[pid 27675] 1512837194.525595 fsync(35 <unfinished ...>
[pid 27762] 1512837194.525624 fsync(44 <unfinished ...>
[pid 27675] 1512837194.584304 <... fsync resumed> ) = 0 <0.058690>
[pid 27762] 1512837194.584319 <... fsync resumed> ) = 0 <0.058690>
[pid 27675] 1512837194.584436 fsync(33 <unfinished ...>
[pid 27762] 1512837194.584509 fsync(42 <unfinished ...>
[pid 27675] 1512837194.669038 <... fsync resumed> ) = 0 <0.084582>
[pid 27762] 1512837194.669051 <... fsync resumed> ) = 0 <0.084538>
[pid 27675] 1512837194.669289 fsync(32 <unfinished ...>
[pid 27762] 1512837194.669442 fsync(41 <unfinished ...>
[pid 27675] 1512837194.712706 <... fsync resumed> ) = 0 <0.043400>
[pid 27675] 1512837194.712786 fsync(34 <unfinished ...>
[pid 27762] 1512837194.755490 <... fsync resumed> ) = 0 <0.086025>
[pid 27762] 1512837194.755924 fsync(43 <unfinished ...>
[pid 27675] 1512837194.808851 <... fsync resumed> ) = 0 <0.096050>
[pid 27675] 1512837194.809115 fsync(32 <unfinished ...>
[pid 27762] 1512837194.860723 <... fsync resumed> ) = 0 <0.104776>
[pid 27762] 1512837194.861140 fsync(42 <unfinished ...>
[pid 27675] 1512837194.900553 <... fsync resumed> ) = 0 <0.091421>
[pid 27675] 1512837194.900623 fsync(31 <unfinished ...>
[pid 27762] 1512837194.933434 <... fsync resumed> ) = 0 <0.072278>
[pid 27762] 1512837194.933630 fsync(40 <unfinished ...>
[pid 27675] 1512837194.983271 <... fsync resumed> ) = 0 <0.082632>
[pid 27675] 1512837194.983556 fsync(32 <unfinished ...>
[pid 27762] 1512837195.015543 <... fsync resumed> ) = 0 <0.081842>
[pid 27762] 1512837195.015769 fsync(42 <unfinished ...>
[pid 27675] 1512837195.067165 <... fsync resumed> ) = 0 <0.083596>
[pid 27675] 1512837195.086955 fsync(31 <unfinished ...>
[pid 27762] 1512837195.129398 <... fsync resumed> ) = 0 <0.113616>
[pid 27762] 1512837195.130201 fsync(40 <unfinished ...>
[pid 27675] 1512837195.189593 <... fsync resumed> ) = 0 <0.102624>
[pid 27675] 1512837195.189890 fsync(32 <unfinished ...>
[pid 27762] 1512837195.240467 <... fsync resumed> ) = 0 <0.110255>
[pid 27762] 1512837195.240839 fsync(42 <unfinished ...>
[pid 27675] 1512837195.292182 <... fsync resumed> ) = 0 <0.102274>


With nobarrier:

[root at host:~]# mount -o remount,nobarrier /
[root at host:~]# time strace -fp "$(pidof glusterfsd)" -e fsync -tttT
[pid 27543] 1512837034.233502 fsync(33) = 0 <0.000556>
[pid 27543] 1512837034.234154 fsync(34) = 0 <0.000271>
[pid 27543] 1512837034.237067 fsync(33) = 0 <0.005847>
[pid 27543] 1512837034.243019 fsync(36) = 0 <0.000278>
[pid 27543] 1512837034.243329 fsync(34) = 0 <0.008456>
[pid 27543] 1512837034.251897 fsync(33) = 0 <0.000275>
[pid 27543] 1512837034.252211 fsync(35) = 0 <0.008275>
[pid 27543] 1512837034.260567 fsync(33) = 0 <0.000305>
[pid 27543] 1512837034.260925 fsync(32) = 0 <0.008355>
[pid 27543] 1512837034.269354 fsync(33) = 0 <0.000272>
[pid 27543] 1512837034.270136 fsync(32) = 0 <0.008022>
[pid 27543] 1512837034.278313 fsync(33) = 0 <0.000325>
[pid 27675] 1512837034.297006 fsync(32) = 0 <0.000559>
[pid 27675] 1512837034.297675 fsync(33) = 0 <0.000264>
[pid 27675] 1512837034.301958 fsync(32 <unfinished ...>
 <unfinished ...>
[pid 27675] 1512837034.322711 <... fsync resumed> ) = 0 <0.020749>
[pid 27675] 1512837034.322951 fsync(35 <unfinished ...>
[pid 27762] 1512837034.322960 <... fsync resumed> ) = 0 <0.015717>
[pid 27762] 1512837034.323212 fsync(42 <unfinished ...>
[pid 27675] 1512837034.323232 <... fsync resumed> ) = 0 <0.000276>
[pid 27675] 1512837034.323254 fsync(33 <unfinished ...>
[pid 27762] 1512837034.323459 <... fsync resumed> ) = 0 <0.000242>
[pid 27762] 1512837034.326539 fsync(41 <unfinished ...>
[pid 27675] 1512837034.332323 <... fsync resumed> ) = 0 <0.009061>
[pid 27762] 1512837034.332331 <... fsync resumed> ) = 0 <0.005789>
[pid 27675] 1512837034.332428 fsync(32 <unfinished ...>
[pid 27762] 1512837034.332440 fsync(44 <unfinished ...>
[pid 27675] 1512837034.332785 <... fsync resumed> ) = 0 <0.000349>
[pid 27675] 1512837034.332809 fsync(34 <unfinished ...>
[pid 27762] 1512837034.332817 <... fsync resumed> ) = 0 <0.000374>
[pid 27762] 1512837034.332840 fsync(42 <unfinished ...>
[pid 27675] 1512837034.341147 <... fsync resumed> ) = 0 <0.008331>
[pid 27762] 1512837034.341159 <... fsync resumed> ) = 0 <0.008316>
[pid 27675] 1512837034.341198 fsync(32 <unfinished ...>
[pid 27762] 1512837034.341847 fsync(41 <unfinished ...>
[pid 27675] 1512837034.341892 <... fsync resumed> ) = 0 <0.000690>
[pid 27675] 1512837034.341914 fsync(31 <unfinished ...>
[pid 27762] 1512837034.342140 <... fsync resumed> ) = 0 <0.000289>
[pid 27762] 1512837034.342180 fsync(43 <unfinished ...>
[pid 27675] 1512837034.350519 <... fsync resumed> ) = 0 <0.008599>
[pid 27762] 1512837034.350530 <... fsync resumed> ) = 0 <0.008348>
[pid 27675] 1512837034.350572 fsync(32 <unfinished ...>
[pid 27762] 1512837034.350599 fsync(41 <unfinished ...>
[pid 27675] 1512837034.350878 <... fsync resumed> ) = 0 <0.000302>
[pid 27762] 1512837034.350884 <... fsync resumed> ) = 0 <0.000284>
[pid 27762] 1512837034.350926 fsync(40 <unfinished ...>
[pid 27675] 1512837034.351361 fsync(31) = 0 <0.008223>
[pid 27762] 1512837034.359596 <... fsync resumed> ) = 0 <0.008668>
[pid 27675] 1512837034.359649 fsync(32 <unfinished ...>
[pid 27762] 1512837034.359672 fsync(41 <unfinished ...>
[pid 27675] 1512837034.359956 <... fsync resumed> ) = 0 <0.000300>
[pid 27762] 1512837034.359980 <... fsync resumed> ) = 0 <0.000307>
) = 0 <0.007432>
[pid 27762] 1512837034.368545 fsync(41) = 0 <0.000251>


You can see that without nobarrier, the fsync delays add up to the latency I
observe.

Is it really necessary that gluster fsync()s e.g. FD 33 8 times in a row to
execute a `volume set`?

Would it be possible to execut an fsync only at the end of the operation, to
make `gluster volume set` 10x faster?

-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.


More information about the Bugs mailing list