[Bugs] [Bug 1478411] Directory listings on fuse mount are very slow due to small number of getdents () entries

bugzilla at redhat.com
Fri Sep 15 13:53:27 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1478411

nh2 <nh2-redhatbugzilla at deditus.de> changed:

           What    |Removed                                   |Added
----------------------------------------------------------------------------
              Flags|needinfo?(nh2-redhatbugzilla at deditus.de) |needinfo?



--- Comment #8 from nh2 <nh2-redhatbugzilla at deditus.de> ---
Thanks for your very insightful reply!

I think we should definitely bring this up as a FUSE issue; I can imagine that
gluster and other FUSE-based software would perform much better if they
weren't limited to 4 KB per syscall.

Do you know where this PAGE_SIZE limit is implemented, or would you even be
able to file this issue? I know very little about fuse internals and don't yet
feel prepared to write a high-quality issue report on this topic.

> each getdent call is not a network round trip like you mentioned in initial comment

You are right: as a test, I just increased latency tenfold with `tc qdisc add
dev eth0 root netem delay 2ms`, and the duration of the getdents() calls stayed
the same (with an occasional, much slower getdents() call when it had to fetch
new data), as shown below.
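
For completeness, the test procedure was roughly the following (with
<gluster-server> as a placeholder for the actual peer):

  $ ping -c 3 <gluster-server>                  # baseline RTT: ~0.2 ms
  $ tc qdisc add dev eth0 root netem delay 2ms  # ~10x the round-trip time
  $ ping -c 3 <gluster-server>                  # now ~2.2 ms
  $ strace -f -T -e getdents ls /fuse-mount/.../mydir > /dev/null
  $ tc qdisc del dev eth0 root                  # restore normal latency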

I had assumed it was a network round trip because the time spent per syscall
roughly matches my LAN round-trip time (0.2 ms), but that just happened to be
how slow the syscalls were, independently of the network.

> I would suggest we close this bug on gluster, and raise it in FUSE kernel?

Personally I would prefer to keep it open until getdents() performance is
fixed: from a Gluster user's perspective, slow directory listings are a
Gluster problem, and the fact that FUSE plays a role in it is an
implementation detail.
Also, as you say, FUSE allowing larger buffer sizes may not be the only thing
needed to improve the performance.

I did a couple more measurements, which suggest that a large constant factor
is still unexplained:

Using `strace -f -T -e getdents` on the example program from `man getdents`
(http://man7.org/linux/man-pages/man2/getdents.2.html), with BUF_SIZE changed
from 1024 to 10240 and to 131072 (128 KB), run against XFS (the brick) and the
fuse mount like this (a condensed version of the program is shown after the
commands):

  $ gcc getdents-listdir.c -O2 -o listdir
  $ strace -f -T -e getdents ./listdir /data/brick/.../mydir > /dev/null
  $ strace -f -T -e getdents ./listdir /fuse-mount/.../mydir > /dev/null
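
For reference, here is a condensed version of that listing program (adapted
from the man page example; it only prints names, and declares struct
linux_dirent by hand because glibc has no public header for it):

  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  #define BUF_SIZE 131072            /* was 1024 in the man page */

  struct linux_dirent {              /* layout used by SYS_getdents */
      unsigned long  d_ino;
      unsigned long  d_off;
      unsigned short d_reclen;       /* total size of this record */
      char           d_name[];       /* NUL-terminated filename */
  };

  int main(int argc, char *argv[])
  {
      static char buf[BUF_SIZE];

      int fd = open(argc > 1 ? argv[1] : ".", O_RDONLY | O_DIRECTORY);
      if (fd == -1) {
          perror("open");
          exit(EXIT_FAILURE);
      }

      for (;;) {
          /* One getdents() syscall per iteration; strace -T times these. */
          long nread = syscall(SYS_getdents, fd, buf, BUF_SIZE);
          if (nread == -1) {
              perror("getdents");
              exit(EXIT_FAILURE);
          }
          if (nread == 0)            /* end of directory */
              break;

          for (long bpos = 0; bpos < nread; ) {
              struct linux_dirent *d = (struct linux_dirent *) (buf + bpos);
              printf("%s\n", d->d_name);
              bpos += d->d_reclen;
          }
      }

      exit(EXIT_SUCCESS);
  }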

With `strace -T`, the value in <brackets> is the time spent in each syscall.

Results for BUF_SIZE = 10240:

gluster fuse: getdents(3, /*  20 entries */, 10240) =  1040 <0.000199>
XFS:          getdents(3, /* 195 entries */, 10240) = 10240 <0.000054>

Results for BUF_SIZE = 131072:

gluster fuse: getdents(3, /*   20 entries */, 131072) =   1040 <0.000199>
XFS:          getdents(3, /* 2498 entries */, 131072) = 131072 <0.000620>

This shows that, almost independent of BUF_SIZE, dividing bytes returned by
time spent in the syscall:

* getdents() performance on XFS is around 190 MB/s
  (e.g. 10240 B / 0.000054 s ≈ 190 MB/s)
* getdents() performance on gluster fuse is around 5 MB/s
  (1040 B / 0.000199 s ≈ 5.2 MB/s)

That's almost a 40x performance difference (and as you say, no networking is
involved).

Even when taking into account the mentioned 5x space overhead of
`fuse_direntplus` vs `linux_dirent`, and assuming that 5x space overhead means
5x increased wall time, a factor of 8x remains unaccounted for (40x / 5x = 8x).
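
As a sanity check on that 5x figure, the fixed per-entry sizes can be compared
directly. A minimal sketch, assuming the kernel uapi header <linux/fuse.h> is
installed (struct linux_dirent is again declared by hand):

  #include <linux/fuse.h>            /* struct fuse_direntplus */
  #include <stdio.h>

  struct linux_dirent {              /* fixed part of a getdents() record */
      unsigned long  d_ino;
      unsigned long  d_off;
      unsigned short d_reclen;
      char           d_name[];
  };

  int main(void)
  {
      /* On x86_64 this prints 152 vs 24 bytes of fixed overhead per entry
         (the name comes on top in both cases), roughly consistent with the
         quoted 5x space overhead. */
      printf("fuse_direntplus: %zu bytes + name\n",
             sizeof(struct fuse_direntplus));
      printf("linux_dirent:    %zu bytes + name\n",
             sizeof(struct linux_dirent));
      return 0;
  }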

Why might an individual getdents() call be that much slower on fuse?
