[Gluster-users] Confusing lstat() performance
Sam McLeod
mailinglists at smcleod.net
Fri Sep 15 00:45:58 UTC 2017
Hi Niklas,
Out of interest, have you tried testing performance with performance.stat-prefetch enabled?
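Something like this should let you toggle it for a quick before/after comparison (just a sketch; 'myvol' is the volume name from your vol info below):

# gluster volume set myvol performance.stat-prefetch on
(re-run the rsync and bup tests)
# gluster volume set myvol performance.stat-prefetch off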
--
Sam McLeod
@s_mcleod
https://smcleod.net
> On 14 Sep 2017, at 10:42 pm, Niklas Hambüchen <mail at nh2.me> wrote:
>
> Hi,
>
> I have a gluster 3.10 volume with a directory containing ~1 million small
> files, say mounted at /mnt/dir via FUSE, and I'm observing something weird:
>
> When I list and stat them all using rsync, the lstat() calls that rsync
> makes are incredibly fast (23 microseconds per call on average, which is
> definitely faster than a network roundtrip between my three brick
> machines connected via Ethernet).
> But when I try to back them up with the `bup` tool
> (https://github.com/bup/bup), which (at least according to strace) makes
> the same syscalls as rsync to stat all files, each lstat() call takes
> 1700 microseconds and the total time to walk all files is 50x higher.
>
> These are my strace statistics:
>
> rsync strace:
>
> $ strace -f -w -c rsync -a --dry-run /mnt/dir /tmp/nonexisting
>
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  52.35   69.682773       26617      2618         1 select
>  39.60   52.712292        1056     49907           getdents
>   8.00   10.655949          11    998102           lstat
>   0.02    0.022890          12      1900           read
>   0.01    0.017219          21       829           write
>   0.01    0.012093          14       868           munmap
>   0.01    0.006656          11       606           mmap
>   0.00    0.004168        1389         3           readlink
>   0.00    0.001796        1796         1           chdir
>   0.00    0.001019         510         2           clone
>   0.00    0.000841          44        19        13 open
>
> Took ~50 seconds real time to complete.
>
> bup strace (I interrupted it after a while):
>
> strace: Process 10749 attached
> strace: Process 10750 attached
> Indexing: 25600 (566 paths/s)
> ^C
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  89.55 1140.837325        1700    671016           lstat
>   3.92   49.875376         934     53387           getdents
>   3.58   45.655985       52238       874           read
>   2.14   27.293944         789     34588      5799 open
>   0.57    7.266342         141     51384           llistxattr
>   0.09    1.090689          42     25692     25692 getxattr
>   0.06    0.780977          26     29825      1019 close
>   0.05    0.601806          23     25739     25722 ioctl
>   0.03    0.373894        1851       202           select
>   0.00    0.055953          14      3879           brk
>
> real 20m52.150s
> user 0m2.608s
> sys 0m11.456s
>
> Note that I passed `-c -w` to strace so that it measures the wall time
> spent in the syscalls, not system CPU time.
>
> Using
>
> time strace -f -c -w ls -lU /mnt/dir > /dev/null
>
> shows that ls is as fast as rsync (it also completes in ~50 seconds).
>
> (Aside: I've filed a bug about the large number of getdents() calls as
> https://bugzilla.redhat.com/show_bug.cgi?id=1478411, but that's not the
> core question of this email.)
>
> My volume info:
>
> # gluster vol info
>
> Volume Name: myvol
> Type: Replicate
> Volume ID: ...
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: 10.0.0.1:/data/glusterfs/brick
> Brick2: 10.0.0.2:/data/glusterfs/brick
> Brick3: 10.0.0.3:/data/glusterfs/brick
> Options Reconfigured:
> changelog.capture-del-path: on
> changelog.changelog: on
> storage.build-pgfid: on
> nfs.disable: on
> transport.address-family: inet
> client.ssl: on
> server.ssl: on
> storage.linux-aio: on
> performance.io-thread-count: 64
> performance.readdir-ahead: on
> server.event-threads: 32
> client.event-threads: 32
> server.outstanding-rpc-limit: 64
> cluster.lookup-unhashed: auto
> performance.flush-behind: on
> performance.strict-write-ordering: off
> performance.high-prio-threads: 64
> performance.normal-prio-threads: 64
> performance.low-prio-threads: 64
> performance.write-behind-window-size: 10MB
> cluster.ensure-durability: on
> performance.lazy-open: yes
> cluster.use-compound-fops: off
> performance.open-behind: on
> features.cache-invalidation: off
> performance.quick-read: off
> performance.read-ahead: off
> performance.stat-prefetch: off
> changelog.rollover-time: 1
> cluster.self-heal-daemon: enable
>
> Questions:
>
> What could explain why bup's lstat()s are slow and rsync's lstat()s are
> fast?
>
> Also, how come rsync's lstat()s can be faster than a network roundtrip
> at all, given that I have all caching disabled (e.g. stat-prefetch: off,
> cache-invalidation: off)?
>
> Is there some caching going on? Might bup issue its syscalls in an order
> that defeats this caching, while rsync and ls use a favourable order?
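>
> One way I could test that myself (just a sketch, assuming GNU coreutils
> on the client and no newlines in the filenames) is to stat everything
> once in readdir order and once in a shuffled order:
>
> $ cd /mnt/dir
> $ time ls -U | xargs -d '\n' stat > /dev/null          # readdir order
> $ time ls -U | shuf | xargs -d '\n' stat > /dev/null   # shuffled order
>
> If the shuffled run were dramatically slower, that would point at some
> readdir-order locality or caching effect rather than a real network
> roundtrip per lstat().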
>
> Thanks!
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users