[Gluster-users] It appears that readdir is not cached for FUSE mounts

Strahil Nikolov hunter86_bg at yahoo.com
Mon Feb 10 15:21:42 UTC 2020


On February 10, 2020 2:25:17 PM GMT+02:00, Matthias Schniedermeyer <matthias-gluster-users at maxcluster.de> wrote:
>Hi
>
>
>I would describe our basic use case for gluster as:
>"data-store for a cold-standby application".
>
>A specific application is installed on 2 hardware machines, and the data
>is kept in sync between the 2 machines by a replica-2 gluster volume.
>(IOW: "RAID 1")
>
>At any one time only 1 machine has the volume mounted and the
>application running. If that machine goes down, the application is
>started on the remaining machine.
>IOW, at any point in time there is only ever 1 "reader & writer"
>running.
>
>I profiled a performance problem we have with this application, which
>unfortunately we can't modify.
>
>The profile shows many "opendir/readdirp/releasedir" cycles. The
>directory in question has about 1000 files, and the application "stalls"
>for several milliseconds every time it decides to do a readdir.
>The volume is mounted via FUSE, and it appears that said operation is
>not cached at all.
>
>To provide a test case I tried to replicate what the application does.
>The problematic operation is emulated nearly perfectly just by running
>"ls .".
>
>I created a script that replicates how we use gluster and demonstrates
>that a FUSE-mount appears to be lacking any caching of readdir.
>
>A word about the test-environment:
>2 identical servers
>Dual Socket Xeon CPU E5-2640 v3 (8 cores, 2.60GHz, HT enabled)
>RAM: 128GB DDR4 ECC (8x16GB)
>Storage: 2TB Intel P3520 PCIe-NVMe-SSD
>Network: Gluster: 10GB/s direct connect (no switch), external: 1Gbit/s
>OS: CentOS 7.7, Installed with "Minimal" ISO, everything: Default
>Up2Date as of: 2020-01-21 (Kernel: 3.10.0-1062.9.1.el7.x86_64)
>SELinux: Disabled
>SSH-Key for 1 -> 2 exchanged
>Gluster 6.7 packages installed via 'centos-release-gluster6'
>
>see attached: gluster-testcase-no-caching-of-dir-operations-for-fuse.sh
>
>The meat of the testcase is this:
>a profile of:
>ls .
>vs:
>ls . . . . . . . . . .
>(10 dots)
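>
>In case it helps, the relevant part of the script boils down to roughly
>this (a sketch only; the volume name "testvol" and the directory path
>are placeholders, not the exact names used in the attached script):
>
>gluster volume profile testvol start
>cd /mnt/testvol/dir-with-1000-files
>gluster volume profile testvol info clear
>ls . > /dev/null
>gluster volume profile testvol info > /root/profile-1-times
>gluster volume profile testvol info clear
>ls . . . . . . . . . . > /dev/null
>gluster volume profile testvol info > /root/profile-10-times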
>
> > cat /root/profile-1-times | grep DIR | head -n 3
> 0.00       0.00 us       0.00 us       0.00 us              1  RELEASEDIR
> 0.27      66.79 us      66.79 us      66.79 us              1  OPENDIR
>98.65   12190.30 us    9390.88 us   14989.73 us              2  READDIRP
>
> > cat /root/profile-10-times | grep DIR | head -n 3
> 0.00       0.00 us       0.00 us       0.00 us             10  RELEASEDIR
> 0.64     108.02 us      85.72 us     131.96 us             10  OPENDIR
>99.36    8388.64 us    5174.71 us   14808.77 us             20  READDIRP
>
>This test case shows perfectly linear scaling:
>10 times the requests results in 10 times the gluster operations.
>
>I would say that ideally there should be no difference in the number of
>gluster operations, regardless of how often a directory is read in a
>short amount of time (with no changes in between).
>
>
>Is there something we can do to enable caching or otherwise improve
>performance?

Hi Matthias,

Have you tried the 'readdir-ahead' option?
According to the docs it is useful for 'improving sequential directory read performance'.
I'm not sure how gluster defines a sequential directory read, but it's worth trying.
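For example, something along these lines (the volume name 'myvol' is just a placeholder for your actual volume):

  gluster volume set myvol performance.readdir-ahead on
  # parallel-readdir builds on readdir-ahead and may also help with large directories
  gluster volume set myvol performance.parallel-readdir on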
Also, you can try metadata caching, as described in:
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html/administration_guide/sect-directory_operations
The actual option group should contain the following:
https://github.com/gluster/glusterfs/blob/master/extras/group-metadata-cache
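If I recall correctly, the whole group can be applied in one go with something like this (again, 'myvol' is a placeholder):

  gluster volume set myvol group metadata-cache
  # verify which md-cache / cache-invalidation options actually got applied
  gluster volume get myvol all | grep -E 'md-cache|cache-invalidation|stat-prefetch'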

Best Regards,
Strahil Nikolov

