<div dir="ltr">Hi all,<div><br><div>I have seen this issue as well, on Gluster 3.12.1 (3 bricks per box, 2 boxes, distributed-replicate). My testing shows the same thing -- running a find on a directory dramatically increases lstat performance. To add another clue, the performance degrades again after issuing a call to reset the system's cache of dentries and inodes:<br></div></div><div><br></div><div># sync; echo 2 > /proc/sys/vm/drop_caches<br></div><div><br></div><div>I think this shows that it's the system cache that's actually doing the heavy lifting here. There are a couple of sysctl tunables that I've found help with this.</div><div><br></div><div>See here:</div><div><br></div><div><a href="http://docs.gluster.org/en/latest/Administrator%20Guide/Linux%20Kernel%20Tuning/">http://docs.gluster.org/en/latest/Administrator%20Guide/Linux%20Kernel%20Tuning/</a><br></div><div><br></div><div>Contrary to what that doc says, I've found that setting vm.vfs_cache_pressure to a low value increases performance by allowing more dentries and inodes to be retained in the cache.</div><div><br></div><div><div># Set the swappiness to avoid swap when possible.</div><div>vm.swappiness = 10</div><div><br></div><div># Set the cache pressure to prefer inode and dentry cache over file cache. 
This is done to keep as many</div><div># dentries and inodes in cache as possible, which dramatically improves gluster small file performance.</div><div>vm.vfs_cache_pressure = 25</div></div><div><br></div><div>For comparison, my config is:</div><div><br></div><div><div>Volume Name: gv0</div><div>Type: Tier</div><div>Volume ID: d490a9ec-f9c8-4f10-a7f3-e1b6d3ced196</div><div>Status: Started</div><div>Snapshot Count: 13</div><div>Number of Bricks: 8</div><div>Transport-type: tcp</div><div>Hot Tier :</div><div>Hot Tier Type : Replicate</div><div>Number of Bricks: 1 x 2 = 2</div><div>Brick1: gluster2:/data/hot_tier/gv0</div><div>Brick2: gluster1:/data/hot_tier/gv0</div><div>Cold Tier:</div><div>Cold Tier Type : Distributed-Replicate</div><div>Number of Bricks: 3 x 2 = 6</div><div>Brick3: gluster1:/data/brick1/gv0</div><div>Brick4: gluster2:/data/brick1/gv0</div><div>Brick5: gluster1:/data/brick2/gv0</div><div>Brick6: gluster2:/data/brick2/gv0</div><div>Brick7: gluster1:/data/brick3/gv0</div><div>Brick8: gluster2:/data/brick3/gv0</div><div>Options Reconfigured:</div><div>performance.cache-max-file-size: 128MB</div><div>cluster.readdir-optimize: on</div><div>cluster.watermark-hi: 95</div><div>features.ctr-sql-db-cachesize: 262144</div><div>cluster.read-freq-threshold: 5</div><div>cluster.write-freq-threshold: 2</div><div>features.record-counters: on</div><div>cluster.tier-promote-frequency: 15000</div><div>cluster.tier-pause: off</div><div>cluster.tier-compact: on</div><div>cluster.tier-mode: cache</div><div>features.ctr-enabled: on</div><div>performance.cache-refresh-timeout: 60</div><div>performance.stat-prefetch: on</div><div>server.outstanding-rpc-limit: 2056</div><div>cluster.lookup-optimize: on</div><div>performance.client-io-threads: off</div><div>nfs.disable: on</div><div>transport.address-family: inet</div><div>features.barrier: disable</div><div>client.event-threads: 4</div><div>server.event-threads: 4</div><div>performance.cache-size: 
1GB</div><div>network.inode-lru-limit: 90000</div><div>performance.md-cache-timeout: 600</div><div>performance.cache-invalidation: on</div><div>features.cache-invalidation-timeout: 600</div><div>features.cache-invalidation: on</div><div>performance.quick-read: on</div><div>performance.io-cache: on</div><div>performance.nfs.write-behind-window-size: 4MB</div><div>performance.write-behind-window-size: 4MB</div><div>performance.nfs.io-threads: off</div><div>network.tcp-window-size: 1048576</div><div>performance.rda-cache-limit: 64MB</div><div>performance.flush-behind: on</div><div>server.allow-insecure: on</div><div>cluster.tier-demote-frequency: 18000</div><div>cluster.tier-max-files: 1000000</div><div>cluster.tier-max-promote-file-size: 10485760</div><div>cluster.tier-max-mb: 64000</div><div>features.ctr-sql-db-wal-autocheckpoint: 2500</div><div>cluster.tier-hot-compact-frequency: 86400</div><div>cluster.tier-cold-compact-frequency: 86400</div><div>performance.readdir-ahead: off</div><div>cluster.watermark-low: 50</div><div>storage.build-pgfid: on</div><div>performance.rda-request-size: 128KB</div><div>performance.rda-low-wmark: 4KB</div><div>cluster.min-free-disk: 5%</div><div>auto-delete: enable</div></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Feb 4, 2018 at 9:44 PM, Amar Tumballi <span dir="ltr"><<a href="mailto:atumball@redhat.com" target="_blank">atumball@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Thanks for the report Artem,<div><br></div><div>Looks like the issue is about cache warming up. 
Specifically, I suspect rsync is doing a 'readdir(), stat(), file operations' loop, whereas when a find or ls is issued, we get a 'readdirp()' request, which returns the stat information along with the entries and also makes sure the cache is up-to-date (at the md-cache layer).</div><div><br></div><div>Note that this is just an off-the-top-of-my-head hypothesis; we surely need to analyse and debug more thoroughly for a proper explanation. Someone on my team will look at it soon.</div><div><br></div><div>Regards,</div><div>Amar </div></div><div class="gmail_extra"><div><div class="h5"><br><div class="gmail_quote">On Mon, Feb 5, 2018 at 7:25 AM, Vlad Kopylov <span dir="ltr"><<a href="mailto:vladkopy@gmail.com" target="_blank">vladkopy@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Are you mounting it to the local bricks?<br>
<br>
struggling with the same performance issues here --<br>
try using this volume setting:<br>
<a href="http://lists.gluster.org/pipermail/gluster-users/2018-January/033397.html" rel="noreferrer" target="_blank">http://lists.gluster.org/pipermail/gluster-users/2018-January/033397.html</a><br>
performance.stat-prefetch: on might be it<br>
<br>
seems like once it gets into the cache it is fast - those stat fetches, which<br>
seem to come from .gluster, are slow<br>
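fwiw, a quick way to see the cache effect from the shell -- DIR below is a placeholder for your gluster mount, and the drop_caches write needs root (it's skipped otherwise):<br>

```shell
# Placeholder: point DIR at the gluster mount (e.g. /mnt/gv0/uploads).
DIR="${DIR:-/tmp}"

# Cold pass: drop cached dentries/inodes first (needs root, skipped otherwise).
if [ -w /proc/sys/vm/drop_caches ]; then
    sync
    echo 2 > /proc/sys/vm/drop_caches
fi
time find "$DIR" -type f -exec stat -c '%n' {} + > /dev/null 2>&1

# Warm pass: the same command again. On a gluster mount the second run should
# be dramatically faster, since the lstat()s are now served from kernel cache.
time find "$DIR" -type f -exec stat -c '%n' {} + > /dev/null 2>&1
```

the second pass is the same workload, only the dentry/inode cache is hot<br>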
<div><div class="m_-4673891776614962460h5"><br>
On Sun, Feb 4, 2018 at 3:45 AM, Artem Russakovskii <<a href="mailto:archon810@gmail.com" target="_blank">archon810@gmail.com</a>> wrote:<br>
> An update, and a very interesting one!<br>
><br>
> After I started stracing rsync, all I could see was lstat calls, quite slow<br>
> ones, over and over, which is expected.<br>
><br>
> For example: lstat("uploads/2016/10/nexus2cee_DSC05339_thumb-161x107.jpg",<br>
> {st_mode=S_IFREG|0664, st_size=4043, ...}) = 0<br>
><br>
> I googled around and found<br>
> <a href="https://gist.github.com/nh2/1836415489e2132cf85ed3832105fcc1" rel="noreferrer" target="_blank">https://gist.github.com/nh2/1836415489e2132cf85ed3832105fcc1</a>, which is<br>
> seeing this exact issue with gluster, rsync and xfs.<br>
><br>
> Here's the craziest finding so far. If while rsync is running (or right<br>
> before), I run /bin/ls or find on the same gluster dirs, it immediately<br>
> speeds up rsync by a factor of 100 or maybe even 1000. It's absolutely<br>
> insane.<br>
><br>
> I'm stracing the rsync run, and the previously slow lstat calls flood in at<br>
> an incredible speed as soon as ls or find runs. Several hundred files per<br>
> minute (excruciatingly slow) becomes thousands or even tens of thousands of<br>
> files a second.<br>
><br>
> What do you make of this?<br>
><br>
><br>
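(if you want per-call numbers out of that strace, -T records the time spent in each syscall -- RSYNC_PID is a placeholder, and the awk at the end just averages the recorded timings:)<br>

```shell
# RSYNC_PID is a placeholder -- set it to the PID of the running rsync.
RSYNC_PID="${RSYNC_PID:-}"
LOG=/tmp/rsync-lstat.log

# -T appends the time spent in each syscall; -e trace=lstat keeps only lstat.
# (Needs strace installed and permission to ptrace the target process.)
if [ -n "$RSYNC_PID" ]; then
    strace -f -T -e trace=lstat -p "$RSYNC_PID" -o "$LOG"
fi

# Average the per-call latencies from capture lines that end in a timing
# field like "= 0 <0.052310>".
if [ -f "$LOG" ]; then
    awk -F'<' '/lstat/ { gsub(/>/,"",$NF); n++; s += $NF }
        END { if (n) printf "calls=%d avg=%.6fs\n", n, s/n }' "$LOG"
fi
```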
</div></div>> _______________________________________________<br>
> Gluster-users mailing list<br>
> <a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>
> <a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://lists.gluster.org/mailman/listinfo/gluster-users</a><br>
</blockquote></div><br><br clear="all"><div><br></div></div></div><span class="HOEnZb"><font color="#888888">-- <br><div class="m_-4673891776614962460gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div>Amar Tumballi (amarts)<br></div></div></div></div></div>
</font></span></div>
<br></blockquote></div><br></div>