<div dir="ltr"><div dir="ltr">Hi,<br></div><div dir="ltr"><br></div><div dir="ltr">Below is the volume configuration that we used to test the patch.<br><br>testvol<br>Type: Distributed-Replicate<br>Volume ID: 9aa40921-c5ae-416e-a7d2-0a82d53a8b4d<br>Status: Started<br>Snapshot Count: 0<br>Number of Bricks: 24 x 3 = 72<br>Transport-type: tcp<br>Bricks:<br>Brick1: host003.addr.com:/gluster/brick1/testvol<br>Brick2: host005.addr.com:/gluster/brick1/testvol<br>Brick3: host008.addr.com:/gluster/brick1/testvol<br>Brick4: host011.addr.com:/gluster/brick1/testvol<br>Brick5: host015.addr.com:/gluster/brick1/testvol<br>Brick6: host016.addr.com:/gluster/brick1/testvol<br>Brick7: host003.addr.com:/gluster/brick2/testvol<br>Brick8: host005.addr.com:/gluster/brick2/testvol<br>Brick9: host008.addr.com:/gluster/brick2/testvol<br>Brick10: host011.addr.com:/gluster/brick2/testvol<br>Brick11: host015.addr.com:/gluster/brick2/testvol<br>Brick12: host016.addr.com:/gluster/brick2/testvol<br>Brick13: host003.addr.com:/gluster/brick3/testvol<br>Brick14: host005.addr.com:/gluster/brick3/testvol<br>Brick15: host008.addr.com:/gluster/brick3/testvol<br>Brick16: host011.addr.com:/gluster/brick3/testvol<br>Brick17: host015.addr.com:/gluster/brick3/testvol<br>Brick18: host016.addr.com:/gluster/brick3/testvol<br>Brick19: host003.addr.com:/gluster/brick4/testvol<br>Brick20: host005.addr.com:/gluster/brick4/testvol<br>Brick21: host008.addr.com:/gluster/brick4/testvol<br>Brick22: host011.addr.com:/gluster/brick4/testvol<br>Brick23: host015.addr.com:/gluster/brick4/testvol<br>Brick24: host016.addr.com:/gluster/brick4/testvol<br>Brick25: host003.addr.com:/gluster/brick5/testvol<br>Brick26: host005.addr.com:/gluster/brick5/testvol<br>Brick27: host008.addr.com:/gluster/brick5/testvol<br>Brick28: host011.addr.com:/gluster/brick5/testvol<br>Brick29: host015.addr.com:/gluster/brick5/testvol<br>Brick30: host016.addr.com:/gluster/brick5/testvol<br>Brick31: host003.addr.com:/gluster/brick6/testvol<br>Brick32: host005.addr.com:/gluster/brick6/testvol<br>Brick33: host008.addr.com:/gluster/brick6/testvol<br>Brick34: host011.addr.com:/gluster/brick6/testvol<br>Brick35: host015.addr.com:/gluster/brick6/testvol<br>Brick36: host016.addr.com:/gluster/brick6/testvol<br>Brick37: host003.addr.com:/gluster/brick7/testvol<br>Brick38: host005.addr.com:/gluster/brick7/testvol<br>Brick39: host008.addr.com:/gluster/brick7/testvol<br>Brick40: host011.addr.com:/gluster/brick7/testvol<br>Brick41: host015.addr.com:/gluster/brick7/testvol<br>Brick42: host016.addr.com:/gluster/brick7/testvol<br>Brick43: host003.addr.com:/gluster/brick8/testvol<br>Brick44: host005.addr.com:/gluster/brick8/testvol<br>Brick45: host008.addr.com:/gluster/brick8/testvol<br>Brick46: host011.addr.com:/gluster/brick8/testvol<br>Brick47: host015.addr.com:/gluster/brick8/testvol<br>Brick48: host016.addr.com:/gluster/brick8/testvol<br>Brick49: host003.addr.com:/gluster/brick9/testvol<br>Brick50: host005.addr.com:/gluster/brick9/testvol<br>Brick51: host008.addr.com:/gluster/brick9/testvol<br>Brick52: host011.addr.com:/gluster/brick9/testvol<br>Brick53: host015.addr.com:/gluster/brick9/testvol<br>Brick54: host016.addr.com:/gluster/brick9/testvol<br>Brick55: host003.addr.com:/gluster/brick10/testvol<br>Brick56: host005.addr.com:/gluster/brick10/testvol<br>Brick57: host008.addr.com:/gluster/brick10/testvol<br>Brick58: host011.addr.com:/gluster/brick10/testvol<br>Brick59: host015.addr.com:/gluster/brick10/testvol<br>Brick60: host016.addr.com:/gluster/brick10/testvol<br>Brick61: 
Brick62: host005.addr.com:/gluster/brick11/testvol
Brick63: host008.addr.com:/gluster/brick11/testvol
Brick64: host011.addr.com:/gluster/brick11/testvol
Brick65: host015.addr.com:/gluster/brick11/testvol
Brick66: host016.addr.com:/gluster/brick11/testvol
Brick67: host003.addr.com:/gluster/brick12/testvol
Brick68: host005.addr.com:/gluster/brick12/testvol
Brick69: host008.addr.com:/gluster/brick12/testvol
Brick70: host011.addr.com:/gluster/brick12/testvol
Brick71: host015.addr.com:/gluster/brick12/testvol
Brick72: host016.addr.com:/gluster/brick12/testvol
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
server.event-threads: 4
client.event-threads: 4
cluster.lookup-optimize: on
network.inode-lru-limit: 200000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: off
performance.client-io-threads: off

Regarding the data size: I untarred linux.tar 600 times. Each linux.tar contains almost 65k files, so the total number of files is around 39M, and the volume size is around 600G (each extracted tar is around 1G).
It would be good if we could run the same test for a longer duration, say around 6000 iterations.

On Tue, Feb 25, 2020 at 1:31 PM Mohit Agrawal <moagrawa@redhat.com> wrote:

With these two changes we are getting a good improvement in file creation and a slight improvement in the "ls -l" operation. We are still working to improve it further.

To validate this, we executed the script below from 6 different clients on a 24x3 distributed-replicate environment after enabling the performance-related options:

mkdir /gluster-mount/`hostname`
date;
for i in {1..100}
do
echo "directory $i is created" `date`
mkdir /gluster-mount/`hostname`/dir$i
tar -xvf /root/kernel_src/linux-5.4-rc8.tar.xz -C /gluster-mount/`hostname`/dir$i >/dev/null
done

Without the patch, the tar runs took almost 36-37 hours.
With the patch, they take almost 26 hours.

We saw a similar kind of improvement with the smallfile tool as well.

On Tue, Feb 25, 2020 at 1:29 PM Mohit Agrawal <moagrawa@redhat.com> wrote:

Hi,
We observed that performance mainly suffers while .glusterfs holds a huge amount of data. Before executing a fop, the POSIX xlator builds an internal path based on the GFID, and to validate that path it calls the (l)stat system call. While .glusterfs is heavily loaded, the kernel takes time to look up the inode, and performance drops.
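For illustration, here is a minimal sketch of the kind of path construction and stat this describes. The <brick>/.glusterfs/<xx>/<yy>/<gfid> layout is the standard GlusterFS backend layout, but the function and variable names below are made up, not the xlator's actual identifiers:

/* Sketch: build the GFID handle path and validate it with lstat(),
 * the step that becomes slow once the kernel caches no longer hold
 * the .glusterfs dentries/inodes. */
#include <stdio.h>
#include <sys/stat.h>

static int gfid_handle_stat(const char *brick, const char *gfid,
                            struct stat *stbuf)
{
    char path[4096];

    /* "<brick>/.glusterfs/d2/40/d2407c2a-..." from a canonical GFID string */
    snprintf(path, sizeof(path), "%s/.glusterfs/%c%c/%c%c/%s",
             brick, gfid[0], gfid[1], gfid[2], gfid[3], gfid);

    /* Every fop pays for this full-path lookup. */
    return lstat(path, stbuf);
}

int main(void)
{
    struct stat st;

    /* Hypothetical brick path and GFID, for demonstration only. */
    if (gfid_handle_stat("/gluster/brick1/testvol",
                         "d2407c2a-1234-5678-9abc-def012345678", &st) == 0)
        printf("size=%lld\n", (long long)st.st_size);
    return 0;
}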
target="_blank">https://review.gluster.org/#/c/glusterfs/+/23783/</a>)<br><br>1) To keep the first level entry always in a cache so that inode lookup will be faster       we have to keep open first level fds(00 to ff total 256) per brick at the time of starting a brick process. Even in case of cache cleanup kernel will not evict first level fds from the cache and performance will improve<br><br>2) We tried using &quot;at&quot; based call(lstatat,fstatat,readlinat etc) instead of accessing complete path access relative path, these call&#39;s were also helpful to improve performance.<br><div><br></div><div>Regards,</div><div>Mohit Agrawal</div><div><br></div><div><pre style="white-space:pre-wrap;color:rgb(0,0,0)"><br></pre></div></div>
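And a sketch of the second idea: resolve the handle relative to the already-open first-level fd with the *at() family instead of re-walking the complete path (fstatat with AT_SYMLINK_NOFOLLOW is the lstat equivalent). Again, names are illustrative:

/* Sketch: stat ".glusterfs/d2/40/<gfid>" as "40/<gfid>" relative to
 * the open fd for ".glusterfs/d2".  The kernel starts the lookup at a
 * dentry that is guaranteed to be cached, skipping the walk from the
 * root of the brick. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>

static int gfid_handle_fstatat(int first_level_fds[256], const char *gfid,
                               struct stat *stbuf)
{
    unsigned int idx;
    char rel[128];

    sscanf(gfid, "%2x", &idx);                 /* "d2..." -> 0xd2 */
    snprintf(rel, sizeof(rel), "%c%c/%s", gfid[2], gfid[3], gfid);

    return fstatat(first_level_fds[idx], rel, stbuf, AT_SYMLINK_NOFOLLOW);
}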
Regards,
Mohit Agrawal