<div dir="ltr"><div dir="ltr">Hi,<br></div><div dir="ltr"><br></div><div dir="ltr">Below is the volume configuration that we used to test the patch.<br><br>testvol<br>Type: Distributed-Replicate<br>Volume ID: 9aa40921-c5ae-416e-a7d2-0a82d53a8b4d<br>Status: Started<br>Snapshot Count: 0<br>Number of Bricks: 24 x 3 = 72<br>Transport-type: tcp<br>Bricks:<br>Brick1: host003.addr.com:/gluster/brick1/testvol<br>Brick2: host005.addr.com:/gluster/brick1/testvol<br>Brick3: host008.addr.com:/gluster/brick1/testvol<br>Brick4: host011.addr.com:/gluster/brick1/testvol<br>Brick5: host015.addr.com:/gluster/brick1/testvol<br>Brick6: host016.addr.com:/gluster/brick1/testvol<br>Brick7: host003.addr.com:/gluster/brick2/testvol<br>Brick8: host005.addr.com:/gluster/brick2/testvol<br>Brick9: host008.addr.com:/gluster/brick2/testvol<br>Brick10: host011.addr.com:/gluster/brick2/testvol<br>Brick11: host015.addr.com:/gluster/brick2/testvol<br>Brick12: host016.addr.com:/gluster/brick2/testvol<br>Brick13: host003.addr.com:/gluster/brick3/testvol<br>Brick14: host005.addr.com:/gluster/brick3/testvol<br>Brick15: host008.addr.com:/gluster/brick3/testvol<br>Brick16: host011.addr.com:/gluster/brick3/testvol<br>Brick17: host015.addr.com:/gluster/brick3/testvol<br>Brick18: host016.addr.com:/gluster/brick3/testvol<br>Brick19: host003.addr.com:/gluster/brick4/testvol<br>Brick20: host005.addr.com:/gluster/brick4/testvol<br>Brick21: host008.addr.com:/gluster/brick4/testvol<br>Brick22: host011.addr.com:/gluster/brick4/testvol<br>Brick23: host015.addr.com:/gluster/brick4/testvol<br>Brick24: host016.addr.com:/gluster/brick4/testvol<br>Brick25: host003.addr.com:/gluster/brick5/testvol<br>Brick26: host005.addr.com:/gluster/brick5/testvol<br>Brick27: host008.addr.com:/gluster/brick5/testvol<br>Brick28: host011.addr.com:/gluster/brick5/testvol<br>Brick29: host015.addr.com:/gluster/brick5/testvol<br>Brick30: host016.addr.com:/gluster/brick5/testvol<br>Brick31: host003.addr.com:/gluster/brick6/testvol<br>Brick32: host005.addr.com:/gluster/brick6/testvol<br>Brick33: host008.addr.com:/gluster/brick6/testvol<br>Brick34: host011.addr.com:/gluster/brick6/testvol<br>Brick35: host015.addr.com:/gluster/brick6/testvol<br>Brick36: host016.addr.com:/gluster/brick6/testvol<br>Brick37: host003.addr.com:/gluster/brick7/testvol<br>Brick38: host005.addr.com:/gluster/brick7/testvol<br>Brick39: host008.addr.com:/gluster/brick7/testvol<br>Brick40: host011.addr.com:/gluster/brick7/testvol<br>Brick41: host015.addr.com:/gluster/brick7/testvol<br>Brick42: host016.addr.com:/gluster/brick7/testvol<br>Brick43: host003.addr.com:/gluster/brick8/testvol<br>Brick44: host005.addr.com:/gluster/brick8/testvol<br>Brick45: host008.addr.com:/gluster/brick8/testvol<br>Brick46: host011.addr.com:/gluster/brick8/testvol<br>Brick47: host015.addr.com:/gluster/brick8/testvol<br>Brick48: host016.addr.com:/gluster/brick8/testvol<br>Brick49: host003.addr.com:/gluster/brick9/testvol<br>Brick50: host005.addr.com:/gluster/brick9/testvol<br>Brick51: host008.addr.com:/gluster/brick9/testvol<br>Brick52: host011.addr.com:/gluster/brick9/testvol<br>Brick53: host015.addr.com:/gluster/brick9/testvol<br>Brick54: host016.addr.com:/gluster/brick9/testvol<br>Brick55: host003.addr.com:/gluster/brick10/testvol<br>Brick56: host005.addr.com:/gluster/brick10/testvol<br>Brick57: host008.addr.com:/gluster/brick10/testvol<br>Brick58: host011.addr.com:/gluster/brick10/testvol<br>Brick59: host015.addr.com:/gluster/brick10/testvol<br>Brick60: host016.addr.com:/gluster/brick10/testvol<br>Brick61: 
Brick62: host005.addr.com:/gluster/brick11/testvol
Brick63: host008.addr.com:/gluster/brick11/testvol
Brick64: host011.addr.com:/gluster/brick11/testvol
Brick65: host015.addr.com:/gluster/brick11/testvol
Brick66: host016.addr.com:/gluster/brick11/testvol
Brick67: host003.addr.com:/gluster/brick12/testvol
Brick68: host005.addr.com:/gluster/brick12/testvol
Brick69: host008.addr.com:/gluster/brick12/testvol
Brick70: host011.addr.com:/gluster/brick12/testvol
Brick71: host015.addr.com:/gluster/brick12/testvol
Brick72: host016.addr.com:/gluster/brick12/testvol
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
server.event-threads: 4
client.event-threads: 4
cluster.lookup-optimize: on
network.inode-lru-limit: 200000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: off
performance.client-io-threads: off

To be specific about the data size: I untarred linux.tar 600 times. Each linux.tar contains almost 65k files, so the total comes to around 39M files, and the volume fills to around 600G (each tar is around 1G).
It would be good if we could run the same test for a longer duration, say around 6000 iterations.

On Tue, Feb 25, 2020 at 1:31 PM Mohit Agrawal <moagrawa@redhat.com> wrote:

With these two changes we are getting a good improvement in file creation and a slight improvement in the "ls -l" operation. We are still working to improve it further.

To validate this, we executed the script below from 6 different clients on a 24x3 distributed-replicate volume after enabling the performance-related options:

mkdir /gluster-mount/`hostname`
date
for i in {1..100}
do
    echo "directory $i is created" `date`
    mkdir /gluster-mount/`hostname`/dir$i
    tar -xvf /root/kernel_src/linux-5.4-rc8.tar.xz -C /gluster-mount/`hostname`/dir$i >/dev/null
done

Without the patch, the tar runs took almost 36-37 hours. With the patch, they take almost 26 hours. We saw a similar kind of improvement with the smallfile tool as well.

On Tue, Feb 25, 2020 at 1:29 PM Mohit Agrawal <moagrawa@redhat.com> wrote:

Hi,
We observed that performance mainly suffers when .glusterfs holds a huge amount of data. Before executing a fop, the POSIX xlator builds an internal path based on the GFID, and to validate that path it calls the (l)stat system call. When .glusterfs is heavily loaded, the kernel takes longer to look up the inode, and performance drops.
To improve this we tried two things in this patch (https://review.gluster.org/#/c/glusterfs/+/23783/):

1) Keep the first-level .glusterfs entries always in cache so that inode lookup is faster: open the first-level directory fds (00 to ff, 256 in total) per brick when the brick process starts. Even when the kernel cleans up its caches, it will not evict the inodes pinned by these open fds, and performance improves.
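As a minimal sketch of this first idea (hypothetical names, not the code from the patch itself), the brick init path would open the 256 first-level directories once and hold on to the fds:

/* Sketch of idea 1 (hypothetical names, not the patch code). */
#define _GNU_SOURCE             /* for O_PATH */
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>

/* One cached fd per first-level .glusterfs directory, 00 .. ff. */
static int gfid_dirfd[256];

/* Run once at brick start. Each open fd holds a reference on the
 * directory inode, so the kernel cannot evict it from the inode
 * cache even when caches are cleaned up under memory pressure. */
static int
preopen_gfid_dirs(const char *brick_path)
{
    char dirpath[PATH_MAX];

    for (int i = 0; i < 256; i++) {
        snprintf(dirpath, sizeof(dirpath), "%s/.glusterfs/%02x",
                 brick_path, i);
        /* O_PATH: the fd is only an anchor for later *at() calls;
         * nothing is ever read through it. */
        gfid_dirfd[i] = open(dirpath, O_PATH | O_DIRECTORY);
        if (gfid_dirfd[i] < 0)
            return -1;
    }
    return 0;
}

Opening with O_PATH keeps the 256 descriptors cheap: they exist only to pin the directory inodes and to serve as the dirfd base for the *at() calls in the second point.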
2) Use the "at"-based system calls (fstatat with AT_SYMLINK_NOFOLLOW, readlinkat, etc.) so that the lookup resolves a short relative path anchored at those cached fds instead of the complete brick path; these calls were also helpful for performance.
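Continuing the sketch above (again with hypothetical names), a GFID stat can then resolve only the last two path components relative to the cached first-level fd instead of calling lstat() on the full path:

/* Sketch of idea 2, continuing the sketch above. A canonical GFID
 * string such as "9aa40921-c5ae-..." is stored on disk as
 * <brick>/.glusterfs/9a/a4/9aa40921-c5ae-... */
#include <sys/stat.h>

static int
gfid_stat(const char *gfid, struct stat *stbuf)
{
    unsigned int first;
    char relpath[64];

    sscanf(gfid, "%2x", &first);  /* first byte picks the 00..ff bucket */
    snprintf(relpath, sizeof(relpath), "%c%c/%s", gfid[2], gfid[3], gfid);
    /* AT_SYMLINK_NOFOLLOW makes this the lstat() equivalent. */
    return fstatat(gfid_dirfd[first], relpath, stbuf, AT_SYMLINK_NOFOLLOW);
}

Because the kernel now walks only two path components starting from an already-pinned directory inode, the lookup cost no longer depends on re-resolving the brick root and .glusterfs prefix on every fop.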
Regards,
Mohit Agrawal