[Gluster-devel] following up on the work underway for improvements in ls-l - looking for the data on the test runs

Wed Feb 26 06:57:38 UTC 2020

Hi,

Below is the volume configuration that we used to test the patch.

testvol
Type: Distributed-Replicate
Volume ID: 9aa40921-c5ae-416e-a7d2-0a82d53a8b4d
Status: Started
Snapshot Count: 0
Number of Bricks: 24 x 3 = 72
Transport-type: tcp
Bricks:
Brick1: host003.addr.com:/gluster/brick1/testvol
Brick2: host005.addr.com:/gluster/brick1/testvol
Brick3: host008.addr.com:/gluster/brick1/testvol
Brick4: host011.addr.com:/gluster/brick1/testvol
Brick5: host015.addr.com:/gluster/brick1/testvol
Brick6: host016.addr.com:/gluster/brick1/testvol
Brick7: host003.addr.com:/gluster/brick2/testvol
Brick8: host005.addr.com:/gluster/brick2/testvol
Brick9: host008.addr.com:/gluster/brick2/testvol
Brick10: host011.addr.com:/gluster/brick2/testvol
Brick11: host015.addr.com:/gluster/brick2/testvol
Brick12: host016.addr.com:/gluster/brick2/testvol
Brick13: host003.addr.com:/gluster/brick3/testvol
Brick14: host005.addr.com:/gluster/brick3/testvol
Brick15: host008.addr.com:/gluster/brick3/testvol
Brick16: host011.addr.com:/gluster/brick3/testvol
Brick17: host015.addr.com:/gluster/brick3/testvol
Brick18: host016.addr.com:/gluster/brick3/testvol
Brick19: host003.addr.com:/gluster/brick4/testvol
Brick20: host005.addr.com:/gluster/brick4/testvol
Brick21: host008.addr.com:/gluster/brick4/testvol
Brick22: host011.addr.com:/gluster/brick4/testvol
Brick23: host015.addr.com:/gluster/brick4/testvol
Brick24: host016.addr.com:/gluster/brick4/testvol
Brick25: host003.addr.com:/gluster/brick5/testvol
Brick26: host005.addr.com:/gluster/brick5/testvol
Brick27: host008.addr.com:/gluster/brick5/testvol
Brick28: host011.addr.com:/gluster/brick5/testvol
Brick29: host015.addr.com:/gluster/brick5/testvol
Brick30: host016.addr.com:/gluster/brick5/testvol
Brick31: host003.addr.com:/gluster/brick6/testvol
Brick32: host005.addr.com:/gluster/brick6/testvol
Brick33: host008.addr.com:/gluster/brick6/testvol
Brick34: host011.addr.com:/gluster/brick6/testvol
Brick35: host015.addr.com:/gluster/brick6/testvol
Brick36: host016.addr.com:/gluster/brick6/testvol
Brick37: host003.addr.com:/gluster/brick7/testvol
Brick38: host005.addr.com:/gluster/brick7/testvol
Brick39: host008.addr.com:/gluster/brick7/testvol
Brick40: host011.addr.com:/gluster/brick7/testvol
Brick41: host015.addr.com:/gluster/brick7/testvol
Brick42: host016.addr.com:/gluster/brick7/testvol
Brick43: host003.addr.com:/gluster/brick8/testvol
Brick44: host005.addr.com:/gluster/brick8/testvol
Brick45: host008.addr.com:/gluster/brick8/testvol
Brick46: host011.addr.com:/gluster/brick8/testvol
Brick47: host015.addr.com:/gluster/brick8/testvol
Brick48: host016.addr.com:/gluster/brick8/testvol
Brick49: host003.addr.com:/gluster/brick9/testvol
Brick50: host005.addr.com:/gluster/brick9/testvol
Brick51: host008.addr.com:/gluster/brick9/testvol
Brick52: host011.addr.com:/gluster/brick9/testvol
Brick53: host015.addr.com:/gluster/brick9/testvol
Brick54: host016.addr.com:/gluster/brick9/testvol
Brick55: host003.addr.com:/gluster/brick10/testvol
Brick56: host005.addr.com:/gluster/brick10/testvol
Brick57: host008.addr.com:/gluster/brick10/testvol
Brick58: host011.addr.com:/gluster/brick10/testvol
Brick59: host015.addr.com:/gluster/brick10/testvol
Brick60: host016.addr.com:/gluster/brick10/testvol
Brick61: host003.addr.com:/gluster/brick11/testvol
Brick62: host005.addr.com:/gluster/brick11/testvol
Brick63: host008.addr.com:/gluster/brick11/testvol
Brick64: host011.addr.com:/gluster/brick11/testvol
Brick65: host015.addr.com:/gluster/brick11/testvol
Brick66: host016.addr.com:/gluster/brick11/testvol
Brick67: host003.addr.com:/gluster/brick12/testvol
Brick68: host005.addr.com:/gluster/brick12/testvol
Brick69: host008.addr.com:/gluster/brick12/testvol
Brick70: host011.addr.com:/gluster/brick12/testvol
Brick71: host015.addr.com:/gluster/brick12/testvol
Brick72: host016.addr.com:/gluster/brick12/testvol
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
server.event-threads: 4
client.event-threads: 4
cluster.lookup-optimize: on
network.inode-lru-limit: 200000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: off
performance.client-io-threads: off

Regarding size specifically: I untarred linux.tar 600 times. Each linux.tar
contains almost 65k files, so the total is around 39M files and the size of
the volume is around 600G (each tar is around 1G).
It would be good if we could run the same test for a longer duration, around
6000 times.
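
As a rough check of the numbers above, here is a minimal sketch of the arithmetic. It assumes the rounded figures from this mail (65k files and 1G per tar) and that the 600 runs come from 6 clients doing 100 iterations each, as in the quoted script; the variable names are only illustrative.

```shell
#!/bin/sh
# Rounded figures from this thread; the real per-tar counts differ slightly.
clients=6
iterations=100
files_per_tar=65000   # "almost 65k files" per linux.tar
gib_per_tar=1         # "around 1G" per tar

runs=$((clients * iterations))
total_files=$((runs * files_per_tar))
total_gib=$((runs * gib_per_tar))
echo "runs: $runs, files: $total_files, size: ${total_gib}G"

# Projection for the proposed longer run of around 6000 untars.
long_runs=6000
long_files=$((long_runs * files_per_tar))
echo "files at $long_runs runs: $long_files"
```

So the proposed longer run would put roughly ten times as many entries under .glusterfs, on the order of 390M files.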

On Tue, Feb 25, 2020 at 1:31 PM Mohit Agrawal <moagrawa at redhat.com> wrote:

> With these 2 changes, we are getting a good improvement in file creation
> and a slight improvement in the "ls -l" operation.
>
> We are still working to improve this further.
>
> To validate this, we ran the script below from 6 different clients on a
> 24x3 distributed-replicate environment after enabling the
> performance-related options:
>
> mkdir /gluster-mount/`hostname`
> date;
> for i in {1..100}
> do
> echo "directory $i is created" `date`
> mkdir /gluster-mount/`hostname`/dir$i
> tar -xvf /root/kernel_src/linux-5.4-rc8.tar.xz -C \
>     /gluster-mount/`hostname`/dir$i >/dev/null
> done
>
> Without the patch:
> tar was taking almost 36-37 hours
>
> With the patch:
> tar is taking almost 26 hours
>
> We saw a similar improvement with the smallfile tool as well.
>
> On Tue, Feb 25, 2020 at 1:29 PM Mohit Agrawal <moagrawa at redhat.com> wrote:
>
>> Hi,
>> We observed that performance is mainly hurt when .glusterfs holds a huge
>> amount of data. Before executing a fop, the POSIX xlator builds an
>> internal path based on the GFID. To validate the path it calls the (l)stat
>> system call, and when .glusterfs is heavily loaded the kernel takes time
>> to look up the inode, so performance drops.
>> To improve this we tried two things with this patch
>> (https://review.gluster.org/#/c/glusterfs/+/23783/):
>>
>> 1) To keep the first-level entries always in the cache so that inode
>> lookup is faster, we keep the first-level fds (00 to ff, 256 in total)
>> open per brick from the time the brick process starts. Even in case of
>> cache cleanup, the kernel will not evict these first-level entries from
>> the cache, and performance improves.
>>
>> 2) We tried using "at"-based calls (fstatat, readlinkat, etc.), which
>> access a relative path instead of the complete path. These calls also
>> helped to improve performance.
>>
>> Regards,
>> Mohit Agrawal
>>
>>
>>
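
For readers unfamiliar with the GFID-based internal path mentioned in the quoted mail, here is a minimal sketch of the on-disk layout: the first two hex characters of the GFID name the first-level directory (00 to ff) and the next two name the second level. The helper name gfid_to_path and the GFID value are hypothetical, used only for illustration; this is not the POSIX xlator code.

```shell
#!/bin/sh
# Hypothetical helper: print the .glusterfs backend path for a GFID.
gfid_to_path() {
    brick=$1
    gfid=$2
    # First-level directory: GFID characters 1-2 (one of 00..ff).
    first=$(printf '%s' "$gfid" | cut -c1-2)
    # Second-level directory: GFID characters 3-4.
    second=$(printf '%s' "$gfid" | cut -c3-4)
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$first" "$second" "$gfid"
}

# Example with a made-up GFID on one of the bricks listed above.
gfid_to_path /gluster/brick1/testvol 0a1b2c3d-0000-4000-8000-000000000001
```

Every GFID maps into one of only 256 first-level directories, which is why holding those 256 directories open per brick covers every such path.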