[Gluster-users] [External] Re: file metadata operations performance - gluster 4.1
Davide Obbi
davide.obbi at booking.com
Fri Aug 31 09:52:00 UTC 2018
#gluster vol set VOLNAME group nl-cache --> I didn't know there are groups of
options; after this command the following got set:
performance.nl-cache-timeout: 600
performance.nl-cache: on
performance.parallel-readdir: on
performance.io-thread-count: 64
network.inode-lru-limit: 200000
Note that I had network.inode-lru-limit set to the maximum and it got reduced
to 200000.
Then I added:
performance.nl-cache-positive-entry: on
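Put together, the command sequence was (with VOLNAME standing in for the actual volume name):

```shell
# Apply the nl-cache group profile; this sets nl-cache, its timeout,
# parallel-readdir and related options in one shot
gluster volume set VOLNAME group nl-cache

# nl-cache-positive-entry is not part of the group yet, so set it explicitly
gluster volume set VOLNAME performance.nl-cache-positive-entry on

# Verify what actually got applied
gluster volume info VOLNAME
```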
The volume options:
Options Reconfigured:
performance.nl-cache-timeout: 600
performance.nl-cache: on
performance.nl-cache-positive-entry: on
performance.parallel-readdir: on
performance.io-thread-count: 64
network.inode-lru-limit: 200000
nfs.disable: on
transport.address-family: inet
performance.readdir-ahead: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
performance.md-cache-timeout: 600
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.cache-size: 10GB
network.ping-timeout: 5
diagnostics.client-log-level: WARNING
diagnostics.brick-log-level: WARNING
features.quota: off
features.inode-quota: off
performance.quick-read: on
The untar completed in 8 mins 30 secs.
Increasing network.inode-lru-limit to 1048576, the untar completed in the same
time.
I have attached the gluster profile results of the last test, with
network.inode-lru-limit set to 1048576.
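To eyeball which FOPs dominate in output like the attached profile, the %-latency tables can be ranked with a small shell helper; this is a sketch that assumes the field layout of `gluster volume profile VOLNAME info`, run here against two rows copied from the data below:

```shell
# Rank FOPs by cumulative %-latency, highest first. Each data row in the
# profile table has 9 fields: %-latency, avg/min/max latency (each followed
# by "us"), call count, and the FOP name.
sort_fops() {
  awk 'NF == 9 && $1 ~ /^[0-9.]+$/ { print $1, $9 }' "$1" | sort -rn | head -5
}

# Two sample rows copied from the attached profile output
cat > /tmp/profile-sample.txt <<'EOF'
 13.83 129.04 us 56.01 us 855297.69 us 1544872 FXATTROP
 21.79 395.23 us 84.43 us 2698257.78 us 794395 CREATE
EOF

sort_fops /tmp/profile-sample.txt
```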
I guess the next test will be creating more bricks for the same volume to
have a 2x3. Since I do not see bottlenecks at the disk level and I have
limited hardware at the moment, I will just carve the bricks out as LVs from
the same single-disk VG.
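Carving an extra brick out of the same VG would look roughly like this; the VG name (vg_gluster), LV name, size, and server names are all hypothetical placeholders:

```shell
# Create an LV in the existing volume group and format it for a brick
# (XFS with 512-byte inodes is the commonly recommended brick layout)
lvcreate -L 100G -n brk02 vg_gluster
mkfs.xfs -i size=512 /dev/vg_gluster/brk02
mkdir -p /srv/gfs/test01/brk02
mount /dev/vg_gluster/brk02 /srv/gfs/test01/brk02

# Repeat on each server, then grow the replica-3 volume to 2x3 by adding
# one new brick per server
gluster volume add-brick VOLNAME \
  server1:/srv/gfs/test01/brk02/brick \
  server2:/srv/gfs/test01/brk02/brick \
  server3:/srv/gfs/test01/brk02/brick
```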
Also, I have tried, unsuccessfully, to find a complete list of options with
descriptions; can you point me to one?
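For reference, two commands that enumerate the options (the second shows the values, including defaults, as seen by one specific volume):

```shell
# Describe every settable option with its default value and description
gluster volume set help

# Show all options as currently applied to a given volume
gluster volume get VOLNAME all
```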
thanks
Davide
On Thu, Aug 30, 2018 at 5:47 PM Poornima Gurusiddaiah <pgurusid at redhat.com>
wrote:
> To enable nl-cache, please use the group option instead of a single volume set:
>
> #gluster vol set VOLNAME group nl-cache
>
> This sets a few other things, including timeout, invalidation, etc.
>
> To enable the option Raghavendra mentioned, you'll have to execute it
> explicitly, as it's not part of the group option yet:
>
> #gluster vol set VOLNAME performance.nl-cache-positive-entry on
>
> Also, from past experience, setting the below option has helped
> performance:
>
> # gluster vol set VOLNAME network.inode-lru-limit 200000
>
> Regards,
> Poornima
>
>
> On Thu, Aug 30, 2018, 8:49 PM Raghavendra Gowdappa <rgowdapp at redhat.com>
> wrote:
>
>>
>>
>> On Thu, Aug 30, 2018 at 8:38 PM, Davide Obbi <davide.obbi at booking.com>
>> wrote:
>>
>>> yes, "performance.parallel-readdir on" and 1x3 replica
>>>
>>
>> That's surprising. I thought performance.parallel-readdir would help only
>> when the distribute count is fairly high. This is something worth
>> investigating further.
>>
>>
>>> On Thu, Aug 30, 2018 at 5:00 PM Raghavendra Gowdappa <
>>> rgowdapp at redhat.com> wrote:
>>>
>>>>
>>>>
>>>> On Thu, Aug 30, 2018 at 8:08 PM, Davide Obbi <davide.obbi at booking.com>
>>>> wrote:
>>>>
>>>>> Thanks Amar,
>>>>>
>>>>> i have enabled the negative lookups cache on the volume:
>>>>>
>>>>
>> I think enabling nl-cache-positive-entry might help for untarring or git
>> clone into glusterfs. It's disabled by default. Can you let us know the
>> results?
>>
>> Option: performance.nl-cache-positive-entry
>> Default Value: (null)
>> Description: enable/disable storing of entries that were looked up and
>> found to be present in the volume, thus lookup on a non-existent file is
>> served from the cache
>>
>>
>>>>> Deflating a tar archive (not compressed) of 1.3GB takes approx
>>>>> 9 mins, which can be considered a slight improvement over the previous
>>>>> 12-15; however, it is still not fast enough compared to local disk. The tar
>>>>> is present on the gluster share/volume and is deflated inside the same
>>>>> folder structure.
>>>>>
>>>>
>>>> I am assuming this is with parallel-readdir enabled, right?
>>>>
>>>>
>>>>> Running the operation twice (without removing the already deflated
>>>>> files) also did not reduce the time spent.
>>>>>
>>>>> Running the operation with the tar archive on local disk made no
>>>>> difference
>>>>>
>>>>> What really made a huge difference while git cloning was setting
>>>>> "performance.parallel-readdir on". During the "Receiving objects" phase,
>>>>> as soon as I enabled the xlator, throughput jumped from 3-4MB/s to 27MB/s.
>>>>>
>>>>
>>>> What is the distribute count? Is it 1x3 replica?
>>>>
>>>>
>>>>> So, in conclusion, I'm trying to make the untar operation work at an
>>>>> acceptable level; I'm not expecting local disk speed, but at least
>>>>> something within 4 mins.
>>>>>
>>>>> I have attached the profiles collected at the end of the untar
>>>>> operations, with the archive on the mount and outside it.
>>>>>
>>>>> thanks
>>>>> Davide
>>>>>
>>>>>
>>>>> On Tue, Aug 28, 2018 at 8:41 AM Amar Tumballi <atumball at redhat.com>
>>>>> wrote:
>>>>>
>>>>>> One observation we had with git-clone-like workloads was that
>>>>>> nl-cache (negative-lookup cache) helps here.
>>>>>>
>>>>>> Try 'gluster volume set $volume-name nl-cache enable'.
>>>>>>
>>>>>> Also, sharing the 'profile info' during these performance observations
>>>>>> helps us narrow down the situation.
>>>>>>
>>>>>> More on how to capture profile info @
>>>>>> https://hackmd.io/PhhT5jPdQIKxzfeLQmnjJQ?view
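For reference, capturing profile info typically boils down to the following commands (VOLNAME is a placeholder):

```shell
# Start collecting per-FOP latency and call-count statistics on the volume
gluster volume profile VOLNAME start

# ... run the workload under test (untar, git clone, ...) ...

# Dump the cumulative and interval stats, then stop collecting
gluster volume profile VOLNAME info
gluster volume profile VOLNAME stop
```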
>>>>>>
>>>>>> -Amar
>>>>>>
>>>>>>
>>>>>> On Thu, Aug 23, 2018 at 7:11 PM, Davide Obbi <davide.obbi at booking.com
>>>>>> > wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> did anyone ever manage to achieve reasonable waiting times while
>>>>>>> performing metadata-intensive operations such as git clone, untar, etc.?
>>>>>>> Is this a feasible workload, or will it never be in scope for glusterfs?
>>>>>>>
>>>>>>> I'd like to know, if possible, what options affect such a volume's
>>>>>>> performance.
>>>>>>> Albeit I managed to achieve decent git status/git grep times (3
>>>>>>> and 30 secs respectively), git clone and untarring a file from/to the
>>>>>>> same share take ages, for a git repo of approx 6GB.
>>>>>>>
>>>>>>> I'm running a test environment with a 3-way replica, 128GB RAM and 24
>>>>>>> cores at 2.40GHz, one internal SSD dedicated to the volume brick, and a
>>>>>>> 10Gb network.
>>>>>>>
>>>>>>> The options set so far that affect volume performance are:
>>>>>>> performance.readdir-ahead: on
>>>>>>> features.cache-invalidation-timeout: 600
>>>>>>> features.cache-invalidation: on
>>>>>>> performance.md-cache-timeout: 600
>>>>>>> performance.stat-prefetch: on
>>>>>>> performance.cache-invalidation: on
>>>>>>> performance.parallel-readdir: on
>>>>>>> network.inode-lru-limit: 900000
>>>>>>> performance.io-thread-count: 32
>>>>>>> performance.cache-size: 10GB
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Gluster-users mailing list
>>>>>>> Gluster-users at gluster.org
>>>>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Amar Tumballi (amarts)
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Davide Obbi
>>>>> System Administrator
>>>>>
>>>>> Booking.com B.V.
>>>>> Vijzelstraat 66
>>>>> <https://maps.google.com/?q=Vijzelstraat+66&entry=gmail&source=g>-80
>>>>> Amsterdam 1017HL Netherlands
>>>>> Direct +31207031558
>>>>> [image: Booking.com] <https://www.booking.com/>
>>>>> The world's #1 accommodation site
>>>>> 43 languages, 198+ offices worldwide, 120,000+ global destinations,
>>>>> 1,550,000+ room nights booked every day
>>>>> No booking fees, best price always guaranteed
>>>>> Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
>>>>>
>>>>>
>>>>
>>>>
>>>
>>> --
>>> Davide Obbi
>>> System Administrator
>>>
>>>
>>
>
>
--
Davide Obbi
System Administrator
-------------- next part --------------
Brick: glusterserver-1005:/srv/gfs/test01/brk01/brick
-----------------------------------------------------
Cumulative Stats:
Block Size: 1b+ 2b+ 4b+
No. of Reads: 0 0 3
No. of Writes: 294 116 1096
Block Size: 8b+ 16b+ 32b+
No. of Reads: 4 10 7
No. of Writes: 2557 7882 15309
Block Size: 64b+ 128b+ 256b+
No. of Reads: 2 4 7
No. of Writes: 36962 45448 81464
Block Size: 512b+ 1024b+ 2048b+
No. of Reads: 4 10 10
No. of Writes: 172545 242644 248911
Block Size: 4096b+ 8192b+ 16384b+
No. of Reads: 6 4 1
No. of Writes: 328818 219934 113897
Block Size: 32768b+ 65536b+ 131072b+
No. of Reads: 5 3 97791
No. of Writes: 30425 29940 3538
%-latency Avg-latency Min-Latency Max-Latency No. of calls Fop
--------- ----------- ----------- ----------- ------------ ----
0.00 0.00 us 0.00 us 0.00 us 896825 FORGET
0.00 0.00 us 0.00 us 0.00 us 924635 RELEASE
0.00 0.00 us 0.00 us 0.00 us 339641 RELEASEDIR
0.00 29.66 us 21.47 us 43.39 us 3 IPC
0.00 79.07 us 65.72 us 87.75 us 3 OPEN
0.00 134.74 us 101.38 us 164.15 us 4 XATTROP
0.00 59.56 us 30.56 us 119.36 us 28 STAT
0.00 147.49 us 8.67 us 20057.07 us 474 GETXATTR
0.01 201.57 us 122.66 us 831.05 us 708 READDIR
0.02 53.68 us 19.37 us 2509.50 us 4269 STATFS
0.55 61.48 us 1.23 us 4126.95 us 129250 OPENDIR
0.69 236.16 us 61.19 us 450562.83 us 42307 RMDIR
0.93 311.30 us 73.79 us 22016.10 us 43259 READDIRP
0.97 46.57 us 12.83 us 3028.93 us 300094 FSTAT
1.05 786.11 us 78.71 us 1608062.77 us 19308 SYMLINK
1.12 79.61 us 34.32 us 18196.28 us 202935 SETXATTR
1.91 34.59 us 8.54 us 3191.28 us 794395 FLUSH
2.51 356.26 us 102.22 us 937208.95 us 101455 MKDIR
3.85 35.95 us 9.20 us 98426.35 us 1544860 FINODELK
4.26 1558.86 us 34.30 us 2179743.97 us 39390 READ
5.98 82.55 us 23.84 us 33131.27 us 1044119 WRITE
6.64 91.22 us 27.44 us 106627.80 us 1049164 SETATTR
7.59 218.99 us 56.19 us 792203.93 us 499616 UNLINK
8.59 132.48 us 13.22 us 181848.34 us 933898 LOOKUP
8.73 39.29 us 6.58 us 105988.10 us 3201734 ENTRYLK
8.96 42.81 us 8.65 us 3840.32 us 3014442 INODELK
13.83 129.04 us 56.01 us 855297.69 us 1544872 FXATTROP
21.79 395.23 us 84.43 us 2698257.78 us 794395 CREATE
0.00 0.00 us 0.00 us 0.00 us 757637 UPCALL
0.00 0.00 us 0.00 us 0.00 us 3 CI_IATT
0.00 0.00 us 0.00 us 0.00 us 757634 CI_FORGET
Duration: 75917 seconds
Data Read: 12818311007 bytes
Data Written: 12043293853 bytes
Interval 54 Stats:
%-latency Avg-latency Min-Latency Max-Latency No. of calls Fop
--------- ----------- ----------- ----------- ------------ ----
15.47 22.05 us 22.05 us 22.05 us 1 INODELK
30.23 43.10 us 43.10 us 43.10 us 1 STATFS
54.30 77.42 us 77.42 us 77.42 us 1 SETXATTR
Duration: 9 seconds
Data Read: 0 bytes
Data Written: 0 bytes
Brick: glusterserver-1008:/srv/gfs/test01/brk02/brick
-----------------------------------------------------
Cumulative Stats:
Block Size: 1b+ 2b+ 4b+
No. of Reads: 0 0 3
No. of Writes: 294 116 1096
Block Size: 8b+ 16b+ 32b+
No. of Reads: 4 8 7
No. of Writes: 2557 7882 15309
Block Size: 64b+ 128b+ 256b+
No. of Reads: 7 2 3
No. of Writes: 36962 45448 81464
Block Size: 512b+ 1024b+ 2048b+
No. of Reads: 8 6 6
No. of Writes: 172545 242644 248911
Block Size: 4096b+ 8192b+ 16384b+
No. of Reads: 5 3 0
No. of Writes: 328818 219934 113897
Block Size: 32768b+ 65536b+ 131072b+
No. of Reads: 1 0 134
No. of Writes: 30425 29940 3538
%-latency Avg-latency Min-Latency Max-Latency No. of calls Fop
--------- ----------- ----------- ----------- ------------ ----
0.00 0.00 us 0.00 us 0.00 us 896765 FORGET
0.00 0.00 us 0.00 us 0.00 us 924635 RELEASE
0.00 0.00 us 0.00 us 0.00 us 339635 RELEASEDIR
0.00 13.37 us 13.37 us 13.37 us 1 IPC
0.00 66.69 us 53.10 us 83.91 us 3 OPEN
0.00 163.49 us 98.12 us 221.13 us 4 XATTROP
0.00 58.06 us 26.68 us 99.72 us 48 STAT
0.00 111.60 us 8.91 us 10140.95 us 475 GETXATTR
0.01 180.16 us 120.31 us 827.79 us 708 READDIR
0.02 50.21 us 19.04 us 317.89 us 4269 STATFS
0.58 57.68 us 1.15 us 2371.81 us 129210 OPENDIR
0.74 225.68 us 58.56 us 523496.35 us 42294 RMDIR
0.80 39.83 us 13.41 us 1725.14 us 257666 FSTAT
1.06 312.56 us 79.89 us 21462.77 us 43289 READDIRP
1.20 75.64 us 34.17 us 36849.81 us 202935 SETXATTR
1.29 854.18 us 79.47 us 1818051.47 us 19308 SYMLINK
1.85 29.93 us 8.26 us 3317.58 us 794395 FLUSH
2.33 294.38 us 100.64 us 1317507.92 us 101455 MKDIR
3.78 31.41 us 9.04 us 10312.45 us 1544860 FINODELK
6.06 74.47 us 23.47 us 107814.46 us 1044119 WRITE
7.01 85.65 us 28.44 us 349303.92 us 1049164 SETATTR
7.86 201.82 us 54.81 us 902938.47 us 499515 UNLINK
8.80 35.25 us 6.83 us 17269.82 us 3201467 ENTRYLK
8.99 122.18 us 12.96 us 3501.46 us 943281 LOOKUP
9.05 38.50 us 8.57 us 3244.38 us 3013980 INODELK
14.45 119.98 us 58.56 us 1015242.53 us 1544872 FXATTROP
24.11 389.23 us 88.05 us 3212752.83 us 794395 CREATE
0.00 0.00 us 0.00 us 0.00 us 760755 UPCALL
0.00 0.00 us 0.00 us 0.00 us 4 CI_IATT
0.00 0.00 us 0.00 us 0.00 us 760751 CI_FORGET
Duration: 75916 seconds
Data Read: 17721794 bytes
Data Written: 12043293853 bytes
Interval 54 Stats:
%-latency Avg-latency Min-Latency Max-Latency No. of calls Fop
--------- ----------- ----------- ----------- ------------ ----
20.95 42.06 us 42.06 us 42.06 us 1 STATFS
21.97 44.10 us 44.10 us 44.10 us 1 STAT
24.05 48.29 us 48.29 us 48.29 us 1 INODELK
33.03 66.32 us 66.32 us 66.32 us 1 SETXATTR
Duration: 9 seconds
Data Read: 0 bytes
Data Written: 0 bytes
Brick: glusterserver-1009:/srv/gfs/test01/brk03/brick
-----------------------------------------------------
Cumulative Stats:
Block Size: 1b+ 2b+ 4b+
No. of Reads: 0 0 2
No. of Writes: 294 116 1096
Block Size: 8b+ 16b+ 32b+
No. of Reads: 3 6 6
No. of Writes: 2557 7882 15309
Block Size: 64b+ 128b+ 256b+
No. of Reads: 4 8 4
No. of Writes: 36962 45448 81464
Block Size: 512b+ 1024b+ 2048b+
No. of Reads: 2 5 10
No. of Writes: 172545 242644 248911
Block Size: 4096b+ 8192b+ 16384b+
No. of Reads: 8 5 2
No. of Writes: 328818 219934 113897
Block Size: 32768b+ 65536b+ 131072b+
No. of Reads: 1 0 0
No. of Writes: 30425 29940 3538
%-latency Avg-latency Min-Latency Max-Latency No. of calls Fop
--------- ----------- ----------- ----------- ------------ ----
0.00 0.00 us 0.00 us 0.00 us 896824 FORGET
0.00 0.00 us 0.00 us 0.00 us 924635 RELEASE
0.00 0.00 us 0.00 us 0.00 us 339635 RELEASEDIR
0.00 10.74 us 10.74 us 10.74 us 1 IPC
0.00 161.89 us 161.89 us 161.89 us 1 READ
0.00 60.71 us 49.68 us 72.11 us 3 OPEN
0.00 133.84 us 101.36 us 221.90 us 4 XATTROP
0.00 55.63 us 22.18 us 114.32 us 33 STAT
0.00 107.00 us 8.75 us 10290.51 us 476 GETXATTR
0.01 183.07 us 118.05 us 814.73 us 708 READDIR
0.02 49.79 us 17.32 us 316.04 us 4269 STATFS
0.56 54.30 us 1.06 us 2465.17 us 129212 OPENDIR
0.75 220.71 us 59.96 us 362941.28 us 42295 RMDIR
0.81 39.33 us 13.09 us 2392.57 us 256957 FSTAT
1.15 338.92 us 83.33 us 19915.34 us 42509 READDIRP
1.22 74.94 us 34.92 us 16300.31 us 202935 SETXATTR
1.45 934.84 us 78.16 us 2381641.79 us 19308 SYMLINK
1.89 29.76 us 8.42 us 2740.79 us 794395 FLUSH
2.12 261.21 us 104.18 us 543133.73 us 101455 MKDIR
3.86 31.19 us 9.12 us 13211.42 us 1544860 FINODELK
6.13 73.29 us 24.00 us 16058.39 us 1044119 WRITE
6.98 83.07 us 27.11 us 20891.54 us 1049164 SETATTR
7.69 192.35 us 57.56 us 1016426.04 us 499525 UNLINK
8.90 34.72 us 7.26 us 34002.85 us 3201497 ENTRYLK
9.12 120.69 us 12.89 us 179277.82 us 943388 LOOKUP
9.20 38.10 us 8.54 us 3504.33 us 3014388 INODELK
14.47 116.99 us 57.00 us 74415.45 us 1544872 FXATTROP
23.67 372.07 us 86.98 us 2749754.20 us 794395 CREATE
0.00 0.00 us 0.00 us 0.00 us 760911 UPCALL
0.00 0.00 us 0.00 us 0.00 us 2 CI_IATT
0.00 0.00 us 0.00 us 0.00 us 760909 CI_FORGET
Duration: 75916 seconds
Data Read: 232191 bytes
Data Written: 12043293853 bytes
Interval 54 Stats:
%-latency Avg-latency Min-Latency Max-Latency No. of calls Fop
--------- ----------- ----------- ----------- ------------ ----
22.83 44.61 us 44.61 us 44.61 us 1 INODELK
34.35 67.11 us 67.11 us 67.11 us 1 SETXATTR
42.81 83.64 us 83.64 us 83.64 us 1 STATFS
Duration: 9 seconds
Data Read: 0 bytes
Data Written: 0 bytes
-------------- next part --------------
A non-text attachment was scrubbed...
Name: io-stats-pre-netmax.txt.test01
Type: application/octet-stream
Size: 34260 bytes
Desc: not available
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20180831/edc73771/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: io-stats-post-netmax.txt.test01
Type: application/octet-stream
Size: 36776 bytes
Desc: not available
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20180831/edc73771/attachment-0001.obj>