[Gluster-users] Extremely slow file listing in folders with many files

Artem Russakovskii archon810 at gmail.com
Thu Apr 30 18:02:00 UTC 2020


One thing I noticed (and reported in another email to the mailing list) is
that during the really slow dir listings (i.e., the first ls), the log file
fills up with hundreds or even thousands of messages like these:
[2020-04-30 17:49:04.844167] I [MSGID: 109063]
[dht-layout.c:659:dht_layout_normalize] 0-SNIP_data1-dht: Found anomalies
in (null) (gfid = c86e39a1-32ef-4eaf-b5a7-a90d73239c5a). Holes=1 overlaps=0

The subsequent ls is much faster, and there are no such log messages.

Could whatever is causing those contribute to the massive slowdown?
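
If it helps with debugging, the messages are easy to count (a rough sketch;
the fuse client log path here is a guess - it's normally derived from the
mount point, so adjust to your setup):

# count the layout-anomaly messages seen so far
grep -c 'MSGID: 109063' /var/log/glusterfs/mnt-SNIP_data1.log
# or watch them accumulate while the slow ls runs
tail -f /var/log/glusterfs/mnt-SNIP_data1.log | grep dht_layout_normalize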

Even the subsequent ls, at 20-40s, is still a couple of orders of magnitude
slower than ls on the xfs brick itself, which takes only ~0.2s.
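
In case anyone wants to reproduce the cold vs. warm numbers, here is a rough
recipe (the fuse mount path is a placeholder, the brick path comes from the
volume info below, and note that drop_caches only clears kernel caches, not
gluster's own md-cache):

sync; echo 3 > /proc/sys/vm/drop_caches                   # drop kernel page/dentry/inode caches
time ls -f /path/to/fuse/mount/SOME_DIR | wc -l           # cold listing on the fuse mount
time ls -f /path/to/fuse/mount/SOME_DIR | wc -l           # warm listing
time ls -f /mnt/SNIP_block1/SNIP_data1/SOME_DIR | wc -l   # same dir directly on a brick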

Sincerely,
Artem

--
Founder, Android Police <http://www.androidpolice.com>, APK Mirror
<http://www.apkmirror.com/>, Illogical Robot LLC
beerpla.net | @ArtemR <http://twitter.com/ArtemR>


On Thu, Apr 30, 2020 at 10:54 AM Artem Russakovskii <archon810 at gmail.com>
wrote:

> getfattr -d -m. -e hex .
> # file: .
> trusted.afr.SNIP_data1-client-0=0x000000000000000000000000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.gfid=0x44b2db00267a47508b2a8a921f20e0f5
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.dht.mds=0x00000000
>
> Sincerely,
> Artem
>
> --
> Founder, Android Police <http://www.androidpolice.com>, APK Mirror
> <http://www.apkmirror.com/>, Illogical Robot LLC
> beerpla.net | @ArtemR <http://twitter.com/ArtemR>
>
>
> On Thu, Apr 30, 2020 at 9:05 AM Felix Kölzow <felix.koelzow at gmx.de> wrote:
>
>> Dear Artem,
>>
>> sorry for the noise, since you already provided the xfs_info.
>>
>> Could you provide the output of
>>
>>
>> getfattr -d -m. -e hex /DirectoryPathOfInterest_onTheBrick/
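>>
>> If it is easier, the same check could be run on all four bricks in one go
>> (just a sketch - hostnames and the brick path are taken from your volume
>> info, and the directory at the end is a placeholder):
>>
>> for h in nexus2 forge hive citadel; do
>>   echo "== $h =="
>>   ssh "$h" getfattr -d -m. -e hex /mnt/SNIP_block1/SNIP_data1/DirectoryPathOfInterest
>> done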
>>
>>
>> Felix
>>
>> On 30/04/2020 18:01, Felix Kölzow wrote:
>>
>> Dear Artem,
>>
>> can you also provide some information w.r.t. your xfs filesystem, i.e. the
>> xfs_info of your block device?
>>
>>
>> Regards,
>>
>> Felix
>> On 30/04/2020 17:27, Artem Russakovskii wrote:
>>
>> Hi Strahil, in the original email I included the times for the first and
>> subsequent reads on both the fuse-mounted gluster volume and the xfs
>> filesystem the gluster data resides on (this is the brick, right?).
>>
>> On Thu, Apr 30, 2020, 7:44 AM Strahil Nikolov <hunter86_bg at yahoo.com>
>> wrote:
>>
>>> On April 30, 2020 4:24:23 AM GMT+03:00, Artem Russakovskii
>>> <archon810 at gmail.com> wrote:
>>> >Hi all,
>>> >
>>> >We have 500GB and 10TB 4x1 replicate xfs-based gluster volumes, and the
>>> >10TB one especially is extremely slow to do certain things with (and has
>>> >been since gluster 3.x when we started). We're currently on 5.13.
>>> >
>>> >The number of files isn't even what I'd consider that great - under 100k
>>> >per dir.
>>> >
>>> >Here are some numbers to look at:
>>> >
>>> >On gluster volume in a dir of 45k files:
>>> >The first time
>>> >
>>> >time find | wc -l
>>> >45423
>>> >real    8m44.819s
>>> >user    0m0.459s
>>> >sys     0m0.998s
>>> >
>>> >And again
>>> >
>>> >time find | wc -l
>>> >45423
>>> >real    0m34.677s
>>> >user    0m0.291s
>>> >sys     0m0.754s
>>> >
>>> >
>>> >If I run the same operation on the xfs block device itself:
>>> >The first time
>>> >
>>> >time find | wc -l
>>> >45423
>>> >real    0m13.514s
>>> >user    0m0.144s
>>> >sys     0m0.501s
>>> >
>>> >And again
>>> >
>>> >time find | wc -l
>>> >45423
>>> >real    0m0.197s
>>> >user    0m0.088s
>>> >sys     0m0.106s
>>> >
>>> >
>>> >I'd expect a performance difference here, but just as it was several
>>> >years ago when we started with gluster, it's still huge, and simple
>>> >file listings are incredibly slow.
>>> >
>>> >At the time, the team was looking to do some optimizations, but I'm not
>>> >sure this has happened.
>>> >
>>> >What can we do to try to improve performance?
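>>> >
>>> >In case it helps, I can also capture per-FOP stats with gluster's
>>> >built-in profiler while the slow listing runs (a sketch):
>>> >
>>> >gluster volume profile SNIP_data1 start
>>> ># run the slow find here, then:
>>> >gluster volume profile SNIP_data1 info
>>> >gluster volume profile SNIP_data1 stop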
>>> >
>>> >Thank you.
>>> >
>>> >
>>> >
>>> >Some setup values follow.
>>> >
>>> >xfs_info /mnt/SNIP_block1
>>> >meta-data=/dev/sdc               isize=512    agcount=103, agsize=26214400 blks
>>> >         =                       sectsz=512   attr=2, projid32bit=1
>>> >         =                       crc=1        finobt=1, sparse=0, rmapbt=0
>>> >         =                       reflink=0
>>> >data     =                       bsize=4096   blocks=2684354560, imaxpct=25
>>> >         =                       sunit=0      swidth=0 blks
>>> >naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
>>> >log      =internal log           bsize=4096   blocks=51200, version=2
>>> >         =                       sectsz=512   sunit=0 blks, lazy-count=1
>>> >realtime =none                   extsz=4096   blocks=0, rtextents=0
>>> >
>>> >Volume Name: SNIP_data1
>>> >Type: Replicate
>>> >Volume ID: SNIP
>>> >Status: Started
>>> >Snapshot Count: 0
>>> >Number of Bricks: 1 x 4 = 4
>>> >Transport-type: tcp
>>> >Bricks:
>>> >Brick1: nexus2:/mnt/SNIP_block1/SNIP_data1
>>> >Brick2: forge:/mnt/SNIP_block1/SNIP_data1
>>> >Brick3: hive:/mnt/SNIP_block1/SNIP_data1
>>> >Brick4: citadel:/mnt/SNIP_block1/SNIP_data1
>>> >Options Reconfigured:
>>> >cluster.quorum-count: 1
>>> >cluster.quorum-type: fixed
>>> >network.ping-timeout: 5
>>> >network.remote-dio: enable
>>> >performance.rda-cache-limit: 256MB
>>> >performance.readdir-ahead: on
>>> >performance.parallel-readdir: on
>>> >network.inode-lru-limit: 500000
>>> >performance.md-cache-timeout: 600
>>> >performance.cache-invalidation: on
>>> >performance.stat-prefetch: on
>>> >features.cache-invalidation-timeout: 600
>>> >features.cache-invalidation: on
>>> >cluster.readdir-optimize: on
>>> >performance.io-thread-count: 32
>>> >server.event-threads: 4
>>> >client.event-threads: 4
>>> >performance.read-ahead: off
>>> >cluster.lookup-optimize: on
>>> >performance.cache-size: 1GB
>>> >cluster.self-heal-daemon: enable
>>> >transport.address-family: inet
>>> >nfs.disable: on
>>> >performance.client-io-threads: on
>>> >cluster.granular-entry-heal: enable
>>> >cluster.data-self-heal-algorithm: full
>>> >
>>> >Sincerely,
>>> >Artem
>>> >
>>> >--
>>> >Founder, Android Police <http://www.androidpolice.com>, APK Mirror
>>> ><http://www.apkmirror.com/>, Illogical Robot LLC
>>> >beerpla.net | @ArtemR <http://twitter.com/ArtemR>
>>>
>>> Hi Artem,
>>>
>>> Have you checked the same on the brick level? How big is the difference?
>>>
>>> Best Regards,
>>> Strahil Nikolov
>>>
>>
>> ________
>>
>>
>>
>> Community Meeting Calendar:
>>
>> Schedule -
>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>> Bridge: https://bluejeans.com/441850968
>>
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>