[Gluster-users] Extremely slow file listing in folders with many files

Artem Russakovskii archon810 at gmail.com
Mon May 18 22:10:40 UTC 2020


Hi,

Does the gluster team have any feedback about this? Resolving the "Found
anomalies" issues may be key to fixing the slow directory listing problem.
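
In case it helps anyone reproduce, a quick way to count those entries
(assuming the default /var/log/glusterfs log location; adjust paths as
needed) would be something like:

# count "Found anomalies" entries per log file
grep -c "Found anomalies" /var/log/glusterfs/*.log /var/log/glusterfs/bricks/*.log

# look at the most recent occurrences for context
grep "Found anomalies" /var/log/glusterfs/*.log | tail -n 20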

Sincerely,
Artem

--
Founder, Android Police <http://www.androidpolice.com>, APK Mirror
<http://www.apkmirror.com/>, Illogical Robot LLC
beerpla.net | @ArtemR <http://twitter.com/ArtemR>


On Thu, Apr 30, 2020 at 10:36 PM Strahil Nikolov <hunter86_bg at yahoo.com>
wrote:

> On April 30, 2020 9:05:19 PM GMT+03:00, Artem Russakovskii <
> archon810 at gmail.com> wrote:
> >I did this on the same prod instance just now.
> >
> >'find' on a fuse gluster dir with 40k+ files:
> >1st run: 3m56.261s
> >2nd run: 0m24.970s
> >3rd run: 0m24.099s
> >
> >At this point, I killed all gluster services on one of the 4 servers and
> >verified that the brick went offline.
> >
> >1st run: 0m38.131s
> >2nd run: 0m19.369s
> >3rd run: 0m23.576s
> >
> >Nothing conclusive really IMO.
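> >
> >(For anyone following along: one way to confirm a brick is actually down
> >is the volume status output - the killed brick should show Online "N"
> >and no PID. The find path below is just a placeholder for the 40k-file
> >dir on the fuse mount.)
> >
> >gluster volume status SNIP_data1
> >time find /path/to/fuse-mounted/dir | wc -l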
> >
> >Sincerely,
> >Artem
> >
> >--
> >Founder, Android Police <http://www.androidpolice.com>, APK Mirror
> ><http://www.apkmirror.com/>, Illogical Robot LLC
> >beerpla.net | @ArtemR <http://twitter.com/ArtemR>
> >
> >
> >On Thu, Apr 30, 2020 at 9:55 AM Strahil Nikolov <hunter86_bg at yahoo.com>
> >wrote:
> >
> >> On April 30, 2020 6:27:10 PM GMT+03:00, Artem Russakovskii <
> >> archon810 at gmail.com> wrote:
> >> >Hi Strahil, in the original email I included both the times for the
> >> >first and subsequent reads on the fuse mounted gluster volume as well
> >> >as the xfs filesystem the gluster data resides on (this is the brick,
> >> >right?).
> >> >
> >> >On Thu, Apr 30, 2020, 7:44 AM Strahil Nikolov <hunter86_bg at yahoo.com>
> >> >wrote:
> >> >
> >> >> On April 30, 2020 4:24:23 AM GMT+03:00, Artem Russakovskii <
> >> >> archon810 at gmail.com> wrote:
> >> >> >Hi all,
> >> >> >
> >> >> >We have 500GB and 10TB 4x1 replicate xfs-based gluster volumes, and
> >> >> >the 10TB one especially is extremely slow to do certain things with
> >> >> >(and has been since gluster 3.x when we started). We're currently
> >> >> >on 5.13.
> >> >> >
> >> >> >The number of files isn't even what I'd consider that great - under
> >> >> >100k per dir.
> >> >> >
> >> >> >Here are some numbers to look at:
> >> >> >
> >> >> >On gluster volume in a dir of 45k files:
> >> >> >The first time
> >> >> >
> >> >> >time find | wc -l
> >> >> >45423
> >> >> >real    8m44.819s
> >> >> >user    0m0.459s
> >> >> >sys     0m0.998s
> >> >> >
> >> >> >And again
> >> >> >
> >> >> >time find | wc -l
> >> >> >45423
> >> >> >real    0m34.677s
> >> >> >user    0m0.291s
> >> >> >sys     0m0.754s
> >> >> >
> >> >> >
> >> >> >If I run the same operation on the xfs block device itself:
> >> >> >The first time
> >> >> >
> >> >> >time find | wc -l
> >> >> >45423
> >> >> >real    0m13.514s
> >> >> >user    0m0.144s
> >> >> >sys     0m0.501s
> >> >> >
> >> >> >And again
> >> >> >
> >> >> >time find | wc -l
> >> >> >45423
> >> >> >real    0m0.197s
> >> >> >user    0m0.088s
> >> >> >sys     0m0.106s
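> >> >> >
> >> >> >(Side note: the first-run vs. repeat-run gap above is mostly cache
> >> >> >warm-up. To get reproducible cold numbers, the kernel caches can be
> >> >> >dropped between runs, roughly like this, as root:)
> >> >> >
> >> >> >sync
> >> >> >echo 3 > /proc/sys/vm/drop_caches   # drop page cache + dentries/inodes
> >> >> >time find | wc -l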
> >> >> >
> >> >> >
> >> >> >I'd expect a performance difference here but just as it was several
> >> >> >years ago when we started with gluster, it's still huge, and simple
> >> >> >file listings are incredibly slow.
> >> >> >
> >> >> >At the time, the team was looking to do some optimizations, but I'm
> >> >> >not sure this has happened.
> >> >> >
> >> >> >What can we do to try to improve performance?
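> >> >> >
> >> >> >(In case it's useful for whoever looks at this, the per-FOP
> >> >> >latencies during the listing can be captured with something like:)
> >> >> >
> >> >> >gluster volume profile SNIP_data1 start
> >> >> >time find | wc -l   # run the slow listing on the fuse mount
> >> >> >gluster volume profile SNIP_data1 info
> >> >> >gluster volume profile SNIP_data1 stop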
> >> >> >
> >> >> >Thank you.
> >> >> >
> >> >> >
> >> >> >
> >> >> >Some setup values follow.
> >> >> >
> >> >> >xfs_info /mnt/SNIP_block1
> >> >> >meta-data=/dev/sdc               isize=512    agcount=103, agsize=26214400 blks
> >> >> >         =                       sectsz=512   attr=2, projid32bit=1
> >> >> >         =                       crc=1        finobt=1, sparse=0, rmapbt=0
> >> >> >         =                       reflink=0
> >> >> >data     =                       bsize=4096   blocks=2684354560, imaxpct=25
> >> >> >         =                       sunit=0      swidth=0 blks
> >> >> >naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
> >> >> >log      =internal log           bsize=4096   blocks=51200, version=2
> >> >> >         =                       sectsz=512   sunit=0 blks, lazy-count=1
> >> >> >realtime =none                   extsz=4096   blocks=0, rtextents=0
> >> >> >
> >> >> >Volume Name: SNIP_data1
> >> >> >Type: Replicate
> >> >> >Volume ID: SNIP
> >> >> >Status: Started
> >> >> >Snapshot Count: 0
> >> >> >Number of Bricks: 1 x 4 = 4
> >> >> >Transport-type: tcp
> >> >> >Bricks:
> >> >> >Brick1: nexus2:/mnt/SNIP_block1/SNIP_data1
> >> >> >Brick2: forge:/mnt/SNIP_block1/SNIP_data1
> >> >> >Brick3: hive:/mnt/SNIP_block1/SNIP_data1
> >> >> >Brick4: citadel:/mnt/SNIP_block1/SNIP_data1
> >> >> >Options Reconfigured:
> >> >> >cluster.quorum-count: 1
> >> >> >cluster.quorum-type: fixed
> >> >> >network.ping-timeout: 5
> >> >> >network.remote-dio: enable
> >> >> >performance.rda-cache-limit: 256MB
> >> >> >performance.readdir-ahead: on
> >> >> >performance.parallel-readdir: on
> >> >> >network.inode-lru-limit: 500000
> >> >> >performance.md-cache-timeout: 600
> >> >> >performance.cache-invalidation: on
> >> >> >performance.stat-prefetch: on
> >> >> >features.cache-invalidation-timeout: 600
> >> >> >features.cache-invalidation: on
> >> >> >cluster.readdir-optimize: on
> >> >> >performance.io-thread-count: 32
> >> >> >server.event-threads: 4
> >> >> >client.event-threads: 4
> >> >> >performance.read-ahead: off
> >> >> >cluster.lookup-optimize: on
> >> >> >performance.cache-size: 1GB
> >> >> >cluster.self-heal-daemon: enable
> >> >> >transport.address-family: inet
> >> >> >nfs.disable: on
> >> >> >performance.client-io-threads: on
> >> >> >cluster.granular-entry-heal: enable
> >> >> >cluster.data-self-heal-algorithm: full
> >> >> >
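> >> >> >(The list above is only the reconfigured options; the effective
> >> >> >values of the readdir/md-cache related knobs can be dumped with
> >> >> >something like the following, in case any of them look off:)
> >> >> >
> >> >> >gluster volume get SNIP_data1 all | grep -iE 'readdir|md-cache|lookup'
> >> >> >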
> >> >> >Sincerely,
> >> >> >Artem
> >> >> >
> >> >> >--
> >> >> >Founder, Android Police <http://www.androidpolice.com>, APK Mirror
> >> >> ><http://www.apkmirror.com/>, Illogical Robot LLC
> >> >> >beerpla.net | @ArtemR <http://twitter.com/ArtemR>
> >> >>
> >> >> Hi Artem,
> >> >>
> >> >> Have you checked the same on brick level? How big is the difference?
> >> >>
> >> >> Best Regards,
> >> >> Strahil Nikolov
> >> >>
> >>
> >> Hi Artem,
> >>
> >> My bad, I missed the 'xfs' word... Still, the difference is huge.
> >>
> >> May I ask you to do a test again (pure curiosity) as follows:
> >> 1. Repeat the test from before
> >> 2. Stop 1 brick and test again (one way to do that is sketched below).
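> >>
> >> One way to stop just that one brick (rather than all gluster services
> >> on the node) is to kill the brick process whose PID shows up in the
> >> status output, e.g.:
> >>
> >> gluster volume status SNIP_data1      # note the PID of one brick
> >> kill <PID of that brick process>
> >>
> >> and bring it back afterwards with:
> >>
> >> gluster volume start SNIP_data1 force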
> >>
> >>
> >> P.S.: You can try it on the test cluster
> >>
> >> Best Regards,
> >> Strahil Nikolov
> >>
>
> Hi Artem,
>
> I was wondering if the 4th replica is adding additional overhead (another
> dir to check), but the test is not very conclusive.
>
>
> Actually, the 'anomalies' log entries in your pool could be a symptom of
> another problem (just like the long listing time).
>
> I will try to reproduce your setup (smaller scale - 1 brick, 50k files)
> and then will try with 3 bricks.
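>
> (Something like this should do for creating the 50k files - /mnt/testvol
> below is just a placeholder for the test volume's fuse mount point:)
>
> mkdir /mnt/testvol/manyfiles
> cd /mnt/testvol/manyfiles && seq -f 'file%06g' 1 50000 | xargs touch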
>
>
> Best Regards,
> Strahil Nikolov
>