[Gluster-users] Bad perf for small files on large EC volume

Ingard Mevåg ingard at jotta.no
Tue May 9 07:27:15 UTC 2017


You're not counting wrong. We won't necessarily transfer all of these files
to one volume though. It was more an example of the distribution of file
sizes.
But as you say, healing might be a problem. Then again, this is archive
storage: we're after the highest possible capacity, not necessarily
performance.
If you take a look at the profile output you'll see that MKDIR, CREATE and
XATTROP are the operations with the highest latency, and I guess that is due
to the number of bricks (180)?
But I thought that number wouldn't be too high to still get at least somewhat
higher throughput?
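
(For reference, per-FOP latency numbers like the MKDIR/CREATE/XATTROP ones
mentioned above come from gluster's built-in profiling, captured roughly like
this; the volume name below is just a placeholder:)

  gluster volume profile dfs-archive-001 start
  # ... run the workload ...
  gluster volume profile dfs-archive-001 info   # per-brick, per-FOP latency stats
  gluster volume profile dfs-archive-001 stop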

ingard

2017-05-08 15:19 GMT+02:00 Serkan Çoban <cobanserkan at gmail.com>:

> There are 300M files, right? Or am I counting wrong?
> With that file profile I would never use EC in the first place.
> Maybe you can pack the files into tar archives or similar before
> migrating to gluster?
> It will take ages to heal a drive with that file count...
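
(A minimal sketch of that tar-packing idea; the paths here are purely
hypothetical, pack each source directory into one archive before copying it
onto the gluster mount:)

  # /source/data and /mnt/dfs-archive-001 are placeholders
  for d in /source/data/*/; do
      tar -cf /mnt/dfs-archive-001/"$(basename "$d")".tar -C "$d" .
  done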
>
> On Mon, May 8, 2017 at 3:59 PM, Ingard Mevåg <ingard at jotta.no> wrote:
> > With attachments :)
> >
> > 2017-05-08 14:57 GMT+02:00 Ingard Mevåg <ingard at jotta.no>:
> >>
> >> Hi
> >>
> >> We've got 3 servers with 60 drives each, set up with an EC volume running
> >> on gluster 3.10.0. The servers are connected via 10GbE.
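
(The exact disperse/redundancy layout is in the attached volume config; purely
as an illustration of the general shape, a dispersed volume is created with a
command along these lines, where the counts, hostnames and brick paths are
placeholders:)

  # 6-brick disperse set (4 data + 2 redundancy) spread over 3 servers;
  # gluster warns about multiple bricks per server, hence "force"
  gluster volume create dfs-archive-001 disperse 6 redundancy 2 \
      server{1..3}:/bricks/brick{1..2}/data force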
> >>
> >> We've done the changes recommended here:
> >> https://bugzilla.redhat.com/show_bug.cgi?id=1349953#c17 and we're able to
> >> max out the network with the iozone tests referenced in the same ticket.
> >>
> >> However for small files we are getting 3-5 MB/s with the smallfile_cli.py
> >> tool. For instance:
> >> python smallfile_cli.py --operation create --threads 32 --file-size 100
> >> --files 1000 --top /tmp/dfs-archive-001/
> >> .
> >> .
> >> total threads = 32
> >> total files = 31294
> >> total data =     2.984 GB
> >>  97.79% of requested files processed, minimum is  90.00
> >> 785.542908 sec elapsed time
> >> 39.837416 files/sec
> >> 39.837416 IOPS
> >> 3.890373 MB/sec
> >> .
> >>
> >> We're going to use these servers for archive purposes, so the files will
> >> be moved there and accessed very little. After noticing our migration tool
> >> performing very badly, we did some analyses on the data actually being
> >> moved (a generic sketch of such a breakdown follows the listing below):
> >>
> >> Bucket 31808791 (16.27 GB) :: 0 bytes - 1.00 KB
> >> Bucket 49448258 (122.89 GB) :: 1.00 KB - 5.00 KB
> >> Bucket 13382242 (96.92 GB) :: 5.00 KB - 10.00 KB
> >> Bucket 13557684 (195.15 GB) :: 10.00 KB - 20.00 KB
> >> Bucket 22735245 (764.96 GB) :: 20.00 KB - 50.00 KB
> >> Bucket 15101878 (1041.56 GB) :: 50.00 KB - 100.00 KB
> >> Bucket 10734103 (1558.35 GB) :: 100.00 KB - 200.00 KB
> >> Bucket 17695285 (5773.74 GB) :: 200.00 KB - 500.00 KB
> >> Bucket 13632394 (10039.92 GB) :: 500.00 KB - 1.00 MB
> >> Bucket 21815815 (32641.81 GB) :: 1.00 MB - 2.00 MB
> >> Bucket 36940815 (117683.33 GB) :: 2.00 MB - 5.00 MB
> >> Bucket 13580667 (91899.10 GB) :: 5.00 MB - 10.00 MB
> >> Bucket 10945768 (232316.33 GB) :: 10.00 MB - 50.00 MB
> >> Bucket 1723848 (542581.89 GB) :: 50.00 MB - 9223372036.85 GB
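
(A breakdown in the same shape as the bucket listing above can be produced
with something along these lines; the path and bucket edges below are
placeholders, and this is not the actual analysis tool:)

  find /source/data -type f -printf '%s\n' | awk '
      function bucket(sz) {
          if (sz < 1024)         return "0 bytes - 1.00 KB"
          if (sz < 5 * 1024)     return "1.00 KB - 5.00 KB"
          if (sz < 100 * 1024)   return "5.00 KB - 100.00 KB"
          if (sz < 1024 * 1024)  return "100.00 KB - 1.00 MB"
          if (sz < 50 * 1048576) return "1.00 MB - 50.00 MB"
          return "50.00 MB and up"
      }
      { b = bucket($1); count[b]++; bytes[b] += $1 }
      END {
          for (b in count)
              printf "Bucket %d (%.2f GB) :: %s\n", count[b], bytes[b] / 2^30, b
      }'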
> >>
> >> So it turns out we've got a very large number of very small files being
> >> written to this volume.
> >> I've attached the volume config and 2 profiling runs, so if someone wants
> >> to take a look and maybe give us some hints in terms of what volume
> >> settings will be best for writing a lot of small files, that would be much
> >> appreciated.
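
(For context, the small-file tuning advice that typically comes up for gluster
revolves around a handful of volume options; the values below are illustrative
only, not recommendations tested against this particular volume, and the
volume name is a placeholder:)

  gluster volume set dfs-archive-001 cluster.lookup-optimize on
  gluster volume set dfs-archive-001 client.event-threads 4
  gluster volume set dfs-archive-001 server.event-threads 4
  gluster volume set dfs-archive-001 performance.client-io-threads on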
> >>
> >> kind regards
> >> ingard
> >
>