[Gluster-users] Bad perf for small files on large EC volume

Pranith Kumar Karampuri pkarampu at redhat.com
Thu May 18 04:32:22 UTC 2017


On Tue, May 9, 2017 at 12:57 PM, Ingard Mevåg <ingard at jotta.no> wrote:

> You're not counting wrong. We won't necessarily transfer all of these
> files to one volume though. It was more an example of the distribution of
> file sizes.
> But as you say, healing might be a problem; then again, this is archive
> storage. We're after the highest possible capacity, not necessarily
> performance.
> If you take a look at the profile output, you'll see that MKDIR, CREATE and
> XATTROP are the operations with the highest latency, and I guess that is
> due to the number of bricks? (180)
> But I thought that number wouldn't be too high to get at least a little
> higher throughput?
>

MKDIR is most likely taking a long time because the brick process is slow
to execute the underlying syscall. You will have to figure out why that is
the case. There are healing enhancements planned that should improve
performance incrementally, release by release. We will take note of this
one. Thanks for the input.
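
Since you already have profile data, one quick way to narrow this down (a
rough suggestion; replace <volname> with your volume name) is to pull out
just the per-brick MKDIR lines and check whether one brick stands out or
all 180 look similar:

gluster volume profile <volname> info | grep -E 'Brick:|MKDIR'

If a single brick stands out, that disk or its filesystem is the first
thing to strace (see below); if they all look alike, the latency is coming
from the bricks in general rather than from one bad disk.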

In our labs we use "strace -ff -T -p <pid-of-brick> -o
<path-to-the-file-where-you-want-the-output-saved>" to gather data about
how long the syscalls take and to inspect why some syscalls are slow. In
most cases we find that the filesystem is configured wrong or something is
wrong with the disk. Please note that this slows things down quite badly,
but so far it has always found the reason for the problem. Since this is
in production, I would identify the exact test that triggers the slowdown,
run strace only for the duration needed to recreate the problem, and stop
it right after collecting the data.
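
As a starting point for digesting those traces, below is a minimal sketch
(a hypothetical helper, not something that ships with gluster; it assumes
the trace was written with the options above to a prefix such as
/tmp/brick.trace, so the per-thread files are /tmp/brick.trace.<tid>) that
sums up per-syscall latency across all of the files:

#!/usr/bin/env python
# Minimal sketch: summarize per-syscall latency from strace -ff -T output.
# Assumes "strace -ff -T -p <pid-of-brick> -o /tmp/brick.trace", which writes
# one file per thread (e.g. /tmp/brick.trace.12345); -T appends the time
# spent in each syscall as "<seconds>" at the end of every line.
import glob
import re
import sys
from collections import defaultdict

LINE = re.compile(r'^(\w+)\(.*<(\d+\.\d+)>\s*$')  # name(...) = ret <duration>

def summarize(prefix):
    stats = defaultdict(lambda: [0, 0.0, 0.0])    # name -> [count, total, max]
    for path in glob.glob(prefix + '*'):
        with open(path) as trace:
            for line in trace:
                m = LINE.match(line)
                if not m:                         # skips "<... resumed>" etc.
                    continue
                name, secs = m.group(1), float(m.group(2))
                entry = stats[name]
                entry[0] += 1
                entry[1] += secs
                entry[2] = max(entry[2], secs)
    return stats

if __name__ == '__main__':
    prefix = sys.argv[1] if len(sys.argv) > 1 else '/tmp/brick.trace'
    for name, (count, total, worst) in sorted(summarize(prefix).items(),
                                              key=lambda kv: -kv[1][1]):
        print('%-12s calls=%-7d total=%9.3fs avg=%.6fs max=%.6fs'
              % (name, count, total, total / count, worst))

If calls like mkdir() or fsync() dominate the totals with large max values,
that points at the disk or the brick filesystem rather than at gluster
itself.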


> ingard
>
> 2017-05-08 15:19 GMT+02:00 Serkan Çoban <cobanserkan at gmail.com>:
>
>> There are 300M files, right? I am not counting wrong?
>> With that file profile I would never use EC in the first place.
>> Maybe you can pack the files into tar archives or similar before
>> migrating to gluster?
>> It will take ages to heal a drive with that file count...
>>
>> On Mon, May 8, 2017 at 3:59 PM, Ingard Mevåg <ingard at jotta.no> wrote:
>> > With attachments :)
>> >
>> > 2017-05-08 14:57 GMT+02:00 Ingard Mevåg <ingard at jotta.no>:
>> >>
>> >> Hi
>> >>
>> >> We've got 3 servers with 60 drives each, set up as an EC volume
>> >> running on gluster 3.10.0.
>> >> The servers are connected via 10GbE.
>> >>
>> >> We've done the changes recommended here:
>> >> https://bugzilla.redhat.com/show_bug.cgi?id=1349953#c17 and we're
>> >> able to max out the network with the iozone tests referenced in the
>> >> same ticket.
>> >>
>> >> However, for small files we are getting 3-5 MB/s with the
>> >> smallfile_cli.py tool. For instance:
>> >> python smallfile_cli.py --operation create --threads 32 --file-size 100
>> >> --files 1000 --top /tmp/dfs-archive-001/
>> >> .
>> >> .
>> >> total threads = 32
>> >> total files = 31294
>> >> total data =     2.984 GB
>> >>  97.79% of requested files processed, minimum is  90.00
>> >> 785.542908 sec elapsed time
>> >> 39.837416 files/sec
>> >> 39.837416 IOPS
>> >> 3.890373 MB/sec
>> >> .
>> >>
>> >> We're going to use these servers for archive purposes, so the files
>> >> will be moved there and accessed very little. After noticing our
>> >> migration tool performing very badly, we did some analysis of the
>> >> data actually being moved:
>> >>
>> >> Bucket 31808791 (16.27 GB) :: 0 bytes - 1.00 KB
>> >> Bucket 49448258 (122.89 GB) :: 1.00 KB - 5.00 KB
>> >> Bucket 13382242 (96.92 GB) :: 5.00 KB - 10.00 KB
>> >> Bucket 13557684 (195.15 GB) :: 10.00 KB - 20.00 KB
>> >> Bucket 22735245 (764.96 GB) :: 20.00 KB - 50.00 KB
>> >> Bucket 15101878 (1041.56 GB) :: 50.00 KB - 100.00 KB
>> >> Bucket 10734103 (1558.35 GB) :: 100.00 KB - 200.00 KB
>> >> Bucket 17695285 (5773.74 GB) :: 200.00 KB - 500.00 KB
>> >> Bucket 13632394 (10039.92 GB) :: 500.00 KB - 1.00 MB
>> >> Bucket 21815815 (32641.81 GB) :: 1.00 MB - 2.00 MB
>> >> Bucket 36940815 (117683.33 GB) :: 2.00 MB - 5.00 MB
>> >> Bucket 13580667 (91899.10 GB) :: 5.00 MB - 10.00 MB
>> >> Bucket 10945768 (232316.33 GB) :: 10.00 MB - 50.00 MB
>> >> Bucket 1723848 (542581.89 GB) :: 50.00 MB - 9223372036.85 GB
>> >>
>> >> So it turns out we've got a very large number of very small files
>> >> being written to this volume.
>> >> I've attached the volume config and two profiling runs. If someone
>> >> wants to take a look and give us some hints on which volume settings
>> >> would be best for writing a lot of small files, that would be much
>> >> appreciated.
>> >>
>> >> kind regards
>> >> ingard
>> >
>> >
>> >
>> >
>> > _______________________________________________
>> > Gluster-users mailing list
>> > Gluster-users at gluster.org
>> > http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>



-- 
Pranith