[Gluster-users] Finding my bottle neck

Renaud Fortier Renaud.Fortier at fsaa.ulaval.ca
Wed Dec 19 14:19:34 UTC 2018


Our workload is mostly reads of small files that don't change much over time (web), so the biggest improvement for us was switching to NFS-Ganesha. Its caching helps us a lot in getting comfortable performance: the first read is slow, but subsequent reads are fast. You could give it a try if you haven't already.
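For anyone who wants to try this, a minimal NFS-Ganesha export for a Gluster volume looks roughly like the sketch below. The hostname "gluster1.example.com" and the volume name "webvol" are placeholders, and Access_Type = RO matches a read-mostly web workload; adjust to your setup.

```
EXPORT {
    Export_Id = 1;               # unique ID for this export
    Path = "/webvol";            # export path
    Pseudo = "/webvol";          # NFSv4 pseudo-filesystem path
    Access_Type = RO;            # read-only suits a read-mostly web workload
    Squash = No_root_squash;

    FSAL {
        Name = GLUSTER;                       # Gluster FSAL backend
        Hostname = "gluster1.example.com";    # placeholder: one of your Gluster servers
        Volume = "webvol";                    # placeholder: your volume name
    }
}
```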

Renaud


-----Original Message-----
From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On Behalf Of Diego Remolina
Sent: 19 December 2018 08:22
To: gluster-users at gluster.org List <gluster-users at gluster.org>
Objet : Re: [Gluster-users] Finding my bottle neck

You have encountered a (if not "the") major flaw in GlusterFS: it is not very good at dealing with lots of small files.

There are some tunables in gluster that may help a bit, but you will *not* get the same speeds as raw direct-attached storage without clustering, or even close to it. IIRC, this is because the client has to stat the files on each of the bricks, and this adds latency.

SSDs will help some, but *not* dramatically, as the major slowdown is in checking all bricks.

What version of Gluster are you using?

If you really need great small file performance, I recommend you look elsewhere (listed in order of performance):

1. DRBD has been by far the best performing option for small files in my tests, but it is more complex to set up than Gluster or MooseFS. DRBD did not support more than 2 servers until version 9, and they have recently changed their management system, so there may be a steep learning curve.

2. MooseFS/LizardFS: I have been playing with MooseFS and find it much better than GlusterFS for dealing with lots of small files. It is almost as easy to set up as Gluster, compared with the higher complexity of DRBD.
However, their stable 3.x release series does not include free HA (i.e.
automated failover with multiple masters). If you want HA/failover, you have to purchase their "pro" edition (no idea on pricing). They had said that version 4.x, with a free HA component, would be released in fall
2018, but it has not been released yet, so unless you are willing to pay for 3 Pro, you need to go elsewhere.

------------------------------

If you decide to stick with gluster, you can try some of the small-file performance optimization changes. Things will improve a bit, but in my experience not to the level of DRBD or MooseFS:

https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3/html/administration_guide/small_file_performance_enhancements
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.1/html/administration_guide/small_file_performance_enhancements
Some small-file performance options available since Gluster 3.9:
https://stackoverflow.com/questions/42343391/how-can-i-improve-glusterfs-performance-with-small-files
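As a starting point, the small-file tunables from the links above boil down to a handful of `gluster volume set` commands. This is a sketch only: "myvol" is a placeholder volume name, the values are commonly suggested defaults rather than tuned numbers, and the minimum Gluster version for each option varies.

```shell
# Placeholder volume name "myvol" -- substitute your own.

# Enable the metadata-cache option group (md-cache tuning in one shot):
gluster volume set myvol group metadata-cache

# Keep more inodes cached so metadata stays warm:
gluster volume set myvol network.inode-lru-limit 200000

# Avoid lookups on bricks that cannot hold the file:
gluster volume set myvol cluster.lookup-optimize on

# Speed up directory listings (Gluster 3.10+):
gluster volume set myvol performance.parallel-readdir on
```

These are configuration commands against a live volume, so test them on a non-production volume first; any one of them can be reverted with `gluster volume reset myvol <option>`.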

HTH,

Diego

On Tue, Dec 18, 2018 at 10:36 PM csirotic <csirotic at gmail.com> wrote:
>
> Hi,
> I am new to using gluster and I am running some tests right now. I am fairly inexperienced as well, so it's a good learning experience for me.
>
> My problem right now is small-file create IOPS, measured with the smallfile benchmark: I cannot get more than 800 files/second with 4k files.
>
> My setup is fairly simple.
> I have 4 servers.
> The first 3 servers each have one brick, in a three-way replicated volume.
> Server 4 simply mounts the volume using the native FUSE client.
>
> The first three servers all have the same hardware: common Supermicro servers with a RAID 6 array of 8 x 6 TB HGST 7200 RPM drives.
> If I test smallfile directly on the brick location, I get very high results.
>
> For the networking part, the 4 servers are on 10 GbE. iperf3 gives me a steady 10 Gbit/s when I test between all the servers.
>
> When I transfer large .qcow files from the client server over the FUSE mount, I get around 150 MB/s, which is not low, but is not great either.
>
> What would you look at first?
> The options I am pondering include buying SSDs to use as cache on each server.
> Also, it seems to me that having only 3-way replication, instead of a 2+2 setup, is really hurting.
> Any other tests that could help my process?
>
> Any input is much appreciated.
> Thank you.
>
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users

