[Gluster-users] Improving Gluster performance through more hardware.
Ben Turner
bturner at redhat.com
Thu May 7 23:06:43 UTC 2015
----- Original Message -----
> From: "Ernie Dunbar" <maillist at lightspeed.ca>
> To: "Gluster Users" <gluster-users at gluster.org>
> Sent: Thursday, May 7, 2015 2:36:08 PM
> Subject: [Gluster-users] Improving Gluster performance through more hardware.
>
> Hi all.
>
> First, I have a specific question about what hardware should be used for
> Gluster, then after that I have a question about how Gluster does its
> multithreading/hyperthreading.
>
> So, we have a new Gluster cluster (currently, two servers with one
> "replicated" volume) serving up our e-mail, which has for years been
> stored in Maildir format. That works pretty well except for
> the few clients who store all their old mail on our server, and their
> "cur" folder contains a few tens of thousands of messages. As others
> have noticed, this isn't something that Gluster handles well. But we
> value high availability and redundancy more than we value speed, and
> we don't yet have a large enough cluster to justify software that
> requires a metadata server, so we've gone with Gluster. That doesn't
> mean we don't need better performance, though.
>
> So I've noticed that the resource Gluster consumes most heavily in our
> use case isn't network or disk - both remain *well* under full
> utilization - but CPU cycles. I can easily test this
> by running `ls -l` in a folder with ~20,000 files in it, and I see CPU
> usage by glusterfsd jump to between 40-200%. The glusterfs process
> usually stays around 20-30%.
>
> Both of our Gluster servers are gen III Dell 2950's with dual Xeon
> E5345's (quad-core, 2.33 GHz CPUs) in them, so we have 8 CPUs total to
> deal with this load. So far, we're only using a single mail server, but
> we'll be migrating to a load-balanced pair very soon. So my guess is
> that we can reduce the latency that's very noticeable in our webmail by
> upgrading to the fastest CPUs the 2950's can hold, evidently a 3.67 GHz
> quad-core.
>
> It would be nice to know what other users have experienced with this
> kind of upgrade, or whether they've gotten better performance from other
> hardware upgrades.
>
> Which leads to my second question. Does glusterfsd spawn multiple
> threads to handle other requests made of it? I don't see any evidence of
> this in the `top` program, but other clients don't notice at all that
> I'm running up the CPU usage with my one `ls` process. Smaller mail
> accounts can read their mail just as quickly as if the system were at
> near-idle while this operation is in progress. It's also hard for me to
> test this with only one mail server attached to the Gluster cluster. I
> can't tell if the additional load from 20 or 100 other servers makes any
> difference to CPU usage, but we want to know what performance we can
> expect should we expand that far, and whether the answer is throwing
> more CPUs at the problem or throwing faster ones at it.
A lot of what you are seeing is being addressed by:
http://www.gluster.org/community/documentation/index.php/Features/Feature_Smallfile_Perf
Specifically:
http://www.gluster.org/community/documentation/index.php/Features/Feature_Smallfile_Perf#multi-thread-epoll
In the past the single event listener thread would peg out a CPU (the "hot thread"), and until multi-threaded (MT) epoll, throwing more CPUs at the problem wouldn't help much:
"Previously, epoll thread did socket even-handling and the same thread was used for serving the client or processing the response received from the server. Due to this, other requests were in a queue until the current epoll thread completed its operation. With multi-threaded epoll, events are distributed that improves the performance due the parallel processing of requests/responses received."
Here are the guidelines for tuning them:
https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Administration_Guide/Small_File_Performance_Enhancements.html
Server and client event threads are available in 3.7, and more improvements are in the pipeline. I would start with 4 of each and do some tuning to see what fits your workload best.
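For example, on 3.7 the knobs look like this (the volume name "mailvol" here is just a placeholder; substitute your own):

# gluster volume set mailvol server.event-threads 4
# gluster volume set mailvol client.event-threads 4

The admin guide linked above has guidelines for picking values relative to your core count.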
I just ran a test where I created ~300 GB worth of 64 KB files. On the 3.7 beta I got:
4917.50 files / second across 4 clients mounting a 2x2 dist rep volume.
The same test on 3.6 with the same hardware:
2069.28 files / second across 4 clients mounting a 2x2 dist rep volume.
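For scale: ~300 GB of 64 KB files is roughly 4.9 million files, so that works out to about 17 minutes of create time on 3.7 versus about 40 minutes on 3.6.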
I was running smallfile:
http://www.gluster.org/community/documentation/index.php/Performance_Testing#smallfile_Distributed_I.2FO_Benchmark
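A run along these lines should reproduce that kind of workload (the mount point and thread/file counts here are just placeholder values; check ./smallfile_cli.py --help on your copy):

# ./smallfile_cli.py --operation create --threads 8 \
      --file-size 64 --files 10000 --top /mnt/glustervol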
To confirm you are hitting the hot thread I suggest running the benchmark of your choice (I like smallfile for this) and, on the brick servers, running:
# top -H
If you see one of the gluster threads at 100% CPU then you are probably hitting the hot event thread issue that MT epoll addresses. Here is what my top -H list looks like during a smallfile run:
Tasks: 640 total,   3 running, 637 sleeping,   0 stopped,   0 zombie
Cpu(s): 12.8%us, 11.1%sy,  0.0%ni, 74.3%id,  0.0%wa,  0.0%hi,  1.7%si,  0.0%st
Mem:  49544600k total,  5809544k used, 43735056k free,     6344k buffers
Swap: 24772604k total,        0k used, 24772604k free,  4380832k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3278 root      20   0 2005m  90m 4252 R 65.4  0.2   4:45.85 glusterfsd
 4155 root      20   0 2005m  90m 4252 S 64.0  0.2   4:32.96 glusterfsd
 4156 root      20   0 2005m  90m 4252 R 64.0  0.2   4:19.60 glusterfsd
 3277 root      20   0 2005m  90m 4252 S 63.7  0.2   4:45.19 glusterfsd
 4224 root      20   0 2005m  90m 4252 S 26.7  0.2   1:54.49 glusterfsd
 6106 root      20   0 2005m  90m 4252 S 26.4  0.2   0:46.62 glusterfsd
 4194 root      20   0 2005m  90m 4252 S 25.4  0.2   1:58.92 glusterfsd
 4222 root      20   0 2005m  90m 4252 S 25.4  0.2   1:53.72 glusterfsd
 4051 root      20   0 2005m  90m 4252 S 24.4  0.2   2:08.99 glusterfsd
 3647 root      20   0 2005m  90m 4252 S 24.1  0.2   2:07.82 glusterfsd
 3280 root      20   0 2005m  90m 4252 S 23.4  0.2   2:13.00 glusterfsd
 4223 root      20   0 2005m  90m 4252 S 23.1  0.2   1:53.21 glusterfsd
 4227 root      20   0 2005m  90m 4252 S 23.1  0.2   1:54.60 glusterfsd
 4226 root      20   0 2005m  90m 4252 S 22.4  0.2   1:54.64 glusterfsd
 6107 root      20   0 2005m  90m 4252 S 22.1  0.2   0:46.16 glusterfsd
 6108 root      20   0 2005m  90m 4252 S 22.1  0.2   0:46.07 glusterfsd
 4052 root      20   0 2005m  90m 4252 S 21.5  0.2   2:08.35 glusterfsd
 4053 root      20   0 2005m  90m 4252 S 21.1  0.2   2:08.40 glusterfsd
 4195 root      20   0 2005m  90m 4252 S 20.8  0.2   1:58.29 glusterfsd
 4225 root      20   0 2005m  90m 4252 S 20.5  0.2   1:53.36 glusterfsd
 3286 root      20   0 2005m  90m 4252 S  7.9  0.2   0:43.18 glusterfsd
 2817 root      20   0     0    0    0 S  1.3  0.0   0:02.18 xfslogd/1
 2757 root      20   0     0    0    0 S  0.7  0.0   0:42.58 dm-thin
 2937 root      20   0     0    0    0 S  0.7  0.0   0:01.80 xfs-cil/dm-6
10039 root      20   0 15536 1692  932 R  0.7  0.0   0:00.17 top
  155 root      20   0     0    0    0 S  0.3  0.0   0:00.27 kblockd/1
 3283 root      20   0 2005m  90m 4252 S  0.3  0.2   0:00.08 glusterfsd
    1 root      20   0 19356 1472 1152 S  0.0  0.0   0:03.08 init
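If you would rather have a one-shot, sortable snapshot than a live top screen, something like this works too (assuming a single brick process, found via pgrep):

# ps -Lp $(pgrep -o glusterfsd) -o tid,pcpu,comm --sort=-pcpu | head

Any single thread pinned near 100% there points at the same hot event thread.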
HTH!
-b
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>