[Gluster-users] Improving Gluster performance through more hardware.
Ernie Dunbar
maillist at lightspeed.ca
Thu May 7 18:36:08 UTC 2015
Hi all.
First, I have a specific question about what hardware should be used for
Gluster, and then a question about how Gluster handles
multithreading/hyperthreading.
So, we have a new Gluster cluster (currently, two servers with one
"replicated" volume) serving up our files for e-mail, which has for
years been stored in Maildir format. That works pretty well except for
the few clients who store all their old mail on our server, and their
"cur" folder contains a few tens of thousands of messages. As others
have noticed, this isn't something that Gluster handles well. But we
value high availability and redundancy more than we value speed, and we
don't yet have a large enough cluster to justify going with software that
requires a metadata server. So we're going with Gluster as a result of
this. That doesn't mean we don't need better performance though.
So I've noticed that the resource that Gluster consumes the most in our
use case isn't network or disk bandwidth - both of which remain
*well* under full utilization - but CPU cycles. I can easily test this
by running `ls -l` in a folder with ~20,000 files in it, and I see CPU
usage by glusterfsd jump to anywhere from 40% to 200%. The glusterfs
process usually stays around 20-30%.
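
The `ls -l` test above can be made repeatable with a small script. This is
only a minimal sketch, not part of the original setup: it assumes Python 3
on the client that mounts the volume, and `time_listing` is a hypothetical
helper name. It lists a directory and stats every entry (roughly what
`ls -l` does, minus the sorting and output) and reports the elapsed
wall-clock time.

```python
import os
import time


def time_listing(path):
    """Return (entry_count, seconds) for listing `path` and stat-ing each entry.

    Each entry is stat-ed without following symlinks, which approximates
    the per-file metadata lookups that `ls -l` performs.
    """
    start = time.perf_counter()
    count = 0
    with os.scandir(path) as entries:
        for entry in entries:
            # Force a metadata lookup for every entry in the directory.
            entry.stat(follow_symlinks=False)
            count += 1
    return count, time.perf_counter() - start


if __name__ == "__main__":
    # "." is a placeholder; point this at a large "cur" folder on the mount.
    count, seconds = time_listing(".")
    print(f"{count} entries in {seconds:.3f} s")
```

Running this before and after a hardware change, against the same large
"cur" folder, would give a rough number to compare instead of an impression
of latency. Client-side caching can skew repeated runs, so the first run
after a remount is probably the most telling.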
Both of our Gluster servers are gen III Dell 2950's with dual Xeon
E5345's (quad-core, 2.33 GHz CPUs) in them, so we have 8 cores per server to
deal with this load. So far, we're only using a single mail server, but
we'll be migrating to a load-balanced pair very soon. So my guess is
that we can reduce the latency that's very noticeable in our webmail by
upgrading to the fastest CPUs the 2950's can hold, evidently a 3.67 GHz
quad-core.
It would be nice to know what other users have experienced with this
kind of upgrade, or whether they've gotten better performance from other
hardware upgrades.
Which leads to my second question. Does glusterfsd spawn multiple
threads to handle other requests made of it? I don't see any evidence of
this in the `top` program, but other clients don't notice at all that
I'm running up the CPU usage with my one `ls` process. Smaller mail
accounts can read their mail just as quickly as if the system were at
near-idle while this operation is in progress. It's also hard for me to
test this with only one mail server attached to the Gluster cluster. I
can't tell whether additional load from 20 or 100 other servers would make
any difference to CPU usage, but we want to know what performance we
can expect should we expand that far, and whether throwing more CPUs at
the problem is the answer, or whether we will simply need faster CPUs
in the future.
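
On the threading question, `top` shows one line per process by default, so
worker threads stay hidden unless it is run in threads mode (`top -H`). As a
rough cross-check, the sketch below reads per-thread CPU time straight from
`/proc`. It assumes a Linux server and Python 3; `thread_cpu_seconds` is a
hypothetical helper name, and the PID of glusterfsd has to be looked up
separately (for example with `pidof glusterfsd`).

```python
import os


def thread_cpu_seconds(pid):
    """Return {thread_id: cpu_seconds} for every thread of process `pid`.

    Linux-only: reads /proc/<pid>/task/<tid>/stat, where fields 14 and 15
    are the user and system CPU time in clock ticks.
    """
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    task_dir = f"/proc/{pid}/task"
    usage = {}
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as stat_file:
                data = stat_file.read()
        except (FileNotFoundError, ProcessLookupError):
            continue  # the thread exited while we were scanning
        # The command name (field 2) may contain spaces, so split after
        # the closing parenthesis; the remainder starts at field 3.
        fields = data.rsplit(")", 1)[1].split()
        utime, stime = int(fields[11]), int(fields[12])
        usage[int(tid)] = (utime + stime) / ticks_per_second
    return usage


if __name__ == "__main__":
    # Demonstrated on this script's own PID; substitute the glusterfsd PID.
    for tid, seconds in sorted(thread_cpu_seconds(os.getpid()).items()):
        print(f"thread {tid}: {seconds:.2f} s CPU")
```

If the `ls` test drives up CPU time on only one thread while the others stay
flat, that would point toward faster cores; if the time is spread across
many threads, more cores are more likely to help. This is an interpretation
to verify, not a guarantee of how glusterfsd schedules its work.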