[Gluster-devel] Performance tuning for MySQL

Gordan Bobic gordan at bobich.net
Wed Feb 11 10:52:39 UTC 2009


On Wed, 11 Feb 2009 05:29:43 -0500, David Sickmiller
<david at careerliaison.com> wrote:
> Gordan Bobic wrote:
>> Note that DRBD resync is more efficient - it only resyncs dirty 
>> blocks, which in the case of big databases, can be much faster. 
>> Gluster will copy the whole file.
>
> Thanks for pointing that out; I'll have to think about that.  I had been 
> hoping GlusterFS did some sort of rsync equivalent, although even that 
> would still require reading the whole file locally.

This has been discussed before. You may want to have a search through the
archives. AFAIK, the rsync rolling hash method isn't implemented yet. It
just does a full file copy.

>> Did you flush the caches inbetween the tries? What is your network 
>> connection between the nodes?
>
> I attempted to prime the cache for each measurement.  I shut MySQL down 
> between tries, made the GlusterFS adjustments, restarted MySQL, and ran 
> the queries a few times before recording the stats.

Do (as root):
sync; echo 3 > /proc/sys/vm/drop_caches
before each measurement to achieve more consistent results. The sync first
flushes dirty pages so dropping the caches actually empties them.

> I think the connection is 100M Ethernet.  I'll have to double-check.  
> It's actually a Xen guest, so I'm a bit insulated.  The nodes are on 
> separate physical servers, though, on purpose.

If it's 100Mb, that is likely a major contributory factor to the slowdown
you are seeing, as are any bottlenecking buses. A modern disk is able to
push 100+MB/s on reads, and more than half that on writes (and on small,
bursty writes you'll get bus speeds with write caches and NCQ enabled). A
slowdown of 10x or more for going over a 100Mb network is inevitable.
Plus, you are adding around 25% latency to all I/O ops for the network
overheads, which can have a massive effect on throughput. That's why
io-cache helps so much: it masks some of those latencies.
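For reference, io-cache is just a performance translator stacked into the
client volfile on top of the replicated volume. A minimal sketch, assuming
a 2.x-style volfile where your replicate volume is named "afr" (the volume
name and cache size here are illustrative, not from your config):

```
volume ioc
  type performance/io-cache
  option cache-size 64MB     # illustrative; size to taste
  subvolumes afr             # assumes your cluster/afr volume is named "afr"
end-volume
```

The mount then uses "ioc" (the topmost volume) rather than "afr" directly.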

Also remember that virtualization comes with major performance penalties,
especially on I/O (including disk I/O), contrary to what the virtualization
technology vendors are saying.

>> What is the ping time between the servers? Have you measured the 
>> throughput between the servers with something like ftp on big files? 
>> Is it the writes or the reads that slow down? Try dumping to a ext3 
>> from gluster.
>
> Ping time is around 0.3ms.  I'll have to spend some time doing these 
> other tests.

A 0.3ms ping time implies it might be a gigabit connection.
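To rule the network in or out, measure raw TCP throughput between the nodes
directly, bypassing GlusterFS entirely. A quick sketch with dd and nc
(hostnames are placeholders; note that some nc builds want "nc -l 9999"
without the -p; iperf works too if you have it installed):

```shell
# On the receiving node (e.g. server2): listen on a spare port, discard data
nc -l -p 9999 > /dev/null

# On the sending node (e.g. server1): push 1GB of zeros through the link
dd if=/dev/zero bs=1M count=1024 | nc server2 9999
# dd prints the transfer rate on completion; roughly 11MB/s suggests
# a 100Mb link, roughly 110MB/s suggests gigabit
```

Comparing that figure against local disk throughput tells you how much of
the slowdown is just the wire.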

>>> Since I'm only going to have one server writing to the filesystem at 
>>> a time, I could mount it read-only (or not at all) on the other 
>>> server.  Would that mean I could safely set data-lock-server-count=0 
>>> and entry-lock-server-count=0 because I can be confident that there 
>>> won't be any conflicting writes?  I don't want to take unnecessary 
>>> risks, but it seems like unnecessary overhead for my use case.
>>
>> Hmm... If the 1st server fails, the lock server will fail to the next 
>> one, and you fire up MySQL there then. I thought you said it was only 
>> the 2nd server that suffers the penalty. Since the 2nd server will 
>> fail over locking from the 1st if the 1st fails, the performance 
>> should be the same after fail-over. You'll still have the active 
>> server being the lock server.
>
> The second server suffers a large penalty for having to lock on the 
> first server.

Sure it does, you have all the network bottlenecks and latencies hitting
the performance. But I thought you said the 2nd server isn't doing anything
until it is promoted, pending failure of the 1st server. So does this
matter?

> I wonder if the first server might still be bearing an 
> unnecessary cost for doing the locking, even though it's faster when 
> it's local.

I believe locks fail over, but they don't fail back. If server 1 goes away,
server 2 becomes the lock server for all the clients, and the locks
survive. But if server 1 then comes back and server 2 goes away, IIRC, the
locks don't fail back to the 1st server.

Anyway, the point is that if you list your primary server first in the
volume list, it'll be the lock server until it fails, which should take
care of your performance problem.
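Concretely, it's the ordering on the subvolumes line of the replicate/AFR
volume that picks the first-choice lock server. A minimal client-side
sketch, with made-up names and addresses:

```
volume server1
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.0.1      # the primary -- listed first below
  option remote-subvolume brick
end-volume

volume server2
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.0.2
  option remote-subvolume brick
end-volume

volume afr
  type cluster/afr
  subvolumes server1 server2          # server1 is the lock server until it fails
end-volume
```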

> Another wrinkle is that I'd rather have the servers be equal peers so 
> that I don't need to have MySQL fail back to server #1 as soon as it 
> comes back up.

You don't have to do that. See above. I think locks fail over up the chain,
but not down it.

> If I want server #2 to stay fast even after server #1 
> comes back up, I'd need to stop GlusterFS, reorder the volfile on both 
> servers, and restart it.  That seems somewhat difficult (particularly 
> changing the volfile on server #1 before it comes back up), and it would 
> be unnecessary if locking isn't adding any value.

If you want equal peers, then you are likely going to be an order of
magnitude better off with round-robin replication on a bare ext3 file
system. But if you want just fail-over, you can handle that with something
like RHCS. Fail over the floating live IP and fail over (forward or
backward, you can set priority target hosts for each service) the MySQL
service. You are going to have a blip in connectivity during the failover
either way unless you have a fully active-active setup, and if that's the
case, you might as well have the benefit of ext3 performance.
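If you do go the replication route, MySQL's built-in two-way (master-master)
replication on local ext3 gives you the equal peers you want. A my.cnf
sketch for node 1 (values illustrative; mirror it on node 2 with
server-id = 2 and auto-increment-offset = 2):

```
[mysqld]
server-id                = 1
log-bin                  = mysql-bin
auto-increment-increment = 2    # each node claims every other id,
auto-increment-offset    = 1    # avoiding key collisions between the masters
```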

Gordan
