[Gluster-users] Gluster 2.0.3 + Apache on CentOS5 performance issue

Sun Jul 12 12:11:16 UTC 2009

Hello,

We have been evaluating options for a new platform for a webboard
system.
The webboard consists of PHP scripts that generate or modify an HTML page
when a user posts a topic or adds a comment. Each resulting topic is stored
as an HTML file, with all related files (attachments to the topic, etc.)
stored in a separate directory per topic. In general, the web site mostly
serves a lot of small static files through Apache, while PHP handles the
dynamic content. This system has worked very well in the past, but with the
increasing page view rate, it is very likely that we will need some kind of
cluster file system as the backend very soon.

We have set up a test system using Grinder as the stress test tool. The test
system is 11 Intel dual-core x86_64 machines running CentOS5 with stock
Apache (prefork, since the goal is to use this with PHP), linked together
with Gigabit Ethernet. We compared the performance of a single NFS server in
sync mode against 4 Gluster nodes (distribute over 2 replicated pairs)
accessed through FUSE. However, the transactions per second (TPS) result for
Gluster is not good.

NFS (single server, sync mode)
 - 100 client threads - Peak TPS = 1716.67, Avg. TPS = 1066, mean response
   time (rt) = 61.63 ms
 - 200 threads - Peak TPS = 2790, Avg. TPS = 1716, mean rt = 87.33 ms
 - 400 threads - Peak TPS = 3810, Avg. TPS = 1800, mean rt = 165 ms
 - 600 threads - Peak TPS = 4506.67, Avg. TPS = 1676.67, mean rt = 287.33 ms

4-node Gluster (distribute over 2 replicated pairs)
 - 100 threads - Peak TPS = 1293.33, Avg. TPS = 430, mean rt = 207.33 ms
 - 200 threads - Peak TPS = 974.67, Avg. TPS = 245.33, mean rt = 672.67 ms
 - 300 threads - Peak TPS = 861.33, Avg. TPS = 210, mean rt = 931.33 ms
(No 400-600 thread runs, since we ran out of client machines, sorry.)
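
As a rough cross-check of the figures above, here is a minimal Python sketch
(not part of the Grinder setup). It computes the Gluster average TPS as a
fraction of the NFS average TPS at the thread counts both runs share, and
uses Little's law (requests in flight = throughput x response time) to
estimate how many requests were outstanding on average. The function and
variable names are made up for illustration; the numbers are copied from the
two lists.

```python
# (threads, avg TPS, mean response time in ms), copied from the lists above.
NFS = [(100, 1066.0, 61.63), (200, 1716.0, 87.33),
       (400, 1800.0, 165.0), (600, 1676.67, 287.33)]
GLUSTER = [(100, 430.0, 207.33), (200, 245.33, 672.67), (300, 210.0, 931.33)]


def in_flight(avg_tps, mean_rt_ms):
    """Little's law: average requests in flight = throughput x response time."""
    return avg_tps * mean_rt_ms / 1000.0


def throughput_ratio(gluster, nfs):
    """Gluster avg TPS divided by NFS avg TPS, for thread counts in both runs."""
    nfs_by_threads = {threads: tps for threads, tps, _ in nfs}
    return {
        threads: tps / nfs_by_threads[threads]
        for threads, tps, _ in gluster
        if threads in nfs_by_threads
    }


if __name__ == "__main__":
    for name, rows in (("NFS", NFS), ("Gluster", GLUSTER)):
        for threads, tps, rt in rows:
            print(f"{name:8s}{threads:4d} threads: "
                  f"~{in_flight(tps, rt):6.1f} requests in flight")
    for threads, ratio in sorted(throughput_ratio(GLUSTER, NFS).items()):
        print(f"{threads} threads: Gluster at {ratio:.0%} of NFS avg TPS")
```

With these numbers, Gluster delivers roughly 40% of the NFS average TPS at
100 threads and roughly 14% at 200 threads.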

glusterfsd (the server) is configured to use the io-threads translator with
32 threads on each brick. The glusterfs client is configured with the
translator stack io-cache->write-behind->readahead->distribute->replicate.
The io-cache cache-size is 256MB. I used the patched FUSE downloaded from the
Gluster web site (built through DKMS).

As the results show, Gluster performance gets worse as the number of clients
increases. One observation is that the glusterfs process on the client takes
about 100% of CPU during all the tests, while glusterfsd uses only 70-80% of
CPU during the tests. Note that the systems are dual core.
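
For reference, per-process CPU figures like these can be read from /proc on
Linux. The sketch below is an assumption about one way to take such a
measurement, not the tool used for the numbers above, and the helper names
are hypothetical. It reads utime and stime from /proc/<pid>/stat twice and
reports the difference, where 100% means one fully busy core, so a
multi-threaded process on a dual-core machine could reach 200%.

```python
import os
import time


def cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name (field 2) is parenthesised and may contain spaces,
    # so split on the last ')' and count the remaining fields from field 3.
    fields = stat_line.rsplit(")", 1)[1].split()
    utime, stime = int(fields[11]), int(fields[12])  # fields 14 and 15
    return utime + stime


def cpu_percent(ticks_before, ticks_after, interval_s, ticks_per_s):
    """CPU use over the interval; 100.0 means one fully busy core."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


def sample(pid, interval_s=1.0):
    """Sample one process twice and return its CPU percentage (Linux only)."""
    path = f"/proc/{pid}/stat"
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    with open(path) as f:
        before = cpu_ticks(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = cpu_ticks(f.read())
    return cpu_percent(before, after, interval_s, ticks_per_s)
```

Calling sample() with the pid of the glusterfs client process and again with
the pid of glusterfsd during a run would show whether either one is pinned to
a single core.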

I also tried using mod_glusterfs, bypassing FUSE entirely, to serve all the
static files, and ran another test with Grinder. The result is about the
same: 1000+ peak TPS with 200-400 avg. TPS. A problem arose in this test:
each Apache prefork process used about twice as much memory, so we needed to
lower the number of httpd processes by about half.

I tried disabling EnableMMAP and it didn't help much. Adjusting read-ahead
and write-behind according to the GlusterOptimization page didn't help much
either.

My question is: there seems to be a bottleneck in this setup, but how can I
track it down? Note that I didn't do any optimization other than what is
described above. Is there a best-practice configuration for using Apache to
serve a bunch of small static files like this?

Regards,

Somsak