[Gluster-users] Problem stressing an Apache Web server on GlusterFS volume

PANICHI MASSIMILIANO massimiliano.panichi at infogroup.it
Thu May 3 15:28:56 UTC 2012


Hi,

we are testing an Apache web server on a GlusterFS volume. Our goal is to build an HA reverse proxy with Pacemaker and GlusterFS.

We installed and configured four GlusterFS nodes as a distributed-replicated volume, each node with 10 GB of storage, giving 20 GB of usable space. We configured an Apache web server to write its logs (access_log and error_log) on the GlusterFS volume.

From another server we stressed the Apache server with httperf, calling a mod_perl script that prints an HTML page with the Apache environment variables. We saw no performance problems under load, but when we tested the availability of GlusterFS we ran into a hang on the client's mounted volume: when we reboot one of the GlusterFS nodes, the client hangs while writing to the volume and df stops responding.

We are using VMs on VMware, and all servers run Oracle Enterprise Linux 6.2 64-bit with GlusterFS 3.2.6 (recompiled).

The mount command on the Apache server:

mount -t glusterfs gfs01-dev:/VOLUME01 /opt/VOLUME01/
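Note that the volfile server named in the mount command (gfs01-dev here) is normally only contacted at mount time to fetch the volume description; after that the client talks to all bricks directly. If your glusterfs release supports it (an assumption worth checking against your mount.glusterfs man page), a fallback volfile server can be named so that mounting still works while gfs01-dev is down:

```shell
# Sketch, assuming the backupvolfile-server mount option is available
# in this glusterfs build; gfs02-dev is used only as a fallback for
# fetching the volfile, not as the sole data path.
mount -t glusterfs -o backupvolfile-server=gfs02-dev gfs01-dev:/VOLUME01 /opt/VOLUME01/

# Equivalent /etc/fstab entry:
# gfs01-dev:/VOLUME01  /opt/VOLUME01  glusterfs  defaults,_netdev,backupvolfile-server=gfs02-dev  0 0
```

This does not by itself explain the write hang on an already-mounted volume, but it removes one single point of failure from the setup.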

The files on each GlusterFS node (du output, sizes in KB):

---files on node 1----
4       /VOLUME01/GFS01-DEV_1/proxy_logs/logs/error_log
33152   /VOLUME01/GFS01-DEV_1/proxy_logs/logs/access_log
33156   /VOLUME01/GFS01-DEV_1/proxy_logs/logs
33156   /VOLUME01/GFS01-DEV_1/proxy_logs
33160   /VOLUME01/GFS01-DEV_1
33164   /VOLUME01
---files on node 2----
4       /VOLUME01/GFS02-DEV_1/proxy_logs/logs/error_log
33216   /VOLUME01/GFS02-DEV_1/proxy_logs/logs/access_log
33220   /VOLUME01/GFS02-DEV_1/proxy_logs/logs
33220   /VOLUME01/GFS02-DEV_1/proxy_logs
33224   /VOLUME01/GFS02-DEV_1
33228   /VOLUME01
---files on node 3----
0       /VOLUME01/GFS03-DEV_1/proxy_logs/logs
0       /VOLUME01/GFS03-DEV_1/proxy_logs
0       /VOLUME01/GFS03-DEV_1/prova.4
0       /VOLUME01/GFS03-DEV_1
4       /VOLUME01
---files on node 4----
0       /VOLUME01/GFS04-DEV_1/proxy_logs/logs
0       /VOLUME01/GFS04-DEV_1/proxy_logs
0       /VOLUME01/GFS04-DEV_1/prova.4
0       /VOLUME01/GFS04-DEV_1

So: if I reboot node 1, the mount hangs and df does not respond. If I reboot node 2, I see performance problems; df works, but slowly.

When the problem occurs, most of the Apache processes are stuck trying to write to the log, as we can see from server-status:
Current Time: Thursday, 03-May-2012 11:52:34 CEST
Restart Time: Thursday, 03-May-2012 11:44:56 CEST
Parent Server Generation: 0
Server uptime: 7 minutes 37 seconds
Total accesses: 178751 - Total Traffic: 134.8 MB
CPU Usage: u33.7 s12.25 cu0 cs0 - 10.1% CPU load
391 requests/sec - 302.0 kB/second - 790 B/request
512 requests currently being processed, 0 idle workers
LLLLLLCCLLLLLLLLLLLLLLLLCLLLLLLLLCLCLLLLLLCLLCCCLLLLLLLLCLLLLLLL
LLLLLLLCLCLLLLLLLLLLLLLLLLCLLLLLLCLLWCLLLCLCCCLLLLCLLLLLLLCLLLLL
LLLCLLLLLLLLLLLLLLCLCLLCLLLLLLLLLLLLLLLLLLLLLLLLWLLLLLLLLLLLLLLL
LLLLLLLLLLLLLLLCLLLLLLLLLCLCLLLLLLLLLLLLLLLLLCWLLLLLLLLLLLLLLLLL
LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
LLLLLLLLLLRLLLLWLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
LLLLLLLLLLLLLLLWLLLLLLLLLLLLLLLLLWLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
LLLLLLLLLLLLLLLLLLRLLLLLLLLLLLLLLLLLLLLLLLCLWLLLLLLLLLLLLLLLLLLL
and the reply rate drops...

[root at proxycoll02 src]# ./httperf -v --server=10......110 --port=80 --uri=/perl/stress_test.pl --num-conns=10000000 --rate=1000
httperf --verbose --client=0/1 --server=10.......110 --port=80 --uri=/perl/stress_test.pl --rate=1000 --send-buffer=4096 --recv-buffer=16384 --num-conns=10000000 --num-calls=1
httperf: warning: open file limit > FD_SETSIZE; limiting max. # of open files to FD_SETSIZE
httperf: maximum number of open descriptors = 1024
reply-rate = 1001.1
reply-rate = 1000.3
reply-rate = 999.7
reply-rate = 1000.3
reply-rate = 1000.1
reply-rate = 1000.3
reply-rate = 1000.1
reply-rate = 998.1
reply-rate = 328.0
reply-rate = 17.6
reply-rate = 25.6
reply-rate = 25.6
reply-rate = 7.6
reply-rate = 0.0
reply-rate = 0.0
reply-rate = 0.0
reply-rate = 0.0
reply-rate = 0.0
reply-rate = 0.0
reply-rate = 0.0
reply-rate = 171.0
reply-rate = 230.6
reply-rate = 0.2
reply-rate = 0.0
reply-rate = 200.6
reply-rate = 200.6
reply-rate = 151.8
reply-rate = 49.0
reply-rate = 199.8
reply-rate = 0.2
reply-rate = 201.2

Furthermore, while the problem occurs the client is swapping:

top - 16:26:07 up 22 min,  1 user,  load average: 619.87, 190.81, 76.03
Tasks: 1069 total,   6 running, 1063 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.1%us, 10.3%sy,  0.0%ni, 83.0%id,  5.5%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:   1016524k total,  1008364k used,     8160k free,      212k buffers
Swap:  2064376k total,  2064376k used,        0k free,     3700k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3468 apache    20   0  229m 4248 1244 D  3.0  0.4   0:00.76 httpd
   25 root      20   0     0    0    0 D  2.0  0.0   0:09.31 kswapd0
 1806 root      20   0 4440m  36m  340 R  1.8  3.6   3:13.39 glusterfs
 3487 apache    20   0  219m 3620 1416 D  1.4  0.4   0:00.20 httpd
 3485 apache    20   0  214m 3616 1456 D  1.3  0.4   0:00.19 httpd
 3481 apache    20   0  219m 3624 1420 D  1.1  0.4   0:00.16 httpd
 3484 apache    20   0  217m 3580 1408 D  1.1  0.4   0:00.16 httpd
 2483 root      20   0 15784 1244  248 R  1.0  0.1   0:03.63 top


Is there anything we can investigate to better understand the hang, or tuning parameters that might solve it?
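For reference, the commands we can run on the nodes to inspect the volume, and the one tunable we suspect may be involved (an assumption on our part: by default the client waits a ping timeout, 42 seconds, before declaring a brick dead, which is roughly the length of the stall we see), are:

```shell
# Inspect the volume layout and any reconfigured options
# (run on any GlusterFS node).
gluster volume info VOLUME01

# Assumption: lowering the client-side ping timeout shortens the window
# during which writes block after a brick goes away. network.ping-timeout
# should only be changed with care on a production volume.
gluster volume set VOLUME01 network.ping-timeout 10
```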
