[Gluster-users] "Granular locking" - does this need to be enabled in 3.3.0 ?
jog at mrc-lmb.cam.ac.uk
Thu Jul 19 09:14:00 UTC 2012
Dear Pranith / Anand,
Update on our progress with using KVM & Gluster:
We built a two-server (Dell R710) cluster. Each box has:
5 x 500 GB SATA RAID5 array (software RAID)
an Intel 10Gb Ethernet HBA
One box has 8 GB RAM, the other 48 GB
both have 2 x E5520 Xeon
CentOS 6.3 installed
Gluster 3.3 installed from the rpm files on the gluster site
1) create a replicated gluster volume (on top of xfs)
2) setup qemu/kvm with a gluster volume (mounts localhost:/gluster-vol)
3) sanlock configured (this is evil!)
4) build a virtual machine with a 30GB qcow2 image, 1GB RAM
5) clone this VM into 4 machines
6) check that live migration works (OK)
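For readers who want to reproduce steps 1) and 2), they correspond roughly to the commands built below. This is a minimal sketch only: the host names "host1"/"host2", the brick path "/export/brick1" and the mount point "/mnt/gluster-vol" are hypothetical placeholders, and only the volume name "gluster-vol" comes from the setup above. The helper only builds and prints the command lines; it does not execute anything.

```python
import shlex


def setup_commands(volume, bricks, mount_point):
    """Build argv lists to create, start and mount a replicated volume.

    The replica count is taken from the number of bricks, i.e. one
    copy of the data per brick (an assumption that fits a two-server
    replicated setup).
    """
    create = ["gluster", "volume", "create", volume,
              "replica", str(len(bricks))] + list(bricks)
    start = ["gluster", "volume", "start", volume]
    # Mount through the native client on the local host.
    mount = ["mount", "-t", "glusterfs", "localhost:/" + volume, mount_point]
    return [create, start, mount]


if __name__ == "__main__":
    # Hypothetical host names, brick paths and mount point.
    commands = setup_commands(
        "gluster-vol",
        ["host1:/export/brick1", "host2:/export/brick1"],
        "/mnt/gluster-vol",
    )
    for argv in commands:
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Printing rather than running the commands keeps the sketch safe to try on a machine that has no gluster installed.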
Start basic test cycle:
a) migrate all machines to host #1, then reboot host #2
b) watch logs for self-heal to complete
c) migrate VMs to host #2, reboot host #1
d) check logs for self heal
The above cycle can be repeated numerous times, and completes without
error, provided there is little or no load on the VMs.
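Steps b) and d) were done by watching the logs. As an alternative (not what was done in the test cycle above), the wait could be scripted by polling "gluster volume heal VOLUME info" until no brick reports pending entries. The sketch below assumes that command prints one "Number of entries: N" line per brick, as the 3.3 CLI appears to do; the function names and the polling interval are hypothetical.

```python
import re
import subprocess
import time


def pending_heal_entries(heal_info_text):
    """Sum the per-brick 'Number of entries: N' counts in heal-info output."""
    counts = re.findall(r"^\s*Number of entries:\s*(\d+)",
                        heal_info_text, flags=re.MULTILINE)
    if not counts:
        # Refuse to treat unrecognised output as "nothing left to heal".
        raise ValueError("no 'Number of entries' lines found")
    return sum(int(n) for n in counts)


def wait_for_heal(volume, poll_seconds=30, run=subprocess.check_output):
    """Poll until no brick reports pending entries; return the number of polls.

    `run` is injectable so the loop can be exercised without a cluster.
    """
    polls = 0
    while True:
        polls += 1
        text = run(["gluster", "volume", "heal", volume, "info"]).decode()
        if pending_heal_entries(text) == 0:
            return polls
        time.sleep(poll_seconds)
```

On a real host this would be called as wait_for_heal("gluster-vol") before starting the next migration in the cycle.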
If I give the VMs a workload, such as running "bonnie++" on each VM,
things start to break.
1) it becomes almost impossible to log in to each VM
2) the kernel on each VM starts logging hung-task timeout warnings,
i.e. the messages that mention "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
3) top / uptime on the hosts shows a load average of up to 24
4) dd write speed (block size 1K) to gluster is around 3MB/s on the host
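To make the small-block write figure in 4) easy to repeat, here is a minimal sketch of a comparable measurement. It writes 1K blocks and calls fsync once at the end, which is roughly what "dd bs=1K conv=fsync" does; the exact dd options used for the number above are not stated, so treat this as an approximation. The function name and the default sizes are hypothetical.

```python
import os
import time


def small_block_write_mb_per_s(path, block_size=1024,
                               total_bytes=16 * 1024 * 1024):
    """Write total_bytes to path in block_size chunks and return MB/s."""
    block = b"\0" * block_size
    count = total_bytes // block_size
    start = time.perf_counter()
    # buffering=0 so each write() call reaches the filesystem as one
    # block_size request instead of being merged by Python's buffer.
    with open(path, "wb", buffering=0) as f:
        for _ in range(count):
            f.write(block)
        os.fsync(f.fileno())  # include the flush to storage in the timing
    elapsed = time.perf_counter() - start
    return (count * block_size) / (1024 * 1024) / elapsed


# Example (hypothetical paths): run once against the gluster mount and
# once against the local xfs filesystem, then compare the two rates.
# print(small_block_write_mb_per_s("/mnt/gluster-vol/ddtest.bin"))
# print(small_block_write_mb_per_s("/export/ddtest.bin"))
```

Running the same function against the gluster mount and against the underlying local filesystem gives a like-for-like ratio, which is more informative than a single absolute number.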
While I agree that running bonnie++ on four VMs is possibly unfair,
there are load spikes on quiet machines (yum updates etc). I suspect
that the I/O of one VM starts blocking that of another VM, and the
pressure builds up rapidly on gluster - which does not seem to cope well
under pressure. Possibly this is the access pattern / block size of
I'm (slightly) disappointed.
Though it doesn't corrupt data, the I/O performance is < 1% of my
hardware's capability. Hopefully work on buffering and other tuning will
fix this? Or maybe the work mentioned on getting qemu to talk directly to
gluster will fix this?
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Hills Road, Cambridge, CB2 0QH, UK.
Phone 01223 402219
Mobile 0776 9886539