[Gluster-users] Options to turn off/on for reliable virtual machine writes & write performance
RedShift
redshift at telenet.be
Sun Oct 6 07:40:19 UTC 2013
Hi all,

I'm building a cluster that serves virtual machine storage to ESXi hosts over NFS. The point of the cluster is that it should survive an unclean node death (test scenarios: hot-pulling disks, cutting power, etc.), which means I need to be sure all writes have completed on both nodes before Gluster returns the operation as complete. For now, I have this:
gluster> volume info ha-ds1
Volume Name: ha-ds1
Type: Replicate
Volume ID: da2fb668-2f3e-4839-a5da-4a51d5fcba05
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.255.255.1:/vol/gluster/ha-ds1
Brick2: 10.255.255.2:/vol/gluster/ha-ds1
Options Reconfigured:
cluster.self-heal-daemon: on
performance.flush-behind: Off
network.frame-timeout: 30
network.ping-timeout: 15
cluster.heal-timeout: 300
gluster> volume status all detail
Status of volume: ha-ds1
------------------------------------------------------------------------------
Brick : Brick 10.255.255.1:/vol/gluster/ha-ds1
Port : 49153
Online : Y
Pid : 2252
File System : ext4
Device : /dev/mapper/stor--node1-gluster
Mount Options : rw,noatime,nodiratime,journal_checksum,data=journal,errors=panic,nodelalloc
Inode Size : 256
Disk Space Free : 219.3GB
Total Disk Space : 269.1GB
Inode Count : 17924096
Free Inodes : 17923263
------------------------------------------------------------------------------
Brick : Brick 10.255.255.2:/vol/gluster/ha-ds1
Port : 49152
Online : Y
Pid : 2319
File System : ext4
Device : /dev/mapper/stor--node2-gluster
Mount Options : rw,noatime,nodiratime,journal_checksum,data=journal,errors=panic,nodelalloc
Inode Size : 256
Disk Space Free : 221.3GB
Total Disk Space : 269.1GB
Inode Count : 17924096
Free Inodes : 17923162
gluster>
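For reference, the non-default options above were applied with commands along these lines (reconstructed from the "Options Reconfigured" list, so the exact order may differ):

    # Apply the reliability-related options to the ha-ds1 volume
    gluster volume set ha-ds1 cluster.self-heal-daemon on
    gluster volume set ha-ds1 performance.flush-behind off
    gluster volume set ha-ds1 network.frame-timeout 30
    gluster volume set ha-ds1 network.ping-timeout 15
    gluster volume set ha-ds1 cluster.heal-timeout 300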
(I'd also like to draw your attention to the mount options in the status output above: are those OK, or can I do better?)
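For context, the corresponding fstab entry is something like this (node 1 shown; the mount point is illustrative):

    # ext4 brick mount with full data journalling, node 1
    /dev/mapper/stor--node1-gluster  /vol/gluster  ext4  rw,noatime,nodiratime,journal_checksum,data=journal,errors=panic,nodelalloc  0  2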
Is this enough to guarantee a proper cluster failover to the second node (data consistent at all times) without interrupting the virtual machines? In my testing it appears to be, but I want to be sure. Does anyone have something to add, or something to look out for?
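For what it's worth, my understanding (which I'd be glad to have corrected) is that the client-side write caches are the main risk here: with write-behind enabled, a write can be acknowledged before it has actually reached both bricks. That's why flush-behind is off above, and I'm considering disabling write-behind as well:

    # My assumption: turn off client-side write caching so an acknowledged
    # write has actually been sent to both replicas
    gluster volume set ha-ds1 performance.write-behind off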
Second, I'd like to improve the write performance of this cluster. Reads are good (> 110 MB/s; the ESXi servers are connected via gigabit, so that's the ceiling), but writes are only about half that (~60 MB/s). The hardware can definitely do more: a simple 16 GB dd file write to the underlying filesystem nets ~227 MB/s. I gathered some statistics during sequential write tests: the load climbs to ~15 and there is some CPU usage, but one CPU core seems to spend most of its time in I/O wait. I know the hardware can perform better. Where else should I start looking?
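For reference, the baseline write test was something along these lines (the target path is illustrative; conv=fdatasync makes dd include the final flush in its timing):

    # 16 GiB sequential write straight to the brick filesystem
    dd if=/dev/zero of=/vol/gluster/ddtest bs=1M count=16384 conv=fdatasync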
Thanks,
Glenn