[Gluster-users] RAID options for Gluster
Fernando Frediani (Qube)
fernando.frediani at qubenet.net
Fri Jun 15 10:14:02 UTC 2012
Right, it seems that using individual disks without RAID, although possible, isn't a good idea because disk replacement can't be automated. There would also be the problem that the maximum file size is limited to the size of a single disk.
Going back to the idea of using RAID controllers: would you think that, for say 16 disks (or 12), RAID 5 would be fine, given that the data is already replicated on another node in the very unlikely event you lose a node?
Now, a node with more disk slots could host multiple RAID 5 logical volumes, but will Gluster be smart enough not to put replicated data on two logical volumes residing on the same node?
I don't even consider using RAID 10, as that would be a big waste of space: since the data is already replicated between nodes, replicating it again across the disks would drop the usable space to 1/4 of the raw capacity. If I had latency-sensitive applications I probably wouldn't use Gluster for them, but something else. For hosting non-performance-intensive applications I think Gluster is fine. Also, in a medium-sized cluster it would give good throughput when running backups, for example.
But the bottom line is that the maximum performance you get from a single file is whatever the single RAID logical volume where the file resides can deliver.
From: Brian Candler [mailto:B.Candler at pobox.com]
Sent: 14 June 2012 14:55
To: Fernando Frediani (Qube)
Cc: 'gluster-users at gluster.org'
Subject: Re: [Gluster-users] RAID options for Gluster
On Thu, Jun 14, 2012 at 11:06:32AM +0000, Fernando Frediani (Qube) wrote:
> No RAID (individual hot swappable disks):
> Each disk is a brick individually (server:/disk1, server:/disk2, etc)
> so no RAID controller is required. As the data is replicated if one
> fail the data must exist in another disk on another node.
> Cheaper to build as there is no cost for an expensive RAID controller.
Except that software (md) RAID is free and works with a HBA.
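As an illustration of that point, a minimal software RAID brick on a plain HBA might be set up roughly as follows (the device names, mount point, and e-mail address are made up for the example):

```shell
# Build a 4-disk md RAID 5 array on a plain HBA
# (device names /dev/sd[b-e] are illustrative).
mdadm --create /dev/md0 --level=5 --raid-devices=4 \
    /dev/sdb /dev/sdc /dev/sdd /dev/sde

# Put a filesystem on the array and mount it as a Gluster brick.
mkfs.xfs /dev/md0
mkdir -p /export/brick1
mount /dev/md0 /export/brick1

# md's monitor mode provides the failure notifications mentioned
# below: it watches the arrays and mails on degraded events.
mdadm --monitor --scan --mail admin@example.com --daemonise
```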
> Improved performance as writes have to be done only on a single disk
> not in the entire RAID5/6 Array.
> Make better usage of the Raw space as there is no disk for parity on a
> RAID 5/6
> If a failed disk gets replaced the data need to be replicated over the
> network (not a big deal if using Infiniband or 1Gbps+ Network)
> The biggest file size is the size of one disk if using a volume type
* You will probably need to write your own tools to monitor the disks and notify you when one fails (whereas easily available tools already exist for md RAID, including e-mail notifications and SNMP integration)
* The process of swapping a disk is not a simple hot-swap: you need to replace the failed drive, mkfs a new filesystem, and re-introduce it into the gluster volume. This is something you will need to document procedures for and test carefully, whereas RAID swaps are relatively painless.
* For a large configuration with hundreds of drives, it can become ungainly to have a gluster volume with hundreds of bricks.
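The manual swap procedure in the second point might look roughly like the sketch below. The volume name, server, and brick paths are hypothetical, and the exact gluster syntax varies between versions, so treat this as an outline rather than a tested runbook:

```shell
# Hypothetical replacement of a failed no-RAID brick.
# "myvol", "server1" and the /export paths are made up.

# 1. Format the replacement disk and mount it at a new brick path.
mkfs.xfs /dev/sdX
mkdir -p /export/disk1-new
mount /dev/sdX /export/disk1-new

# 2. Swap the dead brick for the new one and trigger self-heal so
#    the replica is copied back over the network.
gluster volume replace-brick myvol \
    server1:/export/disk1 server1:/export/disk1-new commit force
gluster volume heal myvol full
```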
> RAID doesn’t scale well beyond ~16 disks
But you can group your disks into multiple RAID volumes.
> Attaching a JBOD to a node and creating multiple RAID Arrays(or a
> single server with more disk slots) instead of adding a new node can
> save power(no need CPU, Memory, Motherboard), but having multiple
> bricks on the same node, it might happen that the data is replicated inside
> the same node, making the downtime of a node critical. Or is Gluster smart
> enough to replicate data to a brick in a different node?
It's not automatic; you configure it explicitly. If your replica count is 2, then you give it pairs of bricks, and data will be replicated onto each brick in the pair. It's your responsibility to ensure that those two bricks are on different servers, if high availability is your concern.
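Concretely, with replica 2 the bricks are paired in the order they appear on the command line, so you interleave servers to keep each pair split across machines (volume and path names here are illustrative):

```shell
# With "replica 2", each consecutive pair of bricks forms a replica
# pair. Interleaving the servers puts every pair on two machines,
# so losing one server never takes out both copies of a file.
gluster volume create myvol replica 2 \
    server1:/export/brick1 server2:/export/brick1 \
    server1:/export/brick2 server2:/export/brick2
```

Had the bricks been listed as server1:/export/brick1 server1:/export/brick2 adjacently, both copies of some files would sit on server1, which is exactly the single-node replication problem asked about above.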
Another alternative to consider: RAID10 on each node. Eliminates the performance penalty of RAID5/6, indeed will give you improved read performance compared to single disks, but halves your available storage capacity.
You can of course mix-and-match. e.g. RAID5 for backup volumes; RAID10 for highly active read/write volumes; some gluster volumes are replicated and some are not, etc. This can become a management headache if it gets too complex though.