[Gluster-users] Shared VM disk/image on gluster for redundancy?
Jeff Darcy
jdarcy at redhat.com
Tue Jun 29 13:10:28 UTC 2010
On 06/29/2010 06:23 AM, Emmanuel Noobadmin wrote:
> I've been trying to find a solution to achieve the following
> objective: minimum delay redundant network storage for virtualized
> server and think gluster might be what I need after throwing out
> options like Lustre, dmraid on Openfiler etc.
>
> The configuration in mind is currently this.
>
> Application Servers
> -> Runs a few VM guest OS
> -> Runs gluster client/server
> -> VM machine images then stored on mirrored gluster volumes.
>
> There will be two storage servers with physical RAID.
>
> The concept is that
> 1. physical RAID 1 catches single disk failure on the storage server
> 2. gluster mirror on the application server catches single machine
> failure of the storage servers
>
> . . .
>
> Would this work or am I missing something?
Not only would this work, but my impression is that it's a pretty common
use of GlusterFS. The one thing I'd add is that, if you already have or
ever might have more than two application servers, you should use
cluster/nufa as well as cluster/replicate (which is what volgen's "RAID 1"
actually does). The basic idea here is to set up two (or more) subvolumes
on each server, like this:
srv0vol0 srv1vol0 srv2vol0
srv0vol1 srv1vol1 srv2vol1
Then you replicate "diagonally" with read-subvolume pointing to the top row:
volume rep0
    type cluster/replicate
    option read-subvolume srv0vol0
    subvolumes srv0vol0 srv1vol1
end-volume

volume rep1
    type cluster/replicate
    option read-subvolume srv1vol0
    subvolumes srv1vol0 srv2vol1
end-volume

volume rep2
    type cluster/replicate
    option read-subvolume srv2vol0
    subvolumes srv2vol0 srv0vol1
end-volume
Lastly, you apply NUFA with "local-volume-name" on each node pointing to
the replicated volume with its read-subvolume on the same machine. So,
on node 1:
volume my_nufa
    type cluster/nufa
    option local-volume-name rep1
    subvolumes rep0 rep1 rep2
end-volume
With this type of configuration, files created on node 1 will be written
to both srv1vol0 and srv2vol1, and read from srv1vol0. Note that you don't
need separate disks or anything to set up multiple volumes on a node; they
can just be different directories, though if they're directories within
the same local filesystem then "df" on the GlusterFS filesystem can be
misleading (the same space gets counted more than once). Extending the
approach from three servers to any N should be pretty obvious: repK pairs
srvKvol0 with vol1 on server K+1, wrapping back to server 0 at the end.
You can do the same thing with cluster/distribute instead of cluster/nufa
(they actually use the same code) if strong locality is not a requirement.
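
For what it's worth, here is a rough Python sketch of how the same
"diagonal" layout could be generated for any N. The function names
(replicate_volumes, nufa_volume) are made up for illustration and are not
part of GlusterFS or volgen. It only emits the cluster/replicate and
cluster/nufa stanzas, and assumes the srvKvol0/srvKvol1 client volumes are
already defined elsewhere in the volfile.

```python
def replicate_volumes(n):
    """Return volfile text for n 'diagonal' cluster/replicate volumes.

    repK mirrors srvKvol0 onto vol1 of the next server (wrapping around),
    and reads from srvKvol0. Naming follows the three-server example.
    """
    if n < 2:
        raise ValueError("need at least two servers to replicate")
    blocks = []
    for k in range(n):
        primary = f"srv{k}vol0"
        mirror = f"srv{(k + 1) % n}vol1"  # next server, wrapping to 0
        blocks.append(
            f"volume rep{k}\n"
            f"    type cluster/replicate\n"
            f"    option read-subvolume {primary}\n"
            f"    subvolumes {primary} {mirror}\n"
            f"end-volume\n"
        )
    return "\n".join(blocks)


def nufa_volume(n, node, name="my_nufa"):
    """Return the cluster/nufa stanza for the given node (0-based)."""
    if not 0 <= node < n:
        raise ValueError("node index out of range")
    reps = " ".join(f"rep{k}" for k in range(n))
    return (
        f"volume {name}\n"
        f"    type cluster/nufa\n"
        f"    option local-volume-name rep{node}\n"  # replica read locally
        f"    subvolumes {reps}\n"
        f"end-volume\n"
    )


if __name__ == "__main__":
    # Reproduces the three-server example, as seen from node 1.
    print(replicate_volumes(3))
    print(nufa_volume(3, node=1))
```

Each node gets the same replicate stanzas; only local-volume-name in the
NUFA stanza differs from node to node.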
> 3. Avoids any problems caused by a heartbeat/failover delay.
> 4. If an application server dies, the VM images are still on the
> gluster volumes, I can simply distribute the downed VMs to the other
> running application servers by loading the images.
You're not really getting rid of heartbeat/failover delay, so much as
relying on functionally equivalent behavior within GlusterFS. Also,
you'll still need some sort of heartbeat to detect that an application
server has died. Putting your images on GlusterFS makes it possible for
guests on multiple machines to access them, but it's still a bad idea
for more than one guest to use the same image at the same time.