[Gluster-users] Shared VM disk/image on gluster for redundancy?

Emmanuel Noobadmin centos.admin at gmail.com
Tue Jun 29 14:04:27 UTC 2010

On 6/29/10, Jeff Darcy <jdarcy at redhat.com> wrote:

> Not only would this work, but my impression is that it's a pretty common
> use of GlusterFS.  The one thing I'd add is that, if you already have or
> ever might have more than two application servers, you should use cluster/nufa
> as well as cluster/replicate (which is what volgen's "RAID 1" actually
> does).  The basic idea here is to set up two (or more) subvolumes on
> each server, like this:
> 	srv0vol0	srv1vol0	srv2vol0
> 	srv0vol1	srv1vol1	srv2vol1
> Then you replicate "diagonally" with read-subvolume pointing to the top row:
> 	volume rep0
> 		type cluster/replicate
> 		option read-subvolume srv0vol0
> 		subvolumes srv0vol0 srv1vol1
> 	end-volume
> 	volume rep1
> 		type cluster/replicate
> 		option read-subvolume srv1vol0
> 		subvolumes srv1vol0 srv2vol1
> 	end-volume
> 	volume rep2
> 		type cluster/replicate
> 		option read-subvolume srv2vol0
> 		subvolumes srv2vol0 srv0vol1
> 	end-volume
> Lastly, you apply NUFA with "local-volume-name" on each node pointing to
> the replicated volume with its read-subvolume on the same machine.  So,
> on node 1:
> 	volume my_nufa
> 		type cluster/nufa
> 		option local-volume-name rep1
> 		subvolumes rep0 rep1 rep2
> 	end-volume
> With this type of configuration, files created on node 1 will be written
> to srv1vol0/srv2vol1 and read from srv1vol0.  Note that you don't need
> separate disks or anything to set up multiple volumes on a node; they
> can just be different directories, though if they're directories within
> the same local filesystem then "df" on the GlusterFS filesystem can be
> misleading.  Extending the approach from three servers to any N should
> be pretty obvious, and you can do the same thing with cluster/distribute
> instead of cluster/nufa (they actually use the same code) if strong
> locality is not a requirement.
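The N-server generalization Jeff describes above can be sketched as a small generator script. This is a hypothetical helper, not part of GlusterFS; the names (srvXvolY, repX, my_nufa) simply follow his example, and each rep pairs vol0 on server i with vol1 on server i+1 (wrapping around).

```python
# Hypothetical sketch: emit the "diagonal" replicate + nufa volfile stanzas
# for N servers, following the naming in the example above. Illustrative
# only -- not an official GlusterFS tool.

def diagonal_volfile(n_servers, local_index):
    """Return replicate stanzas plus the nufa stanza for one node.

    local_index selects which rep volume is local to this node
    (its read-subvolume lives on the same machine).
    """
    lines = []
    for i in range(n_servers):
        # Diagonal pairing: srv{i}vol0 replicates with srv{i+1}vol1,
        # wrapping around so the last server pairs with srv0vol1.
        partner = (i + 1) % n_servers
        lines += [
            f"volume rep{i}",
            "\ttype cluster/replicate",
            f"\toption read-subvolume srv{i}vol0",
            f"\tsubvolumes srv{i}vol0 srv{partner}vol1",
            "end-volume",
        ]
    reps = " ".join(f"rep{i}" for i in range(n_servers))
    lines += [
        "volume my_nufa",
        "\ttype cluster/nufa",
        f"\toption local-volume-name rep{local_index}",
        f"\tsubvolumes {reps}",
        "end-volume",
    ]
    return "\n".join(lines)

# For n_servers=3 and local_index=1 this reproduces the three rep
# stanzas and the node-1 nufa stanza shown above.
print(diagonal_volfile(3, 1))
```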

Thank you very very much for the detailed explanation! Finally seeing
a definitive answer is really like light at the end of a long and
sleepless tunnel :)

> You're not really getting rid of heartbeat/failover delay, so much as
> relying on functionally equivalent behavior within GlusterFS.  Also,
> you'll still need some sort of heartbeat to detect that an application
> server has died.  Putting your images on GlusterFS makes it possible for
> guests on multiple machines to access them, but it's still a bad idea
> for them to do so simultaneously.

Switching images would be done manually, so simultaneous access is
unlikely, unless I can figure out whether KVM has live-migration
functionality similar to Xen's Remus.

So most likely I would run two or more physical machines whose VMs
fail over to each other, to cover the case of a single machine
failing, along with a pair of storage servers. In the case of a total
failure where both the primary and secondary VM hosts die physically,
I would roll in a new machine and load the VM images, which would
still be safe on the gluster data servers.

So in this case, would I be correct that my configuration, assuming a
basic setup of 2 physical VM host servers and 2 storage servers, would
look something like

volume rep0
	type cluster/replicate
	option read-subvolume vmsrv0vol0
	subvolumes vmsrv0vol0 datasrv0vol0 datasrv1vol0
end-volume

volume rep1
	type cluster/replicate
	option read-subvolume vmsrv1vol0
	subvolumes vmsrv1vol0 datasrv0vol0 datasrv1vol0
end-volume

volume my_nufa
	type cluster/nufa
	option local-volume-name rep0
	subvolumes rep0 rep1
end-volume

Or did I lose my way somewhere? :)
Does it make any sense to replicate across all three, or should I
simply spec the VM servers with tiny drives and put everything on the
gluster storage, which I suppose would impact performance severely?
