[Gluster-devel] HA failover question.

Daniel van Ham Colchete daniel.colchete at gmail.com
Wed Oct 17 13:35:25 UTC 2007


On 10/17/07, Chris Johnson <johnson at nmr.mgh.harvard.edu> wrote:
>
>       Hey all.
>
>       I'm getting an idea of what AFR does here.  Thank you for the
> documentation pointers.  We have a specific implementation in mind of
> course.  I still haven't figured out if glusterfs can do this yet.
>
>       My definition of High Availability for a file system includes
> multiple points of access.  If an access node drops out I don't want my
> clients to even blink.  Here's what I want to do.
>
>       We have this monstrosity called a SATABeast.  A really big,
> drive-loaded RAID system with two DELLs frontending it, redundant paths,
> yada yada.  To the DELL nodes it just looks like a LOT of really big
> drives, /dev/sda to /dev/sdz in CentOS5.  It's important to note that
> the Beast looks identical from both DELL frontends.  It is identical,
> it's the same Beast.
>
>       What I'd like to do of course is to use the frontends in a
> failover capacity.  If one frontend fails for whatever reason I want
> the other to seamlessly take over the whole job so the clients don't
> even notice the loss.  And of course when the node comes back the load
> redistributes.  Doing AFR in this configuration would be great too.
> We've tried other forms of RAID and there are limitations as we all
> know.  We'd really like to find a way around them.
>
>       Can GlusterFS do this?  If so can someone provide me with client
> and server config files for this please?  Much appreciated.
>
>
Chris,

so both servers are accessing the same SATABeast and exporting the same
filesystem? If so, AFR is not what you are looking for: AFR would replicate
the files to both servers, and your shared back-end already does that. If
instead each server has its own iSCSI virtual disk, then AFR is what you are
looking for.

Right now AFR always reads from the first available server, so it does not
split the read traffic. This should change with GlusterFS 1.4 (with the HA
translator, though I'm not sure). But you can define two AFR volumes, each
with a different server as its first subvolume. It would look like this
(doing AFR on the client side):

=== BEGIN CLIENT SPEC FILE ===
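# sX-bY below means brick "bY" as exported by server X
# (172.16.0.1 is server 1, 172.16.0.2 is server 2)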
volume s1-b1
        type protocol/client
        option transport-type tcp/client
        option remote-host 172.16.0.1
        option remote-subvolume b1
        option transport-timeout 5
end-volume

volume s2-b1
        type protocol/client
        option transport-type tcp/client
        option remote-host 172.16.0.2
        option remote-subvolume b1
        option transport-timeout 5
end-volume

volume s1-b2
        type protocol/client
        option transport-type tcp/client
        option remote-host 172.16.0.1
        option remote-subvolume b2
        option transport-timeout 5
end-volume

volume s2-b2
        type protocol/client
        option transport-type tcp/client
        option remote-host 172.16.0.2
        option remote-subvolume b2
        option transport-timeout 5
end-volume

volume s1-bn
        type protocol/client
        option transport-type tcp/client
        option remote-host 172.16.0.1
        option remote-subvolume bn
        option transport-timeout 5
end-volume

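# AFR reads from the first listed subvolume, so afr1 prefers server 1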
volume afr1
        type cluster/afr
        subvolumes s1-b1 s2-b1
        option replicate *:2
end-volume

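# afr2 reverses the subvolume order, so reads for this half prefer
# server 2, splitting the read load between the two front-ends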
volume afr2
        type cluster/afr
        subvolumes s2-b2 s1-b2
        option replicate *:2
end-volume

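# unify stitches afr1 and afr2 into one filesystem; s1-bn holds the
# namespace (metadata) and the rr scheduler places new files round-robin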
volume unify
        type cluster/unify
        subvolumes afr1 afr2
        option namespace s1-bn
        option scheduler rr
        option rr.limits.min-free-disk 5
end-volume
=== END CLIENT SPEC FILE ===

With this you have the replication you need, plus the read traffic is shared
between the front-end storage servers. Write performance is always limited by
the slower of the two servers, since every write goes to both.
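
For completeness, here is a rough sketch of what the matching server-side
spec on 172.16.0.1 could look like (it exports b1, b2 and the bn namespace
brick; 172.16.0.2 would be the same minus bn). The /export/* directories are
just placeholders for wherever your bricks actually live:

=== BEGIN SERVER SPEC FILE (sketch, 172.16.0.1) ===
volume b1
        type storage/posix
        option directory /export/b1
end-volume

volume b2
        type storage/posix
        option directory /export/b2
end-volume

# namespace brick used by unify, only needed on this server
volume bn
        type storage/posix
        option directory /export/bn
end-volume

volume server
        type protocol/server
        option transport-type tcp/server
        subvolumes b1 b2 bn
        option auth.ip.b1.allow *
        option auth.ip.b2.allow *
        option auth.ip.bn.allow *
end-volume
=== END SERVER SPEC FILE ===

You will probably want to replace the * in the auth.ip options with your
clients' addresses.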

Please pay attention to the fact that you still have a serious single point
of failure: a fire, electrical problems, human error and many other things
can happen to that single SATABeast. I would have two; I always pair
everything. That said, I really like the SATABeast and having 42 disks in
only 4U.

What do you think?

Best regards,
Daniel


