[Gluster-devel] Splitbrain Resolution

Fri Apr 18 21:57:13 UTC 2008

On Sat, Apr 19, 2008 at 8:32 AM, Gordan Bobic <gordan at bobich.net> wrote:

>  I'm not sure that it is that big a drawback. Using HA or RHCS to fail over
> the IP resources sounds like a pretty standard way to implement fail-over.
> The major drawback is lack of explicit load-balancing, but having said that,
> if you mount by hostname you'd still get round-robin DNS load balancing,
> which is probably good enough.

When AFR is loaded on the client and an AFR subvolume goes offline,
any open files on the AFR volume are not affected; the failover is
transparent. This is not the case with the IP-based failover
approaches (HA or RHCS). As I said, it depends on your use case, and
that could be an acceptable tradeoff. If you load AFR on the server
and that server goes offline, any clients with files open on it will
get a "Transport endpoint not connected" or similar error, and they
will have to reopen the files.
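
To make the "reopen the files" part concrete, here is a rough sketch of
what an application on such a client might have to do. It assumes the
error surfaces as errno ENOTCONN (the usual source of the "Transport
endpoint is not connected" message on Linux) and that the failover
completes within a few retries. The function name, its parameters, and
the retry policy are all made up for illustration; none of this is part
of GlusterFS.

```python
import errno
import time


def read_with_reopen(path, offset=0, size=4096, retries=3, delay=1.0, opener=open):
    """Read `size` bytes at `offset`, reopening the file on ENOTCONN.

    Hypothetical helper: `opener` defaults to the built-in open() and is
    only a parameter so the failure can be simulated without a real mount.
    """
    for attempt in range(retries):
        try:
            # Open a fresh handle on every attempt; a handle that was open
            # when the server went away is assumed to be unusable.
            with opener(path, "rb") as f:
                f.seek(offset)
                return f.read(size)
        except OSError as e:
            # Anything other than ENOTCONN is not a failover symptom.
            if e.errno != errno.ENOTCONN:
                raise
            # Give up once the last attempt has failed.
            if attempt == retries - 1:
                raise
            # Otherwise wait for the failover to finish and try again.
            time.sleep(delay)
```

With AFR loaded on the client, none of this should be needed while at
least one subvolume stays reachable, which is the tradeoff described
above.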

>  Yes, but if you consider that in a typical setup you have an order of
> magnitude more client nodes than server nodes, that starts to add up pretty
> quickly.

Well, that really does depend on what you are using it for. If you are
building a compute cluster, then that probably is not the case.

>  Hmm... Now if writes could happen over multicast, that would be pretty
> cool, as the writes wouldn't scale inversely. But I'm guessing this isn't
> anywhere on the feature list (yet)...

Perhaps. I don't think it would be that easy or nice to implement, though.
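
For reference, the inverse scaling being discussed can be put in rough
numbers. This is an idealized back-of-the-envelope model, assuming a
client-side AFR sends one full copy of every write to each subvolume
over a single network link, and ignoring protocol overhead, latency and
disk speed. The function and parameter names are invented for the
example.

```python
def client_write_throughput(link_mbps, replicas, multicast=False):
    """Idealized per-client write throughput in Mbit/s.

    Assumes the client's link is the only bottleneck. With unicast, the
    client puts one copy per replica on the wire; with (hypothetical)
    multicast writes, it would put a single copy on the wire.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    copies_on_wire = 1 if multicast else replicas
    return link_mbps / copies_on_wire


# Gigabit link, two replicas: each write crosses the link twice.
print(client_write_throughput(1000, 2))                  # 500.0
# Same link, four replicas.
print(client_write_throughput(1000, 4))                  # 250.0
# If writes could be multicast, the replica count would drop out.
print(client_write_throughput(1000, 4, multicast=True))  # 1000.0
```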

>
>  Gordan

-- Samuel