[Gluster-devel] Splitbrain Resolution

Fri Apr 18 20:32:17 UTC 2008

Samuel Douglas wrote:
> On Sat, Apr 19, 2008 at 5:54 AM, Reinis Rozitis <r at roze.lv> wrote:
>>  That way the afr-copying is done on the "server" level..
>>
>>  Only drawback in this is your clients can mount only one server so if one
>> goes down you have to remount or implement some other HA method (like DNS or
>> heartbeats).
> 
> This is a fairly serious drawback, but it depends what you want AFR
> for. In our cluster system, we are probably using AFR with two
> children to provide file replication; until the HA translator is
> available, these are loaded on the client.

I'm not sure that it is that big a drawback. Using HA or RHCS to fail 
over the IP resources sounds like a pretty standard way to implement 
fail-over. The major drawback is lack of explicit load-balancing, but 
having said that, if you mount by hostname you'd still get round-robin 
DNS load balancing, which is probably good enough.

> As Gordan said, yes, the bandwidth does increase n-fold, but only for
> writes; and if you have gigabit ethernet and a decent switch this
> should not be too much of a problem. The data still has to go to all
> the replicated servers, so it is just the client's interconnect that
> has higher utilisation. Unless your workload is very heavy on writes,
> this shouldn't be a major problem.

Yes, but if you consider that in a typical setup you have an order of 
magnitude more client nodes than server nodes, that starts to add up 
pretty quickly.
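
To put rough numbers on the bandwidth argument above, here is a minimal back-of-the-envelope sketch. It is only a model, not a measurement: the function names, the client count, the replica count and the per-client write rate are all made-up assumptions, and it assumes that in the server-side case clients are spread evenly over the servers.

```python
def client_side_afr(clients, replicas, write_mb_s):
    """AFR loaded on the client: each client sends every write to all replicas itself."""
    return {
        "client_tx": replicas * write_mb_s,  # each client uplink carries n copies
        "server_rx": clients * write_mb_s,   # every server receives every write
        "server_tx": 0.0,                    # servers do not forward anything
    }


def server_side_afr(clients, replicas, write_mb_s):
    """AFR loaded on the server: a client sends each write once, and the
    receiving server forwards it to the other replicas."""
    from_clients = clients * write_mb_s / replicas  # clients spread evenly (assumption)
    from_peers = clients * write_mb_s * (replicas - 1) / replicas
    return {
        "client_tx": write_mb_s,                    # each client uplink carries 1 copy
        "server_rx": from_clients + from_peers,     # still every write, via two paths
        "server_tx": from_clients * (replicas - 1), # forwarding to the other replicas
    }


if __name__ == "__main__":
    # Hypothetical cluster: 20 clients, 2 replicas, 5 MB/s of writes per client.
    for model in (client_side_afr, server_side_afr):
        print(model.__name__, model(clients=20, replicas=2, write_mb_s=5.0))
```

Under these assumptions each server receives the same amount of data either way. What changes is where the extra copies travel: on every client uplink in the client-side case, or on the server-to-server links in the server-side case.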

> You can also use write-behind to aggregate writes, which can help
> avoid unnecessary overheads caused by making many small writes to the
> AFR volumes.
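
The idea of aggregating small writes can be illustrated with a minimal sketch. This is not how the write-behind translator is actually implemented; the class name, the `sink` callback and the `aggregate_size` parameter are hypothetical and only show the general technique of buffering small writes and passing them on as fewer, larger ones.

```python
class WriteAggregator:
    """Buffer small writes and hand them to `sink` in larger chunks."""

    def __init__(self, sink, aggregate_size):
        self.sink = sink                      # callable that receives one bytes object
        self.aggregate_size = aggregate_size  # flush once this many bytes are buffered
        self._buf = bytearray()

    def write(self, data):
        self._buf += data
        if len(self._buf) >= self.aggregate_size:
            self.flush()

    def flush(self):
        # Send whatever is buffered as a single write, then start over.
        if self._buf:
            self.sink(bytes(self._buf))
            self._buf.clear()


if __name__ == "__main__":
    sent = []
    agg = WriteAggregator(sent.append, aggregate_size=10)
    for _ in range(4):
        agg.write(b"abc")  # four 3-byte writes
    agg.write(b"de")
    agg.flush()            # push out the remainder
    print([len(chunk) for chunk in sent])
```

With AFR underneath, each call to `sink` would be replicated to every child, so fewer and larger calls mean fewer round trips per replica.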

Hmm... Now if writes could happen over multicast, that would be pretty 
cool, as the client would send each write only once, however many 
replicas there are, so write bandwidth wouldn't grow with the replica 
count. But I'm guessing this isn't anywhere on the feature list (yet)...

Gordan