[Gluster-devel] request for comments

Tue May 1 16:58:42 UTC 2007

Same here, and the loadbalance sounds like a great addition.

Thanks,

Brent

On Tue, 1 May 2007, Majied Najjar wrote:

> That makes all the sense in the world to me to have replication on the 
> server side.  I especially like the idea about network failover and not 
> having to depend on client mounts to maintain consistency on the server 
> side.
>
> Majied
>
> On Tue, 1 May 2007 09:05:28 -0700
> Anand Avati <avati at zresearch.com> wrote:
>
>>
>> here is a design proposal about some changes to afr and related.
>> currently AFR is totally handled on the client side, where the client
>> does the replication as well as failover. the AFR translator
>> essentially is doing _two_ features - 1. replication 2. failover.
>>
>> In view of the recent race condition discussed about AFR in the mailing
>> list (two clients writing to the same region running into a race while
>> writing to second mirror) and for other benefits mentioned below, the
>> proposal is to split replication and failover into two separate
>> translators. replication is meant to be loaded on the server side
>> while failover alone is meant to be loaded on the client side.
>>
>> imagine grouping your storage cluster into pairs or triplets or
>> quadruplets. the AFR translator will be loaded to form these groups,
>> but on the server side. each member of the (say) triplet will load
>> AFR with one child as the storage/posix and the other two children as
>> protocol/clients for the auxiliary export of the remaining two
>> servers. thus the effect is,
>>
>> * when you write to one server, it goes to all the three (redundancy)
>> * and, you can write via any server (used for failover)
>>
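
To make the triplet layout concrete, here is a minimal in-memory sketch in
Python. It is only a model of the idea, not GlusterFS code: PosixStore,
Replicate and build_group are hypothetical names, and I am assuming the
auxiliary export of each server exposes its plain storage (not its AFR
volume), so that replicated writes do not loop back.

```python
class PosixStore:
    """Stand-in for a storage/posix volume: file contents kept in memory."""

    def __init__(self):
        self.files = {}

    def write(self, path, offset, data):
        buf = bytearray(self.files.get(path, b""))
        if len(buf) < offset:
            # Pad with zero bytes when writing past the current end of file.
            buf.extend(b"\0" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data
        self.files[path] = bytes(buf)
        return len(data)


class Replicate:
    """Stand-in for server-side AFR: one local child plus remote children."""

    def __init__(self, local, remotes):
        self.children = [local] + list(remotes)

    def write(self, path, offset, data):
        # A single agent applies the write to every child, in a fixed order.
        for child in self.children:
            child.write(path, offset, data)
        return len(data)


def build_group(n):
    """Build n servers; each replicates to its own store and the n-1 others."""
    stores = [PosixStore() for _ in range(n)]
    servers = [
        Replicate(stores[i], stores[:i] + stores[i + 1:])
        for i in range(n)
    ]
    return stores, servers
```

With stores, servers = build_group(3), a write through any one of the three
servers ends up in all three stores, which is the pair of properties listed
above.
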
>> under normal conditions, the failover translator on the client uses
>> the 'primary child' (the non-auxiliary export server) and operations
>> are performed only on that child. the server side takes care of
>> replication. when that server goes down, failover detects the broken
>> link and uses the aux export.
>>
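
The client-side behaviour described above could be sketched roughly like
this. Everything here is hypothetical (LinkDown, Child, Failover are names
made up for illustration), and I am assuming the translator simply stays on
the auxiliary child after a failure; the proposal does not say whether or
when it should fall back to the primary.

```python
class LinkDown(Exception):
    """Raised by a child when its connection is broken."""


class Child:
    """Stand-in for a protocol/client connection to one export."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.log = []

    def write(self, path, data):
        if not self.up:
            raise LinkDown(self.name)
        self.log.append((path, data))
        return len(data)


class Failover:
    """Send every call to the active child; switch to the next on LinkDown."""

    def __init__(self, children):
        self.children = list(children)
        self.active = 0  # index 0 is the 'primary child'

    def call(self, op, *args):
        # Try each child at most once, starting from the active one.
        for _ in range(len(self.children)):
            child = self.children[self.active]
            try:
                return getattr(child, op)(*args)
            except LinkDown:
                self.active = (self.active + 1) % len(self.children)
        raise LinkDown("all children are down")
```

The same class would cover the two-links-to-one-server case: the children
would then be two connections to the same export over different networks.
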
>> advantages:
>>
>> 1. since a file is replicated by a single agent, there are no
>> potential race conditions (most important)
>>
>> 2. the failover abstraction works for non-AFR scenarios also. you can
>> use the failover translator to failover between two network links to
>> the same server. (generally use infiniband, but failover to gigabit
>> totally seamlessly, even preserving open FDs)
>>
>> 3. the client writes to only one server, a tremendous saving of
>> bandwidth on the link between client and server.
>>
>> 4. self-heal checks can be performed in a more deterministic manner
>> since they are done by the 'primary child' server. there are no
>> questions like 'what if two children try to heal together' or 'what if
>> no client is mounted at all'
>>
>> 5. extensions to AFR (like very-lazy replication, on close()) will be
>> a lot easier. the client submits a write to any server and forgets.
>>
>> 6. it becomes easier to implement 'transaction replay' kind of features
>> by preserving unwritten write() data with offset etc. on the server itself
>> (doing such things with AFR on the client is unreliable since the client
>> can always umount)
>>
>> 7. on the client side failover is not the only way, even a 'loadbalance'
>> translator will be a good choice (which takes care of not scheduling
>> calls to the link which is down). thus AFR will work hand-in-hand with
>> failover and/or loadbalancing, however the user prefers. (of course
>> the loadbalance will work with its own abstraction where you can use
>> it just to loadbalance network links (remember somebody asking this on
>> the mailing list))
>>
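
For item 7, one possible scheduling policy is plain round-robin that skips
links reported as down. This is only a sketch of that one policy: the
LoadBalance class, the is_up callback and the link names in the usage note
are all hypothetical, and the proposal does not specify which policy the
translator would actually use.

```python
class LoadBalance:
    """Round-robin over links, never scheduling a call on a link that is down."""

    def __init__(self, links):
        self.links = list(links)
        self.next = 0  # index of the link to try first on the next call

    def pick(self, is_up):
        n = len(self.links)
        for step in range(n):
            idx = (self.next + step) % n
            if is_up(self.links[idx]):
                # Resume after the chosen link so the load keeps rotating.
                self.next = (idx + 1) % n
                return self.links[idx]
        raise RuntimeError("no link is up")
```

For example, with links ["ib0", "eth0", "eth1"] and "eth0" down, successive
pick() calls alternate between "ib0" and "eth1".
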
>> my instinct tells me there are more advantages i can list if i think
>> it over more.
>>
>> i feel failover and loadbalancer as a generic layer will add a lot of
>> power and possibility for creative use, and AFR leveraging that fits
>> in nicely overall.
>>
>> suggestions/comments ?
>>
>>
>> avati
>>
>> --
>> ultimate_answer_t
>> deep_thought (void)
>> {
>>   sleep (years2secs (7500000));
>>   return 42;
>> }
>>
>>
>> _______________________________________________
>> Gluster-devel mailing list
>> Gluster-devel at nongnu.org
>> http://lists.nongnu.org/mailman/listinfo/gluster-devel