[Gluster-devel] replication client or server side

Tue Oct 16 17:35:37 UTC 2007

Vincent Régnard wrote:
> Hi all,
>
> I am presently re-thinking the way we are using glusterfs 1.3.5 here. We
> are doing replication (*3 with 3 bricks) on client side to produce a
> small HA cluster. We are planning to extend the brick number. Drawing
> that again and looking at some examples on the wiki doing it another way
> (Kritical's tutorial), we are wondering whether doing the replication
> (AFR) on the server side (glusterfsd) would be more suitable than doing
> it on the client side? Have you any experience or remarks on that? Does
> this have a performance impact in your opinion?
>
> If replication is transferred to the server side, we'll have to use
> unification on the client side to achieve HA (and then obtain active
> self-heal?). Is this latter configuration reasonable?
>
>
> Present configuration:
>
> Client stack:    FUSE
>         PERFORMANCE TRANSLATORS (write-b/io-cache/io-thread)
>         AFR
>         CLIENT TRANSPORT
>
> Server stack:    SERVER TRANSPORT
>         PERFORMANCE TRANSLATOR (io-thread)
>         POSIX LOCKS FEATURE
>         POSIX STORAGE
>
>
> Planned configuration:
>
> Client stack:    FUSE
>         PERFORMANCE TRANSLATORS (write-b/io-cache/io-thread)
>         UNIFY
>         CLIENT TRANSPORT
>
> Server stack:    SERVER TRANSPORT
>         PERFORMANCE TRANSLATOR (io-thread)
>         AFR
>         POSIX LOCKS FEATURE
>         POSIX STORAGE
>
> Vincent

I find that using AFR and Unify from the client yields a more robust
config with respect to high availability, but using Unify on the client
complicates the configs and file storage (it requires splitting each
server's share into a main part and an AFR part).  It may be possible to
overload the AFR definitions to get around this; I haven't tried that
yet.  It's also possible that tweaking the timeout values so that the
server times out before the client would yield a more stable config.

Performance-wise, moving AFR to the server side will allow you to
structure the network for more performance, for example by implementing
a secondary network to handle all the AFR traffic.  As it is now (with
you doing everything on the client), your writes are constrained to 1/3
of the total available network bandwidth, since you have to write each
file 3 times.  By moving the AFR to the server and implementing a second
network to carry the AFR traffic, you could increase your theoretical
network write performance by 50% (if the AFR network is the same speed
as the client network connection, and you want data stored on 3
servers).
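
A rough sketch of the arithmetic behind those figures, in Python.  It
assumes the network link is the only bottleneck, and that with
server-side AFR the server receiving the write forwards the other two
copies one after another over a single AFR link.  The function names
are made up for illustration and the model ignores disk speed and
protocol overhead, so treat it as a back-of-the-envelope check rather
than a benchmark.

```python
from fractions import Fraction


def client_side_afr(link, replicas):
    """Write rate when the client pushes every replica over its own link."""
    return Fraction(link) / replicas


def server_side_afr(client_link, afr_link, replicas):
    """Write rate when the client writes once and the receiving server
    forwards the remaining copies over a dedicated AFR network.

    Assumption: the forwarded copies share one AFR link, so the slower
    of the two stages bounds the sustained rate.
    """
    forward = Fraction(afr_link) / (replicas - 1)
    return min(Fraction(client_link), forward)


# Link speeds are normalised to 1; both networks run at the same speed.
before = client_side_afr(1, 3)        # 1/3 of the link
after = server_side_afr(1, 1, 3)      # 1/2 of the link
gain = (after - before) / before      # relative improvement

print(before, after, gain)            # 1/3 1/2 1/2
```

Under these assumptions the write rate goes from 1/3 to 1/2 of the link
speed, which is the 50% increase mentioned above.  A faster AFR network
would raise the second figure until the client link becomes the limit.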

It seems like every other day I think of a new way to set up glusterfs.  
I have to say this is the most fun I've had with a software product in 
some time.  ;)

-- 

-Kevan Benson
-A-1 Networks