simple AFR setup, one server crashes, entire cluster becomes unusable ?

Keith Freedman freedman at FreeFormIT.com
Tue Dec 9 11:11:57 UTC 2008

At 02:47 AM 12/9/2008, Stas Oskin wrote:
>What about using Wackamole and server side AFR?
>allows to set a P2P kind of fault tolerance, where remaining server 
>would take the IP of the crashed one. Then the client could continue 
>working with the remaining server.
>What do you think about this?

I think this would likely be fine.  The client would time out and then 
try to reconnect, at which point it would connect to the other server.
Server-side AFR also keeps the clients out of the replication process, 
which seems better to me.

>Also, can someone provide more info about server side - I remember I 
>only seen some config examples, but never any info how it actually works.

Here are my server configs:

volume home1
   type storage/posix                   # POSIX FS translator
   option directory /gluster/home        # Export this directory

volume posix-locks-home1
   type features/posix-locks
   option mandatory on
   subvolumes home1

## Reference volume "home2" from remote server
volume home2
   type protocol/client                   # Client translator (connects to the remote server)
   option transport-type tcp/client
   option remote-host       # IP address of remote host
   option remote-subvolume posix-locks-home1     # use posix-locks-home1 on remote host
   option transport-timeout 10

### Create automatic file replication
volume home
   type cluster/afr
   option read-subvolume posix-locks-home1
   subvolumes posix-locks-home1 home2

### Add network serving capability to above home.
volume server
   type protocol/server
   option transport-type tcp/server     # For TCP/IP transport
   subvolumes posix-locks-home1
   option auth.addr.posix-locks-home1.allow,

### I believe the following will do what you want.  It's not exactly 
### the same as mine, since I added the auth option for the clients 
### (192.168.1.x) to mount home, the AFR volume:
   option auth.addr.home.allow,, #
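
A matching client volfile might look roughly like the sketch below. It is 
untested and rests on a few assumptions: the volume name "home-client" and 
the address 192.168.1.100 are made up for illustration (the address stands 
in for whatever virtual IP Wackamole moves between the servers), and the 
server has to export the AFR volume "home" to the clients for this to work.

volume home-client
   type protocol/client
   option transport-type tcp/client
   option remote-host 192.168.1.100     # hypothetical virtual IP managed by Wackamole
   option remote-subvolume home         # the AFR volume on the server
   option transport-timeout 10          # same timeout as the server-to-server link above
end-volume

Because the client only ever talks to the one address, it does not need to 
know there are two servers; when the IP moves, the reconnect after the 
timeout should land on the surviving server.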

