[Gluster-users] AFR w/ RRDNS failover - does it work or not ? (WAS: simple AFR setup, one server crashes, entire cluster becomes unusable ?)
Daniel Maher
dma+gluster at witbe.net
Tue Dec 9 09:29:42 UTC 2008
Keith Freedman wrote:
> the issue isn't reliability, it's availability.
>
> if a client only talks to one server and that server goes down then the
> client has nothing to 'fail over' to. however, if the client talks to
> both servers then if one goes down it'll keep talking to the other one.
Either the clients will honour the RRDNS and pick another server, or
they won't. Unfortunately, the two sources below give contradictory
answers. To wit :
From the « Gotcha » page :
http://www.gluster.org/docs/index.php/AFR_(Automatic_File_Replication)_-_Things_to_keep_in_mind_and_gotchas
Applies to server side
[...]
"The clients connect only to 1 server. You would need to implement some
kind of load balancing or something either with round robin DNS [...]"
"If you have client1 connected to server1 and client2 connected to
server2, and then server2 goes down, so does client2. The cluster also
becomes unavailable."
OK, that seems like a straightforward enough statement. However, if we
look back through the mailing list archives, we find a statement
from Mr. Anand Avati which suggests exactly the opposite :
http://lists.nongnu.org/archive/html/gluster-devel/2008-04/msg00007.html
[...]
"Or, put another way, if ClientA (by chance) resolves
roundrobin.gluster.local to 192.168.252.1, but .1 is currently down -
what happens ?
it will attempt on .2, and if that fails (or disconnects after a while),
it will attempt on .3, and once all the entries are used 'once', it will
do a fresh dns query. it does not honor dns refresh timeouts (yet)."
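To make the quoted behaviour concrete, here is a minimal sketch in Python
of the retry logic as described: try each resolved address once, in order,
and only do a fresh DNS query after every entry has been used. This is an
illustration only, not GlusterFS code; the function names, the max_rounds
limit, and the timeout are all assumptions of mine.

```python
import socket
from typing import Callable, List, Optional


def resolve_all(hostname: str, port: int) -> List[str]:
    """Return the distinct IPv4 addresses the resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    addrs: List[str] = []
    for info in infos:
        addr = info[4][0]
        if addr not in addrs:
            addrs.append(addr)
    return addrs


def tcp_connect(addr: str, port: int, timeout: float = 3.0) -> bool:
    """Hypothetical liveness check: can we open a TCP connection at all?"""
    try:
        with socket.create_connection((addr, port), timeout=timeout):
            return True
    except OSError:
        return False


def connect_with_failover(
    hostname: str,
    port: int,
    resolve: Callable[[str, int], List[str]] = resolve_all,
    try_connect: Callable[[str, int], bool] = tcp_connect,
    max_rounds: int = 2,  # assumption: give up after this many DNS queries
) -> Optional[str]:
    """Return the first address that accepts a connection, or None."""
    for _ in range(max_rounds):
        # One DNS query per round; each returned entry is tried exactly once.
        for addr in resolve(hostname, port):
            if try_connect(addr, port):
                return addr
        # All entries used once: loop around and do a fresh DNS query.
    return None
```

If clients really behave like this, a client that resolves to a dead .1
should end up on .2 rather than becoming unavailable, which is the
opposite of what the « Gotcha » page says.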
The remaining basic question then is this : does AFR w/ RRDNS failover
work or not ? If it does, then the « Gotcha » page should be updated,
/and/ further investigation is required to determine why it failed to
operate as advertised in my environment. If it does /not/, then the «
Gotcha » page should be updated, and the wiki page I wrote (based
largely on the suggestions of the developers) should likely be scrapped. :P
As always, thank you all for your continued discourse !
--
Daniel Maher <dma+gluster AT witbe DOT net>