[Gluster-devel] HA failover test unsuccessful (inaccessible mountpoint)

Guido Smit guido at comlog.nl
Thu Apr 3 07:02:51 UTC 2008


Daniel,

From what I can see, your setup should work. Which version are you using?

Daniel Maher wrote:
> Hello all,
>
> First off, thanks for the great feedback I received over the course
> of the day so far.  I set up a four-machine test network (two file
> servers, two clients) in order to evaluate Gluster for an upcoming
> upgrade / consolidation project.
>
> I based my configuration on the HA w/ 1.3 document on the wiki:
> http://www.gluster.org/docs/index.php/GlusterFS_1.3_High_Availability_Storage_with_GlusterFS
>
> Setting up the servers and clients was easy, and it worked immediately
> (which is quite a change from the usual problems one has with
> network-aware file systems).  Unfortunately, it failed on one crucial
> test: failover.
>
> Briefly stated, when I physically unplugged one of the two (mirrored)
> file servers from the network, the mountpoint on the clients became
> completely inaccessible.  Attempting to change to the directory, modify
> files, or even list the contents of the parent directory resulted in a
> hung terminal session.  This state remained until the unplugged file
> server was reattached to the network.
>
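
While a server is unplugged, it can help to probe the mountpoint with a
timeout instead of from an interactive shell, so that the test itself
does not hang. A minimal sketch in Python, assuming a Linux client with
`ls` on the PATH; the function name and the 5-second timeout are
placeholders, not taken from the configuration in this thread:

```python
import subprocess


def mount_is_responsive(path, timeout_s=5.0):
    """Return True if listing `path` succeeds within `timeout_s` seconds.

    Hypothetical helper: it only reports whether a directory listing
    completed in time, not why it failed or hung.
    """
    # Run the listing in a child process so a hung mount cannot block us.
    proc = subprocess.Popen(
        ["ls", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        # Exit status 0 means the listing completed successfully.
        return proc.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        # Best effort only: a process stuck in uninterruptible I/O may
        # ignore the kill, so we deliberately do not wait for it here.
        proc.kill()
        return False
```

Calling it in a loop on the client's mountpoint before, during, and
after unplugging the server would show how long the mount stays
unresponsive, which is useful detail to include in a report.
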
> I was under the impression that this would not be the case; indeed,
> from what I've read in the documentation, the mountpoint should have
> continued to be accessible (since the other file server was still alive
> and well).  Ideally, in an HA environment, having one of the
> storage nodes disappear should /not/ bring down the entire storage
> cluster.
>
> I'm curious to know if this is the expected behaviour (which I doubt),
> or if I've simply missed something in my configuration which would
> cause this (more likely ;) ).
>
> And now, for the gritty details...
>
> The four machines each have two network interfaces; eth0 is connected
> to the "general" network (192.168.0.*), and eth1 is connected to a
> physically distinct gigabit network (10.0.0.*), upon which only
> gluster-related interactions are meant to travel.
>
> A DNS zone called "storage-net.gfs" was set up, with each of the
> machines being assigned A-records within this zone (10.0.0.* /
> dfs[ABCD].storage-net.gfs).  dfs[AB] are the clients, and dfs[CD] are
> the servers.  Finally, "cluster.storage-net.gfs" was assigned
> round robin-style to dfs[CD] (again, as per the documentation).
>
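
One thing worth confirming from each client is that the round-robin
name really resolves to both servers. A minimal sketch in Python,
assuming the clients use the resolver that serves the storage-net.gfs
zone; the function name and the injectable `resolver` parameter are
illustrative choices, not part of any GlusterFS tooling:

```python
import socket


def resolve_a_records(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses for `hostname`.

    `resolver` defaults to socket.getaddrinfo; it is a parameter only so
    the function can be exercised without a live DNS server.
    """
    # Restrict the lookup to IPv4 stream sockets to get one entry per A record.
    infos = resolver(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})
```

With the setup described above, resolve_a_records("cluster.storage-net.gfs")
would be expected to return two addresses, one each for dfsC and dfsD.
If only one comes back, the round-robin record itself is the first
thing to check.
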
> A graphical overview of the test network may be interesting:
> http://tinypic.info/files/xhvyldlesd8igvjt8yl1.png
>
> As I noted above, I followed the HA document to create both the server
> and client configurations.  The server configuration:
> http://pastebin.ca/967749
>
> And the client configuration:
> http://pastebin.ca/967754
>

-- 
Met vriendelijke groet,

Guido Smit
ComLog B.V.

Televisieweg 133
1322 BE Almere
T. 036 5470500
F. 036 5470481


