[Gluster-devel] HA, GlusterFS server protocol and LDirectorD
Majied Najjar
majied.najjar at nationalnet.com
Wed Sep 12 02:57:56 UTC 2007
Hi Geoff,
Actually, I was thinking of something a bit simpler:
/etc/ha.d/haresources:
somehost \
IPaddr2::10.24.0.254/24/eth1/10.24.0.255
You could run the daemon on both machines at the same time. The only
thing that would designate a host as "master" would be the failover IP,
which in this case would be "10.24.0.254".
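
To make it concrete, the whole setup would look something like this (the
client spec is from memory, so treat the exact option names as
assumptions - the point is that clients connect to the failover IP, not
to either real host):

/etc/ha.d/haresources (identical on both nodes; only the IP moves):

somehost \
        IPaddr2::10.24.0.254/24/eth1/10.24.0.255

Client spec file:

volume client
  type protocol/client
  option transport-type tcp/client
  # point at the failover IP rather than at a real server
  option remote-host 10.24.0.254
  option remote-subvolume brick
end-volume

Both nodes keep glusterfsd running all the time; Heartbeat only decides
which node currently answers on 10.24.0.254.
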
Majied
Geoff Kassel wrote:
> Hi Majied,
>
>
>> With Heartbeat, the only thing required would be a "failover IP" handled
>> between the two Heartbeat servers running the glusterfsd process using
>> AFR. When a glusterfsd server running Heartbeat goes down, the other
>> Heartbeat server would take over the failover IP and continue service to
>> the glusterfs clients.
>>
>
> I think I see what you're getting at here - haresources on both machines
> something like:
>
> node1 192.168.0.1 glusterfsd
> node2 192.168.0.2 glusterfsd
>
> Am I correct?
>
> I haven't had much luck getting services running through Heartbeat in the
> past (Gentoo's initscripts aren't directly compatible with Heartbeat's
> status exit level requirements), but looking at the glusterfsd initscript,
> it seems like it should work with Heartbeat (see the wrapper sketch below).
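>
> In case it helps anyone else on Gentoo, a wrapper along these lines in
> /etc/ha.d/resource.d/ should give Heartbeat the LSB-style status
> semantics it expects (exit 0 when running, 3 when stopped) - an
> untested sketch:
>
> #!/bin/sh
> # glusterfsd wrapper for Heartbeat: delegate start/stop to the
> # distro initscript, answer status with LSB exit codes
> case "$1" in
>   start|stop)
>     exec /etc/init.d/glusterfsd "$1"
>     ;;
>   status)
>     if pgrep -x glusterfsd >/dev/null 2>&1; then
>       echo "glusterfsd is running"
>       exit 0
>     else
>       echo "glusterfsd is stopped"
>       exit 3
>     fi
>     ;;
>   *)
>     echo "Usage: $0 {start|stop|status}" >&2
>     exit 1
>     ;;
> esac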
>
> I'll give this a try, and see how I go. Thanks for the idea!
>
>
> I know this may be a bit off topic for the list (if there were a gluster-users
> list I'd move this there), but for anyone who's curious - I'm still looking for
> a solution to this myself - here's how I was trying to configure Heartbeat +
> LDirectorD:
>
> /etc/ha.d/haresources:
>
> node1 ldirectord::ldirectord.cf LVSSyncDaemonSwap::master \
> IPaddr2::192.168.0.3/24/eth0/192.168.0.255
>
> # node2 is a live backup for failover on the Linux Virtual Server daemon
> # if node1 goes down
>
> /etc/ha.d/ldirectord.cf:
>
> checktimeout=1
> checkinterval=1
> autoreload=yes
> logfile="/var/log/ldirectord.log"
> quiescent=yes
>
> virtual=192.168.0.3:6996
>         real=192.168.0.1:6996 gate 1
>         real=192.168.0.2:6996 gate 1
>         checktype=connect
>         scheduler=rr
>         protocol=tcp
>
>
> The real glusterfsd servers are running on 192.168.0.1 (node1) and 192.168.0.2
> (node2), and clients connect to the virtual IP address, 192.168.0.3.
>
> However, after Heartbeat starts, ipvsadm does not show either real server
> as up, even though telnet to both real servers on port 6996 connects. If I
> configure a fallback (say, to 127.0.0.1:6996), I only ever get the fallback
> through 192.168.0.3, and if I stop that machine, any connections through
> 192.168.0.3 stop too.
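>
> For reference, here's how I've been inspecting the LVS state (standard
> ipvsadm invocations; note that with quiescent=yes a failed real server
> stays in the table with weight 0 rather than disappearing):
>
> ipvsadm -L -n          # virtual/real server table and weights
> ipvsadm -L -n --stats  # per-server packet counters, to see if traffic flows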
>
> Heartbeat doesn't see that as a failure condition - with or without the
> fallback node - so the IP address and LVS won't fail over to the other node.
> I can't see a way to configure Heartbeat to do so either. Hence my question
> about finding a way to get LDirectorD to do the detection in a more robust
> request-response manner.
>
> Majied, do you or any one else on the list have a suggestion for what I may
> have missed?
>
> Thank you all in advance for any and all suggestions (including RTFM again :)
>
> Kind regards,
>
> Geoff Kassel.
>
> On Wed, 12 Sep 2007, Majied Najjar wrote:
>
>> Hi,
>>
>>
>> This is just my two cents. :-)
>>
>>
>> Instead of LDirectorD, I would recommend just using Heartbeat.
>>
>>
>> With Heartbeat, the only thing required would be a "failover IP" handled
>> between the two Heartbeat servers running the glusterfsd process using
>> AFR. When a glusterfsd server running Heartbeat goes down, the other
>> Heartbeat server would take over the failover IP and continue service to
>> the glusterfs clients.
>>
>>
>> Granted, this isn't load balancing between glusterfsd servers - it only
>> handles failover....
>>
>>
>> Majied Najjar
>>
>> Geoff Kassel wrote:
>>
>>> Hi all,
>>> I'm trying to set up LDirectorD (through Heartbeat) to load-balance
>>> and fail over client connections to GlusterFS server instances over TCP.
>>>
>>> First of all, I'm curious to find out if anyone else has attempted
>>> this, as I've had no luck with maintaining client continuity with
>>> round-robin DNS in /etc/hosts and client timeouts, as advised in previous
>>> posts and tutorials. The clients just go dead with 'Transport endpoint is
>>> not connected' messages.
>>>
>>> My main problem is that LDirectorD's connection test doesn't seem to
>>> recognize that a GlusterFS server is functional, so I can't detect when a
>>> server goes down. LDirectorD does offer a request-response method of
>>> liveness detection, but the GlusterFS protocol is unfortunately too
>>> lengthy to express in the configuration file. (The request needs to fit
>>> on a single line, it seems - see the sketch below.)
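>>>
>>> To illustrate, this is roughly the check I'd like to be able to write.
>>> The PING/PONG strings are invented - this assumes both a one-line
>>> command in glusterfsd and that ldirectord's request/receive directives
>>> could be pointed at a raw TCP service:
>>>
>>> virtual=192.168.0.3:6996
>>>         real=192.168.0.1:6996 gate 1
>>>         real=192.168.0.2:6996 gate 1
>>>         checktype=negotiate
>>>         request="PING"
>>>         receive="PONG"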
>>>
>>> I'm wondering if there's a simple request-response connection test I
>>> haven't found yet that I can use to check for liveness of a server over
>>> TCP. If there isn't... could I make a feature request for such? Anything
>>> that can be done manually over a telnet connection to the port would be
>>> perfect.
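>>>
>>> With such a command, even a trivial shell probe would do - a
>>> hypothetical one-liner using netcat (PING/PONG again invented):
>>>
>>> printf 'PING\r\n' | nc -w 2 192.168.0.1 6996 | grep -q PONG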
>>>
>>> Thank you for GlusterFS, and thanks in advance for your time and
>>> effort in answering my question.
>>>
>>> Kind regards,
>>>
>>> Geoff Kassel.
>>>
>>>
>
>