[Gluster-users] One node goes offline, the other node can't see the replicated volume anymore

Greg Scott GregScott at infrasupport.com
Sat Jul 13 23:32:45 UTC 2013


Ok – starting on it now.  On this question:

➢ lastly, do you have a loopback interface (lo) on 127.0.0.1 and is localhost defined in /etc/hosts?

Yes.

[root at chicago-fw1 ~]# ip addr show dev lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
[root at chicago-fw1 ~]#
[root at chicago-fw1 ~]# more /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
[root at chicago-fw1 ~]#

And

[root at chicago-fw2 ~]# ip addr show dev lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
[root at chicago-fw2 ~]#
[root at chicago-fw2 ~]# more /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
[root at chicago-fw2 ~]#

- Greg

From: Joe Julian [mailto:joe at julianfamily.org] 
Sent: Saturday, July 13, 2013 4:28 PM
To: Greg Scott; 'gluster-users at gluster.org'
Subject: Re: [Gluster-users] One node goes offline, the other node can't see the replicated volume anymore

Huh.. this was in my sent folder... let's try again.

There's something missing from this picture. The logs show that the client is connecting to both servers, but it only shows the disconnection from one and claims that it's not connected to any bricks after that.

Here's the data I'd like to have you generate:

unmount the clients
gluster volume set firewall-scripts diagnostics.client-log-level DEBUG
gluster volume set firewall-scripts diagnostics.brick-log-level DEBUG
systemctl stop glusterd.service
truncate the client, glusterd, and server logs
systemctl start glusterd
mount /firewall-scripts
Do your iptables disconnect
telnet $this_host_ip 24007 # report whether or not it establishes a connection
ls /firewall-scripts
wait 42 seconds
ls /firewall-scripts
Remove the iptables rule
ls /firewall-scripts
tar up the logs and email them to me.
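
For anyone repeating this, the steps above could be scripted roughly as follows. This is only a sketch, not something that was run for this thread. It is dry-run by default (it prints each command and executes nothing unless RUN=1 is set). The log directory /var/log/glusterfs, the HOST_IP placeholder, the archive name, and the function names prepare/probe/collect are all assumptions for illustration. The iptables rule itself is left to be added and removed by hand.

```shell
#!/bin/sh
# Sketch of the debug-capture procedure. Dry-run unless RUN=1 (run as root).
VOL=firewall-scripts
MNT=/firewall-scripts
LOGDIR=${LOGDIR:-/var/log/glusterfs}        # assumption: default log directory
HOST_IP=${HOST_IP:-192.0.2.1}               # placeholder: this host's own IP
ARCHIVE=${ARCHIVE:-/tmp/gluster-debug-logs.tar.gz}   # hypothetical output name

# Print the command; execute it only when RUN=1.
step() {
    printf '+ %s\n' "$*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

# Phase 1: raise log levels, restart glusterd with empty logs, remount.
prepare() {
    step umount "$MNT"
    step gluster volume set "$VOL" diagnostics.client-log-level DEBUG
    step gluster volume set "$VOL" diagnostics.brick-log-level DEBUG
    step systemctl stop glusterd.service
    # Assumes client, glusterd and brick logs all live under $LOGDIR as *.log.
    step find "$LOGDIR" -type f -name '*.log' -exec truncate -s 0 {} +
    step systemctl start glusterd.service
    step mount "$MNT"
}

# Phase 2: run this AFTER adding the iptables disconnect rule by hand.
probe() {
    # telnet is interactive; note whether it connects, then quit it.
    step telnet "$HOST_IP" 24007
    step ls "$MNT"
    step sleep 42    # 42 s matches the default network.ping-timeout
    step ls "$MNT"
}

# Phase 3: run this AFTER removing the iptables rule by hand.
collect() {
    step ls "$MNT"
    step tar czf "$ARCHIVE" -C "$LOGDIR" .
    # Put the log levels back to their defaults.
    step gluster volume reset "$VOL" diagnostics.client-log-level
    step gluster volume reset "$VOL" diagnostics.brick-log-level
}

case "${1:-}" in
    prepare) prepare ;;
    probe)   probe ;;
    collect) collect ;;
    *)       echo "usage: $0 prepare|probe|collect (RUN=1 to execute)" >&2 ;;
esac
```

Saved under a name of your choosing, each phase is run in turn with the iptables change made in between, for example `RUN=1 HOST_IP=<this host's address> sh <script> probe`. Without RUN=1 it only lists what it would do.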

You can reset the log-level:

gluster volume reset firewall-scripts diagnostics.client-log-level
gluster volume reset firewall-scripts diagnostics.brick-log-level

Lastly, do you have a loopback interface (lo) on 127.0.0.1, and is localhost defined in /etc/hosts?

