[Bugs] [Bug 1245036] New: glusterd fails to peer probe if one of the node is behind the NAT.
bugzilla at redhat.com
Tue Jul 21 06:17:58 UTC 2015
https://bugzilla.redhat.com/show_bug.cgi?id=1245036
Bug ID: 1245036
Summary: glusterd fails to peer probe if one of the node is
behind the NAT.
Product: GlusterFS
Version: 3.7.1
Component: glusterd
Assignee: bugs at gluster.org
Reporter: hchiramm at redhat.com
CC: bugs at gluster.org, gluster-bugs at redhat.com
Description of problem:
Currently, glusterd fails to complete a 'peer probe' if one of the nodes
participating in the probe is behind NAT. For example, containers running on
different hosts fail to form a trusted pool when one peer probes the other.
The test setup uses Atomic Hosts with 'flannel' for overlay networking.
Test Setup:
Container-1 IP : 10.50.72.2 (running on Worker-1, which is Atomic host 1)
Container-2 IP : 10.50.97.2 (running on Worker-2, which is Atomic host 2)
PING from Container-1 to Container-2 works
SSH from Container-1 to Container-2 works.
The gluster pool list says:
Container-1:
--------------------------------------------------------------------------------------
-bash-4.3# ip a s eth0
5: eth0: <BROADCAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
link/ether 02:42:0a:32:61:02 brd ff:ff:ff:ff:ff:ff
inet 10.50.97.2/24 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::42:aff:fe32:6102/64 scope link
valid_lft forever preferred_lft forever
-bash-4.3# gluster pool list
UUID Hostname State
3c6bf65d-6a58-46ad-90d4-4e2d9b4dc80e 10.50.72.2 Connected
175daada-0ca4-4e18-b72b-460c9da19f96 localhost Connected
As shown above, Container-1 reports both gluster nodes as "Connected", so from
its side the peer probe succeeded. On Container-2, however, the remote node is
in the "Disconnected" state.
Container-2:
--------------------------------------------------------------------------------------
-bash-4.3# ip a s eth0
5: eth0: <BROADCAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
link/ether 02:42:0a:32:48:02 brd ff:ff:ff:ff:ff:ff
inet 10.50.72.2/24 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::42:aff:fe32:4802/64 scope link
valid_lft forever preferred_lft forever
-bash-4.3# gluster pool list
UUID Hostname State
175daada-0ca4-4e18-b72b-460c9da19f96 10.50.97.0 Disconnected
3c6bf65d-6a58-46ad-90d4-4e2d9b4dc80e localhost Connected
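The address recorded for the disconnected peer (10.50.97.0) is not the address on the other container's eth0 (10.50.97.2/24). A minimal Python sketch, using only the values shown in the output above, makes the relationship explicit. The function name `classify_recorded_address` is a hypothetical helper for illustration, not part of GlusterFS.

```python
import ipaddress


def classify_recorded_address(recorded, peer_iface_cidr):
    """Compare the address a node recorded for its peer with the peer's real interface.

    Returns (is_peer_ip, in_peer_subnet, is_subnet_network_address).
    """
    addr = ipaddress.ip_address(recorded)
    iface = ipaddress.ip_interface(peer_iface_cidr)
    return (
        addr == iface.ip,                        # same as the peer's own address?
        addr in iface.network,                   # inside the peer's /24 at all?
        addr == iface.network.network_address,   # first address of that subnet?
    )


# Values taken from the two 'ip a s eth0' / 'gluster pool list' outputs above.
print(classify_recorded_address("10.50.97.0", "10.50.97.2/24"))
# (False, True, True)
```

In other words, the recorded address lies in the peer's subnet but is not the peer itself, which fits the report's description of it as the flannel gateway address.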
The netstat output below shows the "flannel" GW IP as the source IP of the
reverse connection, which causes glusterd to fail.
-bash-4.3# netstat -ntp
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address      Foreign Address      State        PID/Program name
tcp   0      0      10.50.72.2:45442   202.255.47.226:80    TIME_WAIT    -
tcp   1      1      10.50.72.2:41806   140.138.144.170:80   LAST_ACK     -
tcp   0      0      10.50.72.2:22      10.50.72.1:55834     ESTABLISHED  146/sshd: root@pts/
tcp   0      0      10.50.72.2:58350   192.26.91.193:80     TIME_WAIT    -
tcp   0      0      10.50.72.2:24007   10.50.97.0:1022      ESTABLISHED  35/glusterd ---> flannel GW IP
tcp   1      1      10.50.72.2:49727   123.255.202.74:80    LAST_ACK     -
tcp   0      0      10.50.72.2:22      10.50.97.0:51955     ESTABLISHED  330/sshd: root@pts/ --> flannel GW IP
tcp   1      1      10.50.72.2:49723   123.255.202.74:80    LAST_ACK     -
tcp   1      1      10.50.72.2:49734   123.255.202.74:80    LAST_ACK     -
tcp   0      0      10.50.72.2:44396   103.22.220.133:80    TIME_WAIT    -
tcp   0      0      10.50.72.2:37028   212.138.64.22:80     TIME_WAIT    -
tcp   0      1      10.50.72.2:58308   137.189.4.14:80      LAST_ACK     -
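To spot this condition without reading the table by eye, the following minimal Python sketch scans netstat-style rows for connections on the local glusterd port (24007) and lists foreign addresses that are not known container IPs. The function name `unexpected_glusterd_peers` and the shortened sample rows are assumptions made for illustration; the addresses come from the output above. It is a diagnostic sketch only, not how glusterd itself validates peers.

```python
import ipaddress

GLUSTERD_PORT = 24007  # port used for the telnet checks in this report


def unexpected_glusterd_peers(netstat_lines, known_peers):
    """Return foreign IPs connected to the local glusterd port that are not known peers."""
    known = {ipaddress.ip_address(p) for p in known_peers}
    unexpected = []
    for line in netstat_lines:
        fields = line.split()
        # Expected columns: proto recv-q send-q local foreign state [pid/program]
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local_port = int(fields[3].rsplit(":", 1)[1])
        foreign_ip = ipaddress.ip_address(fields[4].rsplit(":", 1)[0])
        if local_port == GLUSTERD_PORT and foreign_ip not in known:
            unexpected.append(str(foreign_ip))
    return unexpected


sample = [
    "tcp 0 0 10.50.72.2:22 10.50.72.1:55834 ESTABLISHED 146/sshd",
    "tcp 0 0 10.50.72.2:24007 10.50.97.0:1022 ESTABLISHED 35/glusterd",
]
print(unexpected_glusterd_peers(sample, ["10.50.97.2"]))
# ['10.50.97.0']
```

The glusterd connection is flagged because it arrives from 10.50.97.0 rather than from the container address 10.50.97.2.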
As additional information, telnet to port 24007 works between the containers in
both directions.
-bash-4.3# ip a s eth0
5: eth0: <BROADCAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
link/ether 02:42:0a:32:61:02 brd ff:ff:ff:ff:ff:ff
inet 10.50.97.2/24 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::42:aff:fe32:6102/64 scope link
valid_lft forever preferred_lft forever
-bash-4.3# telnet 10.50.72.2 24007
Trying 10.50.72.2...
Connected to 10.50.72.2.
Escape character is '^]'.
^]
telnet> Connection closed.
-bash-4.3# ip a s eth0
5: eth0: <BROADCAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
link/ether 02:42:0a:32:48:02 brd ff:ff:ff:ff:ff:ff
inet 10.50.72.2/24 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::42:aff:fe32:4802/64 scope link
valid_lft forever preferred_lft forever
-bash-4.3# telnet 10.50.97.2 24007
Trying 10.50.97.2...
Connected to 10.50.97.2.
Escape character is '^]'.
^]
telnet> Connection closed.
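The telnet checks above can be scripted. The sketch below is a minimal Python equivalent using the standard `socket` module; the helper name `can_connect` and the 3-second timeout are assumptions chosen for illustration.

```python
import socket


def can_connect(host, port=24007, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Container addresses from this report; run from the opposite container.
    for peer in ("10.50.72.2", "10.50.97.2"):
        print(peer, can_connect(peer))
```

Note that a successful connect only shows that the port is reachable. It does not reveal which source address the remote side sees, which is why this check passes even though the pool list shows the peer as disconnected.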
Version-Release number of selected component (if applicable):
GlusterFS 3.7.2
How reproducible:
Always
Steps to Reproduce:
Same as above.