[Bugs] [Bug 1540249] New: Gluster is trying to use a port outside documentation and firewalld's glusterfs.xml
bugzilla at redhat.com
Tue Jan 30 15:39:21 UTC 2018
https://bugzilla.redhat.com/show_bug.cgi?id=1540249
Bug ID: 1540249
Summary: Gluster is trying to use a port outside documentation
and firewalld's glusterfs.xml
Product: GlusterFS
Version: 3.12
Component: glusterd
Assignee: bugs at gluster.org
Reporter: devianca at gmail.com
CC: bugs at gluster.org
Created attachment 1388503
--> https://bugzilla.redhat.com/attachment.cgi?id=1388503&action=edit
This is a fresh log from a run in which a node got stuck (all logs were removed
before the run).
Description of problem:
I wrote a script that reboots the nodes in a loop to catch this situation. In my
replica-2 setup, after roughly 15% of reboots a node cannot connect to the other
node and stays Disconnected in `gluster pool list` forever. At the same time, the
other node reports the first node as Connected, and from its side everything
works perfectly.
I narrowed the cause down to firewalld denying connections on port 49151.
Node1: 10.250.1.2
Node2: 10.250.1.1
Both run CentOS 7 with the glusterfs.xml service opened in the 'gluster' interconnect zone.
This is a quick paste of this exact situation when a node is stuck:
[root@ProdigyX ~]# uptime
13:43:47 up 7 min, 1 user, load average: 0,00, 0,04, 0,05
[root@ProdigyX ~]# systemctl is-active glusterd
active
[root@ProdigyX ~]# networkctl status bond1
● 9: bond1
Link File: n/a
Network File: /etc/systemd/network/bond1.network
Type: ether
State: routable (configured)
Driver: bonding
HW Address: 26:bb:b5:40:75:92
MTU: 9198
Address: 10.250.1.2 <--
[root@ProdigyX ~]# firewall-cmd --get-active-zones
gluster
interfaces: bond1
[root@ProdigyX ~]# firewall-cmd --permanent --info-zone=gluster
gluster (active)
target: default
icmp-block-inversion: no
interfaces: bond1 <-- interface bond1
sources:
services: glusterfs <-- whole service opened
ports:
protocols:
masquerade: no
forward-ports:
source-ports:
icmp-blocks:
rich rules:
[root@ProdigyX ~]# cat /usr/lib/firewalld/services/glusterfs.xml
<?xml version="1.0" encoding="utf-8"?>
<service>
<short>glusterfs-static</short>
<description>Default ports for gluster-distributed storage</description>
<port protocol="tcp" port="24007"/> <!--For glusterd -->
<port protocol="tcp" port="24008"/> <!--For glusterd RDMA port management -->
<port protocol="tcp" port="24009"/> <!--For glustereventsd -->
<port protocol="tcp" port="38465"/> <!--Gluster NFS service -->
<port protocol="tcp" port="38466"/> <!--Gluster NFS service -->
<port protocol="tcp" port="38467"/> <!--Gluster NFS service -->
<port protocol="tcp" port="38468"/> <!--Gluster NFS service -->
<port protocol="tcp" port="38469"/> <!--Gluster NFS service -->
<port protocol="tcp" port="49152-49664"/> <!--512 ports for bricks --> <-- 49152 opened
</service>
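
To double-check which ports the service definition above actually admits, the port entries can be expanded into ranges and tested against a port number. The following is a minimal sketch, not part of firewalld: `SERVICE_XML` is a hand-copied version of the file shown above with the comments, the reporter's annotation, and the XML declaration dropped, and `allowed_ranges` and `is_allowed` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Hand-copied from the glusterfs.xml paste above (comments and declaration removed).
SERVICE_XML = """
<service>
  <short>glusterfs-static</short>
  <description>Default ports for gluster-distributed storage</description>
  <port protocol="tcp" port="24007"/>
  <port protocol="tcp" port="24008"/>
  <port protocol="tcp" port="24009"/>
  <port protocol="tcp" port="38465"/>
  <port protocol="tcp" port="38466"/>
  <port protocol="tcp" port="38467"/>
  <port protocol="tcp" port="38468"/>
  <port protocol="tcp" port="38469"/>
  <port protocol="tcp" port="49152-49664"/>
</service>
"""


def allowed_ranges(xml_text, protocol="tcp"):
    """Return (low, high) port ranges listed for the given protocol."""
    root = ET.fromstring(xml_text)
    ranges = []
    for port in root.findall("port"):
        if port.get("protocol") != protocol:
            continue
        # A single port "24007" becomes (24007, 24007); "a-b" becomes (a, b).
        low, _, high = port.get("port").partition("-")
        ranges.append((int(low), int(high or low)))
    return ranges


def is_allowed(port, ranges):
    """True if the port falls inside any of the listed ranges."""
    return any(low <= port <= high for low, high in ranges)


ranges = allowed_ranges(SERVICE_XML)
for candidate in (24007, 49151, 49152):
    print(candidate, is_allowed(candidate, ranges))
```

With this copy of the file, 24007 and 49152 fall inside the listed ranges and 49151 does not, which matches the observation that 49151 is not covered by the service.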
[root@ProdigyX ~]# cat /var/log/messages | grep "10.250.1.1" | tail -10
<-- it's seeking 49151 instead of 49152
Jan 30 13:42:11 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT=
MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2
LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151
WINDOW=27438 RES=0x00 ACK SYN URGP=0
Jan 30 13:43:16 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT=
MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2
LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151
WINDOW=27438 RES=0x00 ACK SYN URGP=0
Jan 30 13:43:17 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT=
MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2
LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151
WINDOW=27438 RES=0x00 ACK SYN URGP=0
Jan 30 13:43:18 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT=
MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2
LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151
WINDOW=27438 RES=0x00 ACK SYN URGP=0
Jan 30 13:43:19 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT=
MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2
LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151
WINDOW=27438 RES=0x00 ACK SYN URGP=0
Jan 30 13:43:22 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT=
MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2
LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151
WINDOW=27438 RES=0x00 ACK SYN URGP=0
Jan 30 13:43:23 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT=
MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2
LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151
WINDOW=27438 RES=0x00 ACK SYN URGP=0
Jan 30 13:43:28 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT=
MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2
LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151
WINDOW=27438 RES=0x00 ACK SYN URGP=0
Jan 30 13:43:31 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT=
MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2
LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151
WINDOW=27438 RES=0x00 ACK SYN URGP=0
Jan 30 13:43:40 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT=
MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2
LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151
WINDOW=27438 RES=0x00 ACK SYN URGP=0
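
The dropped packets above all carry the same fields, and they are easier to compare once split into key/value pairs. This is a minimal sketch that assumes the log format shown in this paste; `parse_drop` is a hypothetical helper, and `LOG_LINE` is the first logged entry joined back into a single line.

```python
# First STATE_INVALID_DROP entry from the paste above, re-joined into one line.
LOG_LINE = (
    "Jan 30 13:42:11 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT= "
    "MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2 "
    "LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151 "
    "WINDOW=27438 RES=0x00 ACK SYN URGP=0"
)

TCP_FLAGS = {"SYN", "ACK", "FIN", "RST", "PSH", "URG", "ECE", "CWR"}


def parse_drop(line):
    """Split a kernel firewall log line into (tag, fields, tcp_flags)."""
    _, _, rest = line.partition("kernel: ")
    tag, _, body = rest.partition(": ")
    fields, flags = {}, []
    for token in body.split():
        if "=" in token:
            key, _, value = token.partition("=")
            fields[key] = value
        elif token in TCP_FLAGS:
            flags.append(token)
        # Other bare tokens (for example DF) are ignored in this sketch.
    return tag, fields, flags


tag, fields, flags = parse_drop(LOG_LINE)
print(tag,
      f"{fields['SRC']}:{fields['SPT']} -> {fields['DST']}:{fields['DPT']}",
      flags)
```

For the sample line, the source is the peer's port 24007, the destination is local port 49151, and both the ACK and SYN flags are set.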
[root@ProdigyX ~]# lsof -P | grep ':49151'
glusterd 1164 root 13u IPv4 38496 0t0
TCP ProdigyX:49151->10.250.1.1:24007 (SYN_SENT)
glusterd 1164 root 16u IPv4 19066 0t0
TCP ProdigyX:24007->10.250.1.1:49151 (ESTABLISHED)
glusterti 1164 1165 root 13u IPv4 38496 0t0
TCP ProdigyX:49151->10.250.1.1:24007 (SYN_SENT)
glusterti 1164 1165 root 16u IPv4 19066 0t0
TCP ProdigyX:24007->10.250.1.1:49151 (ESTABLISHED)
glustersi 1164 1166 root 13u IPv4 38496 0t0
TCP ProdigyX:49151->10.250.1.1:24007 (SYN_SENT)
glustersi 1164 1166 root 16u IPv4 19066 0t0
TCP ProdigyX:24007->10.250.1.1:49151 (ESTABLISHED)
glusterme 1164 1167 root 13u IPv4 38496 0t0
TCP ProdigyX:49151->10.250.1.1:24007 (SYN_SENT)
glusterme 1164 1167 root 16u IPv4 19066 0t0
TCP ProdigyX:24007->10.250.1.1:49151 (ESTABLISHED)
glustersp 1164 1168 root 13u IPv4 38496 0t0
TCP ProdigyX:49151->10.250.1.1:24007 (SYN_SENT)
glustersp 1164 1168 root 16u IPv4 19066 0t0
TCP ProdigyX:24007->10.250.1.1:49151 (ESTABLISHED)
glustersp 1164 1169 root 13u IPv4 38496 0t0
TCP ProdigyX:49151->10.250.1.1:24007 (SYN_SENT)
glustersp 1164 1169 root 16u IPv4 19066 0t0
TCP ProdigyX:24007->10.250.1.1:49151 (ESTABLISHED)
glustergd 1164 1171 root 13u IPv4 38496 0t0
TCP ProdigyX:49151->10.250.1.1:24007 (SYN_SENT)
glustergd 1164 1171 root 16u IPv4 19066 0t0
TCP ProdigyX:24007->10.250.1.1:49151 (ESTABLISHED)
glusterep 1164 1172 root 13u IPv4 38496 0t0
TCP ProdigyX:49151->10.250.1.1:24007 (SYN_SENT)
glusterep 1164 1172 root 16u IPv4 19066 0t0
TCP ProdigyX:24007->10.250.1.1:49151 (ESTABLISHED)
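
The lsof rows above repeat the same two file descriptors (13u and 16u, with the same inode numbers) once per thread of PID 1164, so the listing appears to describe only two distinct sockets. A minimal sketch that collapses such output into unique connections is shown here; `unique_connections` is a hypothetical helper and `LSOF_SAMPLE` is a shortened copy of the paste.

```python
import re

# Shortened copy of the lsof paste above (first two thread entries only).
LSOF_SAMPLE = """\
glusterd 1164 root 13u IPv4 38496 0t0
TCP ProdigyX:49151->10.250.1.1:24007 (SYN_SENT)
glusterd 1164 root 16u IPv4 19066 0t0
TCP ProdigyX:24007->10.250.1.1:49151 (ESTABLISHED)
glusterti 1164 1165 root 13u IPv4 38496 0t0
TCP ProdigyX:49151->10.250.1.1:24007 (SYN_SENT)
glusterti 1164 1165 root 16u IPv4 19066 0t0
TCP ProdigyX:24007->10.250.1.1:49151 (ESTABLISHED)
"""

# local_host:local_port->remote_host:remote_port (STATE)
CONN = re.compile(r"TCP (\S+):(\d+)->(\S+):(\d+) \((\w+)\)")


def unique_connections(text):
    """Return distinct (local, lport, remote, rport, state) tuples in order."""
    seen = []
    for match in CONN.finditer(text):
        local, lport, remote, rport, state = match.groups()
        conn = (local, int(lport), remote, int(rport), state)
        if conn not in seen:
            seen.append(conn)
    return seen


for conn in unique_connections(LSOF_SAMPLE):
    print(conn)
```

On the sample this leaves one outgoing connection from local port 49151 to the peer's 24007 in SYN_SENT, and one connection on local port 24007 from the peer's 49151 in ESTABLISHED.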
[root@ProdigyX ~]# ss -tpl | grep 49151
[root@ProdigyX ~]# ss -tpl | grep 49152
LISTEN 0 10 *:49152 *:*
users:(("glusterfsd",pid=1193,fd=11))
[root@ProdigyX ~]# gluster pool list
UUID Hostname State
xxx 10.250.1.1 Disconnected <-- other node
yyy localhost Connected
[root@ProdigyX ~]# sleep 60
[root@ProdigyX ~]# systemctl stop firewalld
[root@ProdigyX ~]# sleep 60
[root@ProdigyX ~]# gluster pool list
UUID Hostname State
xxx 10.250.1.1 Connected <-- sometimes it reconnects at this point; sometimes
glusterd also has to be restarted before the peer shows Connected
yyy localhost Connected
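
A reboot-loop check like the one described in this report needs some way to tell whether a peer is stuck. The following is a minimal sketch of that one step, not the reporter's actual script: it assumes the three-column `gluster pool list` layout shown above, `disconnected_peers` is a hypothetical helper, and `POOL_LIST` is the stuck-state output copied from the paste.

```python
# Output copied from the stuck state above (UUIDs are placeholders in the report).
POOL_LIST = """\
UUID Hostname State
xxx 10.250.1.1 Disconnected
yyy localhost Connected
"""


def disconnected_peers(output):
    """Return hostnames whose State column reads Disconnected."""
    peers = []
    for line in output.splitlines()[1:]:  # skip the header row
        parts = line.split()
        if len(parts) >= 3 and parts[-1] == "Disconnected":
            peers.append(parts[1])
    return peers


print(disconnected_peers(POOL_LIST))
```

In a real check the text would come from running `gluster pool list` on the node, for example through `subprocess.run`, rather than from a constant.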