[Bugs] [Bug 1540249] New: Gluster is trying to use a port outside documentation and firewalld's glusterfs.xml

bugzilla at redhat.com bugzilla at redhat.com
Tue Jan 30 15:39:21 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1540249

            Bug ID: 1540249
           Summary: Gluster is trying to use a port outside documentation
                    and firewalld's glusterfs.xml
           Product: GlusterFS
           Version: 3.12
         Component: glusterd
          Assignee: bugs at gluster.org
          Reporter: devianca at gmail.com
                CC: bugs at gluster.org



Created attachment 1388503
  --> https://bugzilla.redhat.com/attachment.cgi?id=1388503&action=edit
This is a fresh log from a situation in which a node is stuck (all logs were
removed before the run).

Description of problem:
I wrote a script that catches these situations in a reboot loop. In my
replica-2 setup, a node that has just rebooted sometimes (statistically ~15% of
reboots) cannot connect to the other node and stays Disconnected forever in
`gluster pool list`. At the same time, the other node reports the first node as
Connected, and from the other node's side everything works perfectly.

I narrowed the cause down to firewalld denying connections over port 49151.
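
The check such a reboot-loop script needs can be sketched roughly as follows. This is a minimal illustration, not the script actually used for this report: the function name `disconnected_peers` is hypothetical, and it assumes the three-column `gluster pool list` layout (UUID, Hostname, State) shown in the paste further down.

```shell
#!/bin/sh
# Print the hostname of every peer whose State column is not "Connected".
# Reads `gluster pool list` output on stdin; the header row is skipped.
# Assumption: the last field is the state and the one before it the hostname.
disconnected_peers() {
    awk 'NR > 1 && NF >= 3 && $NF != "Connected" { print $(NF-1) }'
}

# Hypothetical usage in a post-boot check:
#   gluster pool list | disconnected_peers
```

An empty result means every peer is Connected; any output names the peers that the freshly booted node cannot reach.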

Node1: 10.250.1.2
Node2: 10.250.1.1
Both run CentOS 7 with the glusterfs service (glusterfs.xml) opened in the
'gluster' interconnect zone.
This is a quick paste of this exact situation, with a node stuck:

[root at ProdigyX ~]# uptime
 13:43:47 up 7 min,  1 user,  load average: 0,00, 0,04, 0,05
[root at ProdigyX ~]# systemctl is-active glusterd
active
[root at ProdigyX ~]# networkctl status bond1
● 9: bond1
   Link File: n/a
Network File: /etc/systemd/network/bond1.network
        Type: ether
       State: routable (configured)
      Driver: bonding
  HW Address: 26:bb:b5:40:75:92
         MTU: 9198
     Address: 10.250.1.2   <--
[root at ProdigyX ~]# firewall-cmd --get-active-zones
gluster
  interfaces: bond1
[root at ProdigyX ~]# firewall-cmd --permanent --info-zone=gluster
gluster (active)
  target: default
  icmp-block-inversion: no
  interfaces: bond1   <-- interface bond1
  sources:
  services: glusterfs   <-- whole service opened
  ports:
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:

[root at ProdigyX ~]# cat /usr/lib/firewalld/services/glusterfs.xml
<?xml version="1.0" encoding="utf-8"?>
<service>
<short>glusterfs-static</short>
<description>Default ports for gluster-distributed storage</description>
<port protocol="tcp" port="24007"/>    <!--For glusterd -->
<port protocol="tcp" port="24008"/>    <!--For glusterd RDMA port management -->
<port protocol="tcp" port="24009"/>    <!--For glustereventsd -->
<port protocol="tcp" port="38465"/>    <!--Gluster NFS service -->
<port protocol="tcp" port="38466"/>    <!--Gluster NFS service -->
<port protocol="tcp" port="38467"/>    <!--Gluster NFS service -->
<port protocol="tcp" port="38468"/>    <!--Gluster NFS service -->
<port protocol="tcp" port="38469"/>    <!--Gluster NFS service -->
<port protocol="tcp" port="49152-49664"/>  <!--512 ports for bricks -->   <-- 49152 opened
</service>
[root at ProdigyX ~]# cat /var/log/messages | grep "10.250.1.1" | tail -10   <-- it's seeking 49151 instead of 49152
Jan 30 13:42:11 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT= MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151 WINDOW=27438 RES=0x00 ACK SYN URGP=0
Jan 30 13:43:16 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT= MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151 WINDOW=27438 RES=0x00 ACK SYN URGP=0
Jan 30 13:43:17 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT= MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151 WINDOW=27438 RES=0x00 ACK SYN URGP=0
Jan 30 13:43:18 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT= MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151 WINDOW=27438 RES=0x00 ACK SYN URGP=0
Jan 30 13:43:19 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT= MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151 WINDOW=27438 RES=0x00 ACK SYN URGP=0
Jan 30 13:43:22 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT= MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151 WINDOW=27438 RES=0x00 ACK SYN URGP=0
Jan 30 13:43:23 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT= MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151 WINDOW=27438 RES=0x00 ACK SYN URGP=0
Jan 30 13:43:28 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT= MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151 WINDOW=27438 RES=0x00 ACK SYN URGP=0
Jan 30 13:43:31 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT= MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151 WINDOW=27438 RES=0x00 ACK SYN URGP=0
Jan 30 13:43:40 ProdigyX kernel: STATE_INVALID_DROP: IN=bond1 OUT= MAC=26:bb:b5:40:75:92:68:05:ca:69:9e:fc:08:00 SRC=10.250.1.1 DST=10.250.1.2 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=24007 DPT=49151 WINDOW=27438 RES=0x00 ACK SYN URGP=0
[root at ProdigyX ~]# lsof -P | grep ':49151'
glusterd  1164                 root   13u     IPv4              38496       0t0        TCP ProdigyX:49151->10.250.1.1:24007 (SYN_SENT)
glusterd  1164                 root   16u     IPv4              19066       0t0        TCP ProdigyX:24007->10.250.1.1:49151 (ESTABLISHED)
glusterti 1164 1165            root   13u     IPv4              38496       0t0        TCP ProdigyX:49151->10.250.1.1:24007 (SYN_SENT)
glusterti 1164 1165            root   16u     IPv4              19066       0t0        TCP ProdigyX:24007->10.250.1.1:49151 (ESTABLISHED)
glustersi 1164 1166            root   13u     IPv4              38496       0t0        TCP ProdigyX:49151->10.250.1.1:24007 (SYN_SENT)
glustersi 1164 1166            root   16u     IPv4              19066       0t0        TCP ProdigyX:24007->10.250.1.1:49151 (ESTABLISHED)
glusterme 1164 1167            root   13u     IPv4              38496       0t0        TCP ProdigyX:49151->10.250.1.1:24007 (SYN_SENT)
glusterme 1164 1167            root   16u     IPv4              19066       0t0        TCP ProdigyX:24007->10.250.1.1:49151 (ESTABLISHED)
glustersp 1164 1168            root   13u     IPv4              38496       0t0        TCP ProdigyX:49151->10.250.1.1:24007 (SYN_SENT)
glustersp 1164 1168            root   16u     IPv4              19066       0t0        TCP ProdigyX:24007->10.250.1.1:49151 (ESTABLISHED)
glustersp 1164 1169            root   13u     IPv4              38496       0t0        TCP ProdigyX:49151->10.250.1.1:24007 (SYN_SENT)
glustersp 1164 1169            root   16u     IPv4              19066       0t0        TCP ProdigyX:24007->10.250.1.1:49151 (ESTABLISHED)
glustergd 1164 1171            root   13u     IPv4              38496       0t0        TCP ProdigyX:49151->10.250.1.1:24007 (SYN_SENT)
glustergd 1164 1171            root   16u     IPv4              19066       0t0        TCP ProdigyX:24007->10.250.1.1:49151 (ESTABLISHED)
glusterep 1164 1172            root   13u     IPv4              38496       0t0        TCP ProdigyX:49151->10.250.1.1:24007 (SYN_SENT)
glusterep 1164 1172            root   16u     IPv4              19066       0t0        TCP ProdigyX:24007->10.250.1.1:49151 (ESTABLISHED)
[root at ProdigyX ~]# ss -tpl | grep 49151
[root at ProdigyX ~]# ss -tpl | grep 49152
LISTEN     0      10         *:49152                    *:*                    users:(("glusterfsd",pid=1193,fd=11))
[root at ProdigyX ~]# gluster pool list
UUID                                    Hostname        State
xxx    10.250.1.1      Disconnected   <-- other node
yyy    localhost       Connected
[root at ProdigyX ~]# sleep 60
[root at ProdigyX ~]# systemctl stop firewalld
[root at ProdigyX ~]# sleep 60
[root at ProdigyX ~]# gluster pool list
UUID                                    Hostname        State
xxx    10.250.1.1      Connected   <-- sometimes it works at this point; sometimes glusterd also needs a restart before the peer shows as Connected
yyy    localhost       Connected
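
The comparison made in the paste above (destination port of the dropped packets versus the brick range in glusterfs.xml) can be repeated mechanically. The following is only a sketch under stated assumptions: the helper names `dropped_ports` and `port_in_range` are hypothetical, the log lines are assumed to carry a `DPT=<port>` field as in the kernel messages quoted above, and the range bounds are typed in by hand from the `49152-49664` entry rather than parsed from the XML.

```shell
#!/bin/sh
# Extract the unique destination ports (DPT=...) from kernel firewall
# log lines read on stdin, one port per output line, numerically sorted.
dropped_ports() {
    grep -o 'DPT=[0-9]*' | cut -d= -f2 | sort -un
}

# Succeed if port $1 lies inside the inclusive range $2..$3.
port_in_range() {
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Hypothetical usage, with the brick range copied from glusterfs.xml:
#   grep STATE_INVALID_DROP /var/log/messages | dropped_ports |
#   while read -r p; do
#       port_in_range "$p" 49152 49664 || echo "port $p is outside 49152-49664"
#   done
```

For the log lines in this report the only dropped destination port is 49151, which sits one below the start of the brick range.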
