[Bugs] [Bug 1537362] New: glustershd/glusterd is not using right port when connecting to glusterfsd process

bugzilla at redhat.com bugzilla at redhat.com
Tue Jan 23 02:43:59 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1537362

            Bug ID: 1537362
           Summary: glustershd/glusterd is not using right port when
                    connecting to glusterfsd process
           Product: GlusterFS
           Version: mainline
         Component: glusterd
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: amukherj at redhat.com
                CC: bugs at gluster.org, zz.sh.cynthia at gmail.com
        Depends On: 1537346



+++ This bug was initially created as a clone of Bug #1537346 +++

Description of problem:
Sometimes, after rebooting one of the sn nodes, the output of the command
“gluster v heal mstate info” shows:
 [root at testsn-1:/var/log/glusterfs/bricks]
# gluster v heal mstate info
Brick testsn-0.local:/mnt/bricks/mstate/brick
/testas-0/var/lib/ntp/drift 
/testas-2/var/lib/ntp/drift 
/.install-done 
/testas-0/var/lib/ntp 
/testmn-1/var/lib/ntp 
/testas-2/var/lib/ntp 
/testmn-0/var/lib/ntp 
/testmn-1/var/lib/ntp/drift 
/testas-1/var/lib/ntp 
/testas-1/var/lib/ntp/drift 
/testmn-0/var/lib/ntp/drift 
Status: Connected
Number of entries: 11

Brick testsn-1.local:/mnt/bricks/mstate/brick
Status: Transport endpoint is not connected
Number of entries: -

glustershd cannot connect to the local brick process! When I check the
glustershd process, I find that it always fails when trying to connect to the
glusterfsd process on port 49155.
[2018-01-18 10:42:29.891811] I [rpc-clnt.c:1986:rpc_clnt_reconfig]
0-mstate-client-1: changing port to 49155 (from 0)
[2018-01-18 10:42:29.892120] E [socket.c:2369:socket_connect_finish]
0-mstate-client-1: connection to 192.168.1.3:49155 failed (Connection refused);
disconnecting 

However, the local mstate glusterfsd process is listening on port 49153!
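
To see which port glustershd keeps being handed for a given brick, the
"changing port to" messages can be pulled out of its log. This is only a
minimal sketch: the function name ports_for_client is made up for illustration,
and it assumes each log entry sits on a single line in the real log file (the
entries quoted above are wrapped by the mail).

```shell
# Sketch: list the ports a glustershd client was told to use.
# Reads a glustershd log on stdin; $1 is the client name as it appears in
# the log (for example mstate-client-1). Prints one port per matching entry.
ports_for_client() {
    awk -v client="$1" '
        index($0, "-" client ": changing port to ") {
            sub(/.*changing port to /, "")   # drop everything before the port
            sub(/ .*/, "")                   # drop the trailing " (from N)"
            print
        }'
}

# Possible usage, with the log path taken from the glustershd command line:
#   ports_for_client mstate-client-1 < /var/log/glusterfs/glustershd.log | sort | uniq -c
```

If the only port that ever shows up for mstate-client-1 is 49155, glustershd is
consistently being given a port that the brick is not listening on.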

Version-Release number of selected component (if applicable):
glusterfs 3.12.3

How reproducible:
Sometimes, after rebooting an sn node

Steps to Reproduce:
1. Reboot an sn node

Actual results:
glustershd cannot connect to one local glusterfsd brick process.
This can be seen from the following netstat command output:
[root at testsn-1:/var/log/glusterfs/bricks]
# ps -ef | grep glustershd
root      1295     1  0 Jan18 ?        00:00:18 /usr/sbin/glusterfs -s
testsn-1.local --volfile-id gluster/glustershd -p
/var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log
-S /var/run/gluster/178dba826edae38df4ba67f25beeb1e6.socket --xlator-option
*replicate*.node-uuid=9ccea6b1-4d81-4020-a4ba-ee6821268ba8
root     19900 27911  0 04:10 pts/1    00:00:00 grep glustershd
[root at testsn-1:/var/log/glusterfs/bricks]
# netstat -p | grep 1295
tcp        0      0 testsn-1.local:49098    testsn-0.local:49154    ESTABLISHED 1295/glusterfs
tcp        0      0 testsn-1.local:49099    testsn-0.local:49152    ESTABLISHED 1295/glusterfs
tcp        0      0 testsn-1.local:49140    testsn-1.local:24007    ESTABLISHED 1295/glusterfs
tcp        0      0 testsn-1.local:49097    testsn-0.local:49153    ESTABLISHED 1295/glusterfs
tcp        0      0 testsn-1.local:49096    testsn-0.local:49155    ESTABLISHED 1295/glusterfs
tcp        0      0 testsn-1.local:49120    testsn-1.local:49156    ESTABLISHED 1295/glusterfs
tcp        0      0 testsn-1.local:49121    testsn-1.local:49152    ESTABLISHED 1295/glusterfs
tcp        0      0 testsn-1.local:49126    testsn-1.local:49154    ESTABLISHED 1295/glusterfs
unix  3      [ ]         STREAM     CONNECTED      36264 1295/glusterfs     
/var/run/gluster/178dba826edae38df4ba67f25beeb1e6.socket
unix  2      [ ]         DGRAM                     36258 1295/glusterfs      
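
The same check can be scripted rather than read by eye. The following is a
rough sketch, not part of GlusterFS: the helper names established_peers and
has_peer are hypothetical, and it assumes netstat prints each TCP connection on
one line with the usual net-tools column order (proto, recv-q, send-q, local
address, foreign address, state, PID/program).

```shell
# Sketch: print the remote endpoints of ESTABLISHED TCP connections owned by
# a given PID, reading `netstat -p` output (one connection per line) on stdin.
established_peers() {   # usage: established_peers PID < netstat-output
    awk -v pid="$1" '
        $1 ~ /^tcp/ && $6 == "ESTABLISHED" && index($7, pid "/") == 1 { print $5 }'
}

# Succeed only if HOST:PORT is among the peers of PID.
has_peer() {            # usage: has_peer PID HOST:PORT < netstat-output
    established_peers "$1" |
        awk -v want="$2" '$0 == want { found = 1 } END { exit !found }'
}

# Possible usage on testsn-1, where glustershd runs as PID 1295:
#   netstat -p | has_peer 1295 testsn-1.local:49153 || echo "no connection to 49153"
#   netstat -p | has_peer 1295 testsn-1.local:49155 || echo "no connection to 49155"
```

In the output above, glustershd has local connections to ports 49152, 49154 and
49156 only, so both checks would report no connection.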

Expected results:
glustershd should be able to connect to the local brick process

Additional info:
# gluster v status mstate
Status of volume: mstate
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick testsn-0.local:/mnt/bricks/mstate/bri
ck                                          49154     0          Y       1113 
Brick testsn-1.local:/mnt/bricks/mstate/bri
ck                                          49155     0          Y       1117 
Self-heal Daemon on localhost               N/A       N/A        Y       1295 
Self-heal Daemon on testsn-2.local          N/A       N/A        Y       1813 
Self-heal Daemon on testsn-0.local          N/A       N/A        Y       1135   

Task Status of Volume mstate
------------------------------------------------------------------------------
There are no active volume tasks

It is quite strange that the mstate brick process's listening port is shown as
49155 in “gluster v status mstate” but as 49153 in the ps command output!
[root at testsn-1:/var/log/glusterfs/bricks]
# ps -ef | grep -i glusterfsd | grep mstate
root      1117     1  0 Jan18 ?        00:00:05 /usr/sbin/glusterfsd -s
testsn-1.local --volfile-id mstate.testsn-1.local.mnt-bricks-mstate-brick -p
/var/run/gluster/vols/mstate/testsn-1.local-mnt-bricks-mstate-brick.pid -S
/var/run/gluster/b520b934b415e6a68776cc4852901a77.socket --brick-name
/mnt/bricks/mstate/brick -l
/var/log/glusterfs/bricks/mnt-bricks-mstate-brick.log --xlator-option
*-posix.glusterd-uuid=9ccea6b1-4d81-4020-a4ba-ee6821268ba8 --brick-port 49153
--xlator-option mstate-server.listen-port=49153 --xlator-option
transport.socket.bind-address=testsn-1.local
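
The mismatch can be detected by comparing the port glusterd advertises with the
port the brick was started with. The sketch below is illustrative only: the
function names are invented here, it parses the human-readable status output
shown above (including the brick name wrapped onto a second line), and it
treats the --brick-port argument as the port the brick listens on, as this
report does.

```shell
# Sketch: print "<brick> <tcp-port>" for each brick row in the text output of
# `gluster v status <vol>` read from stdin. A "Brick" line with only two
# fields is a wrapped name and is rejoined with the line that follows it.
advertised_ports() {
    awk '
        $1 == "Brick" && NF == 2 { pending = $2; next }
        $1 == "Brick" && NF >= 6 { print $2, $3; next }
        pending != ""            { print pending $1, $2; pending = "" }'
}

# Print the value of --brick-port from a glusterfsd command line on stdin.
listening_port() {
    sed -n 's/.*--brick-port \([0-9][0-9]*\).*/\1/p'
}

# Compare the two ports; return non-zero on a mismatch.
check_brick_port() {    # usage: check_brick_port ADVERTISED ACTUAL
    if [ "$1" = "$2" ]; then
        echo "OK: glusterd and the brick agree on port $1"
    else
        echo "MISMATCH: glusterd advertises $1, brick was started with --brick-port $2"
        return 1
    fi
}

# Possible usage on testsn-1:
#   advertised=$(gluster v status mstate | advertised_ports |
#       awk '$1 == "testsn-1.local:/mnt/bricks/mstate/brick" { print $2 }')
#   actual=$(ps -ef | grep -i glusterfsd | grep mstate | listening_port)
#   check_brick_port "$advertised" "$actual"
```

With the values in this report (49155 advertised, 49153 on the command line),
check_brick_port would report a mismatch.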

--- Additional comment from Worker Ant on 2018-01-22 21:29:06 EST ---

REVIEW: https://review.gluster.org/19263 (glusterd: process pmap sign in only
when port is marked as free) posted (#3) for review on master by Atin Mukherjee


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1537346
[Bug 1537346] glustershd/glusterd is not using right port when connecting
to glusterfsd process
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.

