[Bugs] [Bug 1537362] glustershd/ glusterd is not using right port when connecting to glusterfsd process

bugzilla at redhat.com bugzilla at redhat.com
Mon Apr 16 04:47:20 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1537362



--- Comment #3 from zhou lin <zz.sh.cynthia at gmail.com> ---
(In reply to Worker Ant from comment #2)
> COMMIT: https://review.gluster.org/19263 committed in master by "Atin
> Mukherjee" <amukherj at redhat.com> with a commit message- glusterd: process
> pmap sign in only when port is marked as free
> 
> Because of a race in the volume start code path, triggered by friend
> handshaking on volumes with quorum enabled, glusterd can start a brick, get a
> disconnect, and then immediately try to start the same brick instance again
> based on another friend update request. Even if the first brick process never
> comes up, its sign-in event is still sent, and we end up with two duplicate
> portmap entries for the same brick. Since brick start marks the previous port
> as free, it is better to treat a sign-in request as a no-op when the
> corresponding port is already marked as free.
> 
> Change-Id: I995c348c7b6988956d24b06bf3f09ab64280fc32
> BUG: 1537362
> Signed-off-by: Atin Mukherjee <amukherj at redhat.com>
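
The fix described in the commit above amounts to a guard in glusterd's portmap
sign-in path: a sign-in for a port that is already marked free is ignored
instead of creating a second entry for the same brick. Below is a minimal,
self-contained sketch of that idea; the names and types are simplified
stand-ins, not the actual symbols from glusterd-pmap.c.

/* Simplified stand-in for glusterd's portmap registry (the real code lives in
 * xlators/mgmt/glusterd/src/glusterd-pmap.c and uses different names/types);
 * it only illustrates the "sign-in is a no-op if the port is free" guard. */
#include <stdio.h>
#include <string.h>

typedef enum {
    PORT_FREE = 0,      /* nobody owns the port (also the state after release) */
    PORT_LEASED,        /* handed out at brick start, sign-in still pending    */
    PORT_BRICKSERVER,   /* brick has signed in, entry is live                  */
} port_type_t;

#define MAX_PORTS 65536

static struct {
    port_type_t  type;
    const char  *brickname;   /* NULL when the slot has never been used */
} registry[MAX_PORTS];

/* Brick start: free any port previously held by this brick, then lease the
 * new one.  This is the step that can leave a late sign-in pointing at a
 * port that is already marked free again. */
static void
brick_start(int port, const char *brickname)
{
    for (int p = 0; p < MAX_PORTS; p++)
        if (registry[p].brickname &&
            strcmp(registry[p].brickname, brickname) == 0)
            registry[p].type = PORT_FREE;

    registry[port].type      = PORT_LEASED;
    registry[port].brickname = brickname;
}

/* The guard from the commit above: if the port is already marked free, the
 * sign-in is ignored instead of creating a duplicate entry for the brick. */
static void
pmap_signin(int port, const char *brickname)
{
    if (registry[port].type == PORT_FREE) {
        fprintf(stderr, "ignoring sign-in of %s on port %d: port is free\n",
                brickname, port);
        return;
    }
    registry[port].type = PORT_BRICKSERVER;
}

int
main(void)
{
    /* The racy sequence: brick started on 49152, disconnects, and is started
     * again on 49153 before the first instance's sign-in is processed. */
    brick_start(49152, "/mnt/bricks/log/brick");
    brick_start(49153, "/mnt/bricks/log/brick");   /* frees 49152            */
    pmap_signin(49152, "/mnt/bricks/log/brick");   /* late sign-in -> no-op  */
    pmap_signin(49153, "/mnt/bricks/log/brick");   /* current one registers  */
    return 0;
}

Without that guard, the late sign-in for the old port would leave a second,
stale portmap entry for the same brick.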

This issue appeared again in my local test environment:

gluster volume heal log info
Brick sn-0.local:/mnt/bricks/log/brick
Status: Transport endpoint is not connected
Number of entries: -

Brick sn-1.local:/mnt/bricks/log/brick
/master/fsaudit/internal-ssh.log 
/ 
/master/fsaudit/auth.log 
/blackboxes 
/master/fsaudit/alarms 
/rcploglib/sn-0/DHAAgent.level 
/rcploglib/sn-0/DVMAgent.level 
/master/syslog 
Status: Connected
Number of entries: 8

[root at sn-0:/root]
#

gluster volume heal mstate info
Brick sn-0.local:/mnt/bricks/mstate/brick
Status: Transport endpoint is not connected
Number of entries: -

Brick sn-1.local:/mnt/bricks/mstate/brick
/ 
Status: Connected
Number of entries: 1

[root at sn-0:/root]
# ps aux | grep glustershd | grep -v grep | wc -l     (the glustershd process exists)
1
[root at sn-0:/root]
#

# gluster volume status log
Status of volume: log
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick sn-0.local:/mnt/bricks/log/brick      49157     0          Y       1402 
Brick sn-1.local:/mnt/bricks/log/brick      49155     0          Y       3817 
Self-heal Daemon on localhost               N/A       N/A        Y       1428 
Self-heal Daemon on sn-1.local              N/A       N/A        Y       20316
Self-heal Daemon on sn-2.local              N/A       N/A        Y       15565

Task Status of Volume log
------------------------------------------------------------------------------
There are no active volume tasks

[root at sn-0:/root]
# gluster volume status mstate
Status of volume: mstate
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick sn-0.local:/mnt/bricks/mstate/brick   49158     0          Y       1403 
Brick sn-1.local:/mnt/bricks/mstate/brick   49153     0          Y       2967 
Self-heal Daemon on localhost               N/A       N/A        Y       1428 
Self-heal Daemon on sn-1.local              N/A       N/A        Y       20316
Self-heal Daemon on sn-2.local              N/A       N/A        Y       15565

Task Status of Volume mstate
------------------------------------------------------------------------------
There are no active volume tasks

[root at sn-0:/root]
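
The symptom above matches the duplicate-entry scenario from the commit
message: the bricks are online (ports 49157/49155 for log, 49158/49153 for
mstate), yet glustershd on sn-0 reports "Transport endpoint is not connected"
for the local brick, which is consistent with it being handed a stale port.
A small illustration (made-up names and a hypothetical stale port, not
glusterd code) of why a leftover duplicate entry makes a port-by-brick lookup
return the wrong port:

/* Illustration only: a first-match search over a portmap that contains a
 * stale duplicate entry returns the dead port, so the self-heal daemon ends
 * up connecting to a port no brick is listening on. */
#include <stdio.h>
#include <string.h>

struct pmap_entry {
    int         port;
    const char *brickname;
    int         alive;       /* 1 if a brick process still owns this port */
};

/* Two entries for the same brick: a stale one left behind by the race
 * (hypothetical port 49152) and the current one from the status output. */
static struct pmap_entry entries[] = {
    { 49152, "/mnt/bricks/log/brick", 0 },   /* stale: brick died, port freed */
    { 49157, "/mnt/bricks/log/brick", 1 },   /* current, matches volume status */
};

/* Naive first-match lookup, as a port-by-brick style query might do. */
static int
port_by_brick(const char *brickname)
{
    for (size_t i = 0; i < sizeof(entries) / sizeof(entries[0]); i++)
        if (strcmp(entries[i].brickname, brickname) == 0)
            return entries[i].port;
    return -1;
}

int
main(void)
{
    /* Prints 49152 (the dead port), not 49157: the client is then stuck with
     * "Transport endpoint is not connected". */
    printf("port for %s: %d\n", "/mnt/bricks/log/brick",
           port_by_brick("/mnt/bricks/log/brick"));
    return 0;
}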
