[Bugs] [Bug 1438051] New: Brick Multiplexing: Volume status still shows the PID even after killing the process
bugzilla at redhat.com
Fri Mar 31 18:18:11 UTC 2017
https://bugzilla.redhat.com/show_bug.cgi?id=1438051
Bug ID: 1438051
Summary: Brick Multiplexing: Volume status still shows the PID
even after killing the process
Product: Red Hat Gluster Storage
Version: 3.3
Component: glusterd
Severity: medium
Assignee: amukherj at redhat.com
Reporter: amukherj at redhat.com
QA Contact: bmekala at redhat.com
CC: bugs at gluster.org, jeff at pl.atyp.us,
nchilaka at redhat.com, rhs-bugs at redhat.com,
sasundar at redhat.com, storage-qa-internal at redhat.com,
vbellur at redhat.com
Depends On: 1434448, 1437494
+++ This bug was initially created as a clone of Bug #1437494 +++
+++ This bug was initially created as a clone of Bug #1434448 +++
Description of problem:
==================
After enabling brick multiplexing, I killed the brick process (which is shared
by all bricks of all volumes on that node) on one of the nodes.
The process gets killed, and volume status shows every brick's online status
as N and its port as N/A.
However, it still shows the old PID of the killed process.
This PID should also be shown as N/A.
[root@dhcp35-215 bricks]# gluster v status|grep 215
Before killing the brick process (grep'ing only for bricks on this local node):
Brick 10.70.35.215:/rhs/brick3/cross3 49152 0 Y 13072
Brick 10.70.35.215:/rhs/brick4/cross3 49152 0 Y 13072
Brick 10.70.35.215:/rhs/brick1/ecvol 49152 0 Y 13072
Brick 10.70.35.215:/rhs/brick2/ecvol 49152 0 Y 13072
Brick 10.70.35.215:/rhs/brick3/ecvol 49152 0 Y 13072
Brick 10.70.35.215:/rhs/brick4/ecvol 49152 0 Y 13072
Brick 10.70.35.215:/rhs/brick1/ecx 49152 0 Y 13072
Brick 10.70.35.215:/rhs/brick2/ecx 49152 0 Y 13072
Brick 10.70.35.215:/rhs/brick3/ecx 49152 0 Y 13072
Brick 10.70.35.215:/rhs/brick4/ecx 49152 0 Y 13072
Brick 10.70.35.215:/rhs/brick3/rep2 49152 0 Y 13072
Brick 10.70.35.215:/rhs/brick4/rep2 49152 0 Y 13072
Brick 10.70.35.215:/rhs/brick3/rep3 49152 0 Y 13072
Brick 10.70.35.215:/rhs/brick4/rep3 49152 0 Y 13072
[root@dhcp35-215 bricks]# kill -9 13072
[root@dhcp35-215 bricks]# gluster v status|grep 215
(after killing the brick process)
Brick 10.70.35.215:/rhs/brick3/cross3 N/A N/A N 13072
Brick 10.70.35.215:/rhs/brick4/cross3 N/A N/A N 13072
Brick 10.70.35.215:/rhs/brick1/ecvol N/A N/A N 13072
Brick 10.70.35.215:/rhs/brick2/ecvol N/A N/A N 13072
Brick 10.70.35.215:/rhs/brick3/ecvol N/A N/A N 13072
Brick 10.70.35.215:/rhs/brick4/ecvol N/A N/A N 13072
Brick 10.70.35.215:/rhs/brick1/ecx N/A N/A N 13072
Brick 10.70.35.215:/rhs/brick2/ecx N/A N/A N 13072
Brick 10.70.35.215:/rhs/brick3/ecx N/A N/A N 13072
Brick 10.70.35.215:/rhs/brick4/ecx N/A N/A N 13072
Brick 10.70.35.215:/rhs/brick3/rep2 N/A N/A N 13072
Brick 10.70.35.215:/rhs/brick4/rep2 N/A N/A N 13072
Brick 10.70.35.215:/rhs/brick3/rep3 N/A N/A N 13072
Brick 10.70.35.215:/rhs/brick4/rep3 N/A N/A N 13072
[root@dhcp35-215 bricks]# ps -ef|grep 13072
root 2258 21234 0 19:35 pts/0 00:00:00 grep --color=auto 13072
[root@dhcp35-215 bricks]#
Version-Release number of selected component (if applicable):
============
glusterfs-libs-3.10.0-1.el7.x86_64
glusterfs-api-3.10.0-1.el7.x86_64
glusterfs-rdma-3.10.0-1.el7.x86_64
glusterfs-3.10.0-1.el7.x86_64
python2-gluster-3.10.0-1.el7.x86_64
glusterfs-fuse-3.10.0-1.el7.x86_64
glusterfs-server-3.10.0-1.el7.x86_64
glusterfs-geo-replication-3.10.0-1.el7.x86_64
glusterfs-extra-xlators-3.10.0-1.el7.x86_64
glusterfs-client-xlators-3.10.0-1.el7.x86_64
glusterfs-cli-3.10.0-1.el7.x86_64
How reproducible:
=======
Always
Steps to Reproduce:
1. Enable the brick multiplexing feature
2. Create one or more volumes and start them
3. Notice that all bricks hosted on the same node have the same PID
4. Select a node and kill that PID
5. Issue volume status
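Step 3 can be checked programmatically against the `gluster v status` output shown above. A minimal Python sketch (`bricks_by_pid` is a hypothetical helper; the line format is taken from the output in this report):

```python
from collections import defaultdict

def bricks_by_pid(status_lines):
    """Group brick paths by PID from `gluster v status`-style lines.

    Each brick line is whitespace-separated:
    Brick <host:path> <port> <rdma-port> <online Y/N> <pid>
    With brick multiplexing enabled, all bricks on one node should
    collapse into a single PID group.
    """
    groups = defaultdict(list)
    for line in status_lines:
        parts = line.split()
        if len(parts) == 6 and parts[0] == "Brick":
            groups[parts[5]].append(parts[1])
    return dict(groups)
```

With the output above, every brick on 10.70.35.215 falls into the single group keyed by PID 13072.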
Actual results:
====
Volume status still shows the PID against each brick even though the process
has been killed.
Expected results:
================
The PID must be shown as N/A
--- Additional comment from Jeff Darcy on 2017-03-21 11:16:58 EDT ---
I would say that killing a process is an invalid test, but this probably needs
to be fixed anyway.
--- Additional comment from Worker Ant on 2017-03-30 08:24:48 EDT ---
REVIEW: https://review.gluster.org/16971 (glusterd: reset pid to -1 if brick is
not online) posted (#1) for review on master by Atin Mukherjee
(amukherj at redhat.com)
--- Additional comment from Worker Ant on 2017-03-31 09:06:25 EDT ---
COMMIT: https://review.gluster.org/16971 committed in master by Jeff Darcy
(jeff at pl.atyp.us)
------
commit e325479cf222d2f25dbc0a4c6b80bfe5a7f09f43
Author: Atin Mukherjee <amukherj at redhat.com>
Date: Thu Mar 30 14:47:45 2017 +0530
glusterd: reset pid to -1 if brick is not online
While populating brick details in the gluster volume status response payload,
if a brick is not online the pid should be reset to -1 so that the
volume status output doesn't show a pid that was never cleaned up,
especially with brick multiplexing, where multiple bricks belong to the
same process.
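The effect of the fix can be sketched in Python. This is illustrative only: the actual change is in glusterd's C code, and `display_pid` and `brick_online` are hypothetical names, not glusterd functions:

```python
import os

def brick_online(pid):
    """Best-effort liveness probe: signal 0 checks whether the
    process exists without actually delivering a signal."""
    try:
        os.kill(pid, 0)
        return True
    except ProcessLookupError:
        return False

def display_pid(pid, online):
    """PID column for volume status: show the PID only while the
    brick is online; otherwise reset it to -1 and render N/A
    (the behavior the fix introduces)."""
    if not online:
        pid = -1
    return str(pid) if pid > 0 else "N/A"
```

With this logic, a killed multiplexed brick process reports N/A instead of the stale PID 13072 seen in the report above.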
Change-Id: Iba346da9a8cb5b5f5dd38031d4c5ef2097808387
BUG: 1437494
Signed-off-by: Atin Mukherjee <amukherj at redhat.com>
Reviewed-on: https://review.gluster.org/16971
Smoke: Gluster Build System <jenkins at build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
Reviewed-by: Gaurav Yadav <gyadav at redhat.com>
Reviewed-by: Prashanth Pai <ppai at redhat.com>
Reviewed-by: Jeff Darcy <jeff at pl.atyp.us>
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1434448
[Bug 1434448] Brick Multiplexing:Volume status still shows the PID even
after killing the process
https://bugzilla.redhat.com/show_bug.cgi?id=1437494
[Bug 1437494] Brick Multiplexing:Volume status still shows the PID even
after killing the process