[Bugs] [Bug 1453087] New: Brick Multiplexing: On reboot of a node Brick multiplexing feature lost on that node as multiple brick processes get spawned
bugzilla at redhat.com
Mon May 22 06:18:01 UTC 2017
https://bugzilla.redhat.com/show_bug.cgi?id=1453087
Bug ID: 1453087
Summary: Brick Multiplexing: On reboot of a node Brick
multiplexing feature lost on that node as multiple
brick processes get spawned
Product: GlusterFS
Version: 3.10
Component: glusterd
Severity: urgent
Assignee: bugs at gluster.org
Reporter: sbairagy at redhat.com
CC: amukherj at redhat.com, bugs at gluster.org,
nchilaka at redhat.com, rhinduja at redhat.com,
rhs-bugs at redhat.com, storage-qa-internal at redhat.com
Depends On: 1451248
Blocks: 1450889, 1453086
+++ This bug was initially created as a clone of Bug #1451248 +++
+++ This bug was initially created as a clone of Bug #1450889 +++
Description of problem:
========================
When you reboot a node with brick mux enabled and a multi-volume setup, I see
that many glusterfsd processes are spawned, and hence we lose the brick mux feature.
Version-Release number of selected component (if applicable):
========
3.8.4-25
How reproducible:
========
always
Steps to Reproduce:
1. Have a 3-node setup with brick mux enabled, and volumes, say v1..v10, with
each volume being a 1x3 and one brick per node (all on independent LVs);
example commands are sketched after the process listing below
2. We can see that only one glusterfsd per node exists
3. Now reboot node1
4. On successful reboot, the status is as follows:
Last login: Mon May 15 15:56:55 2017 from dhcp35-77.lab.eng.blr.redhat.com
[root at dhcp35-45 ~]# ps -ef|grep glusterfsd
root 4693 1 42 15:56 ? 00:02:07 /usr/sbin/glusterfsd -s
10.70.35.45 --volfile-id 1.10.70.35.45.rhs-brick1-1 -p
/var/lib/glusterd/vols/1/run/10.70.35.45-rhs-brick1-1.pid -S
/var/run/gluster/a19832cf9844ad10112aba39eba569a6.socket --brick-name
/rhs/brick1/1 -l /var/log/glusterfs/bricks/rhs-brick1-1.log --xlator-option
*-posix.glusterd-uuid=e4f737cd-59a2-4392-aa3d-4230f698f128 --brick-port 49152
--xlator-option 1-server.listen-port=49152
root 4701 1 0 15:56 ? 00:00:00 /usr/sbin/glusterfsd -s
10.70.35.45 --volfile-id 10.10.70.35.45.rhs-brick10-10 -p
/var/lib/glusterd/vols/10/run/10.70.35.45-rhs-brick10-10.pid -S
/var/run/gluster/fd40f022ab677d36e57793a60cc16166.socket --brick-name
/rhs/brick10/10 -l /var/log/glusterfs/bricks/rhs-brick10-10.log --xlator-option
*-posix.glusterd-uuid=e4f737cd-59a2-4392-aa3d-4230f698f128 --brick-port 49153
--xlator-option 10-server.listen-port=49153
root 4709 1 0 15:56 ? 00:00:00 /usr/sbin/glusterfsd -s
10.70.35.45 --volfile-id 2.10.70.35.45.rhs-brick2-2 -p
/var/lib/glusterd/vols/2/run/10.70.35.45-rhs-brick2-2.pid -S
/var/run/gluster/898f4e556d871cfb1613d6ff121bd5e6.socket --brick-name
/rhs/brick2/2 -l /var/log/glusterfs/bricks/rhs-brick2-2.log --xlator-option
*-posix.glusterd-uuid=e4f737cd-59a2-4392-aa3d-4230f698f128 --brick-port 49154
--xlator-option 2-server.listen-port=49154
root 4719 1 0 15:56 ? 00:00:00 /usr/sbin/glusterfsd -s
10.70.35.45 --volfile-id 3.10.70.35.45.rhs-brick3-3 -p
/var/lib/glusterd/vols/3/run/10.70.35.45-rhs-brick3-3.pid -S
/var/run/gluster/af3354d92921146c0e8d3bebdcbec907.socket --brick-name
/rhs/brick3/3 -l /var/log/glusterfs/bricks/rhs-brick3-3.log --xlator-option
*-posix.glusterd-uuid=e4f737cd-59a2-4392-aa3d-4230f698f128 --brick-port 49155
--xlator-option 3-server.listen-port=49155
root 4728 1 44 15:56 ? 00:02:13 /usr/sbin/glusterfsd -s
10.70.35.45 --volfile-id 4.10.70.35.45.rhs-brick4-4 -p
/var/lib/glusterd/vols/4/run/10.70.35.45-rhs-brick4-4.pid -S
/var/run/gluster/cafb15e7ed1d462ddf513e7cf80ca718.socket --brick-name
/rhs/brick4/4 -l /var/log/glusterfs/bricks/rhs-brick4-4.log --xlator-option
*-posix.glusterd-uuid=e4f737cd-59a2-4392-aa3d-4230f698f128 --brick-port 49156
--xlator-option 4-server.listen-port=49156
root 4734 1 0 15:56 ? 00:00:00 /usr/sbin/glusterfsd -s
10.70.35.45 --volfile-id 5.10.70.35.45.rhs-brick5-5 -p
/var/lib/glusterd/vols/5/run/10.70.35.45-rhs-brick5-5.pid -S
/var/run/gluster/5a92ed518f554fe96a3c3f4a1ecf5cb3.socket --brick-name
/rhs/brick5/5 -l /var/log/glusterfs/bricks/rhs-brick5-5.log --xlator-option
*-posix.glusterd-uuid=e4f737cd-59a2-4392-aa3d-4230f698f128 --brick-port 49157
--xlator-option 5-server.listen-port=49157
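(For reference, a rough sketch of the setup and check from steps 1-2; the
n1-n3 hostnames, volume names and the loop are illustrative assumptions, not
the exact commands used on this setup.)

  # Enable brick multiplexing cluster-wide (global option).
  gluster volume set all cluster.brick-multiplex on
  # Create and start ten 1x3 volumes, one brick per node on its own LV.
  for i in $(seq 1 10); do
      gluster volume create v$i replica 3 \
          n1:/rhs/brick$i/$i n2:/rhs/brick$i/$i n3:/rhs/brick$i/$i force
      gluster volume start v$i
  done
  # With multiplexing in effect, each node should run a single brick process.
  pgrep -xc glusterfsd   # expected: 1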
--- Additional comment from Worker Ant on 2017-05-16 07:05:13 EDT ---
REVIEW: https://review.gluster.org/17307 (glusterd: Don't spawn new glusterfsds
on node reboot with brick-mux) posted (#1) for review on master by Samikshan
Bairagya (samikshan at gmail.com)
--- Additional comment from Worker Ant on 2017-05-17 16:37:34 EDT ---
REVIEW: https://review.gluster.org/17307 (glusterd: Don't spawn new glusterfsds
on node reboot with brick-mux) posted (#2) for review on master by Samikshan
Bairagya (samikshan at gmail.com)
--- Additional comment from Worker Ant on 2017-05-18 07:56:46 EDT ---
REVIEW: https://review.gluster.org/17307 (glusterd: Don't spawn new glusterfsds
on node reboot with brick-mux) posted (#3) for review on master by Samikshan
Bairagya (samikshan at gmail.com)
--- Additional comment from Worker Ant on 2017-05-18 12:45:32 EDT ---
COMMIT: https://review.gluster.org/17307 committed in master by Jeff Darcy
(jeff at pl.atyp.us)
------
commit 13e7b3b354a252ad4065f7b2f0f805c40a3c5d18
Author: Samikshan Bairagya <samikshan at gmail.com>
Date: Tue May 16 15:07:21 2017 +0530
glusterd: Don't spawn new glusterfsds on node reboot with brick-mux
With brick multiplexing enabled, upon a node reboot new bricks were
not being attached to the first spawned brick process even though
there weren't any compatibility issues.
The reason for this is that upon glusterd restart after a node
reboot, since brick services aren't running, glusterd starts the
bricks in a "no-wait" mode. So after a brick process is spawned for
the first brick, there isn't enough time for the corresponding pid
file to get populated with a value before the compatibility check is
made for the next brick.
This commit solves this by iteratively waiting for the pidfile to be
populated in the brick compatibility comparison stage before checking
if the brick process is alive.
Change-Id: Ibd1f8e54c63e4bb04162143c9d70f09918a44aa4
BUG: 1451248
Signed-off-by: Samikshan Bairagya <samikshan at gmail.com>
Reviewed-on: https://review.gluster.org/17307
Reviewed-by: Atin Mukherjee <amukherj at redhat.com>
Smoke: Gluster Build System <jenkins at build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
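(The retry described in the commit message boils down to the following idea,
sketched here in shell for illustration only; the actual change is in
glusterd's C code on the brick-compatibility path, and the pidfile path and
retry count below are assumptions.)

  # Wait for the first brick's pidfile to be populated before deciding
  # whether a compatible brick process is already running.
  pidfile=/var/lib/glusterd/vols/1/run/10.70.35.45-rhs-brick1-1.pid
  tries=0
  while [ "$tries" -lt 120 ]; do
      pid=$(cat "$pidfile" 2>/dev/null)
      [ -n "$pid" ] && break
      sleep 0.1
      tries=$((tries + 1))
  done
  if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
      echo "brick process $pid is alive; attach the next brick to it"
  else
      echo "no usable brick process; spawn a new glusterfsd"
  fi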
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1450889
[Bug 1450889] Brick Multiplexing: On reboot of a node Brick multiplexing
feature lost on that node as multiple brick processes get spawned
https://bugzilla.redhat.com/show_bug.cgi?id=1451248
[Bug 1451248] Brick Multiplexing: On reboot of a node Brick multiplexing
feature lost on that node as multiple brick processes get spawned
https://bugzilla.redhat.com/show_bug.cgi?id=1453086
[Bug 1453086] Brick Multiplexing: On reboot of a node Brick multiplexing
feature lost on that node as multiple brick processes get spawned