[Bugs] [Bug 1453087] New: Brick Multiplexing: On reboot of a node Brick multiplexing feature lost on that node as multiple brick processes get spawned

bugzilla at redhat.com bugzilla at redhat.com
Mon May 22 06:18:01 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1453087

            Bug ID: 1453087
           Summary: Brick Multiplexing: On reboot of a node Brick
                    multiplexing feature lost on that node as multiple
                    brick processes get spawned
           Product: GlusterFS
           Version: 3.10
         Component: glusterd
          Severity: urgent
          Assignee: bugs at gluster.org
          Reporter: sbairagy at redhat.com
                CC: amukherj at redhat.com, bugs at gluster.org,
                    nchilaka at redhat.com, rhinduja at redhat.com,
                    rhs-bugs at redhat.com, storage-qa-internal at redhat.com
        Depends On: 1451248
            Blocks: 1450889, 1453086



+++ This bug was initially created as a clone of Bug #1451248 +++

+++ This bug was initially created as a clone of Bug #1450889 +++

Description of problem:
========================
When you reboot a node with brick multiplexing enabled on a multi-volume setup,
many glusterfsd processes are spawned on that node, and the brick multiplexing
feature is effectively lost.


Version-Release number of selected component (if applicable):
========
3.8.4-25

How reproducible:
========
always

Steps to Reproduce:
1. Have a 3-node setup with brick multiplexing enabled, and volumes v1..v10,
each a 1x3 volume with one brick per node (all on independent LVs).
2. Verify that only one glusterfsd process exists per node.
3. Reboot node1.
4. After node1 comes back up, the status is as follows (see the verification
sketch after the process listing below):



Last login: Mon May 15 15:56:55 2017 from dhcp35-77.lab.eng.blr.redhat.com
[root at dhcp35-45 ~]# ps -ef|grep glusterfsd
root      4693     1 42 15:56 ?        00:02:07 /usr/sbin/glusterfsd -s
10.70.35.45 --volfile-id 1.10.70.35.45.rhs-brick1-1 -p
/var/lib/glusterd/vols/1/run/10.70.35.45-rhs-brick1-1.pid -S
/var/run/gluster/a19832cf9844ad10112aba39eba569a6.socket --brick-name
/rhs/brick1/1 -l /var/log/glusterfs/bricks/rhs-brick1-1.log --xlator-option
*-posix.glusterd-uuid=e4f737cd-59a2-4392-aa3d-4230f698f128 --brick-port 49152
--xlator-option 1-server.listen-port=49152
root      4701     1  0 15:56 ?        00:00:00 /usr/sbin/glusterfsd -s
10.70.35.45 --volfile-id 10.10.70.35.45.rhs-brick10-10 -p
/var/lib/glusterd/vols/10/run/10.70.35.45-rhs-brick10-10.pid -S
/var/run/gluster/fd40f022ab677d36e57793a60cc16166.socket --brick-name
/rhs/brick10/10 -l /var/log/glusterfs/bricks/rhs-brick10-10.log --xlator-option
*-posix.glusterd-uuid=e4f737cd-59a2-4392-aa3d-4230f698f128 --brick-port 49153
--xlator-option 10-server.listen-port=49153
root      4709     1  0 15:56 ?        00:00:00 /usr/sbin/glusterfsd -s
10.70.35.45 --volfile-id 2.10.70.35.45.rhs-brick2-2 -p
/var/lib/glusterd/vols/2/run/10.70.35.45-rhs-brick2-2.pid -S
/var/run/gluster/898f4e556d871cfb1613d6ff121bd5e6.socket --brick-name
/rhs/brick2/2 -l /var/log/glusterfs/bricks/rhs-brick2-2.log --xlator-option
*-posix.glusterd-uuid=e4f737cd-59a2-4392-aa3d-4230f698f128 --brick-port 49154
--xlator-option 2-server.listen-port=49154
root      4719     1  0 15:56 ?        00:00:00 /usr/sbin/glusterfsd -s
10.70.35.45 --volfile-id 3.10.70.35.45.rhs-brick3-3 -p
/var/lib/glusterd/vols/3/run/10.70.35.45-rhs-brick3-3.pid -S
/var/run/gluster/af3354d92921146c0e8d3bebdcbec907.socket --brick-name
/rhs/brick3/3 -l /var/log/glusterfs/bricks/rhs-brick3-3.log --xlator-option
*-posix.glusterd-uuid=e4f737cd-59a2-4392-aa3d-4230f698f128 --brick-port 49155
--xlator-option 3-server.listen-port=49155
root      4728     1 44 15:56 ?        00:02:13 /usr/sbin/glusterfsd -s
10.70.35.45 --volfile-id 4.10.70.35.45.rhs-brick4-4 -p
/var/lib/glusterd/vols/4/run/10.70.35.45-rhs-brick4-4.pid -S
/var/run/gluster/cafb15e7ed1d462ddf513e7cf80ca718.socket --brick-name
/rhs/brick4/4 -l /var/log/glusterfs/bricks/rhs-brick4-4.log --xlator-option
*-posix.glusterd-uuid=e4f737cd-59a2-4392-aa3d-4230f698f128 --brick-port 49156
--xlator-option 4-server.listen-port=49156
root      4734     1  0 15:56 ?        00:00:00 /usr/sbin/glusterfsd -s
10.70.35.45 --volfile-id 5.10.70.35.45.rhs-brick5-5 -p
/var/lib/glusterd/vols/5/run/10.70.35.45-rhs-brick5-5.pid -S
/var/run/gluster/5a92ed518f554fe96a3c3f4a1ecf5cb3.socket --brick-name
/rhs/brick5/5 -l /var/log/glusterfs/bricks/rhs-brick5-5.log --xlator-option
*-posix.glusterd-uuid=e4f737cd-59a2-4392-aa3d-4230f698f128 --brick-port 49157
--xlator-option 5-server.listen-port=49157
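
For reference, a minimal sketch of how this state can be set up and checked
(commands are illustrative; cluster.brick-multiplex is the standard option,
while the volume names and counts simply mirror the steps above):

    # Enable brick multiplexing cluster-wide (applies to all volumes)
    gluster volume set all cluster.brick-multiplex on

    # With multiplexing working, each node should show exactly one
    # glusterfsd process hosting all of its local bricks:
    ps -ef | grep '[g]lusterfsd' | wc -l    # expected output: 1

    # After rebooting node1, the same check on that node reports one
    # glusterfsd per brick (as in the listing above), which is the bug.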

--- Additional comment from Worker Ant on 2017-05-16 07:05:13 EDT ---

REVIEW: https://review.gluster.org/17307 (glusterd: Don't spawn new glusterfsds
on node reboot with brick-mux) posted (#1) for review on master by Samikshan
Bairagya (samikshan at gmail.com)

--- Additional comment from Worker Ant on 2017-05-17 16:37:34 EDT ---

REVIEW: https://review.gluster.org/17307 (glusterd: Don't spawn new glusterfsds
on node reboot with brick-mux) posted (#2) for review on master by Samikshan
Bairagya (samikshan at gmail.com)

--- Additional comment from Worker Ant on 2017-05-18 07:56:46 EDT ---

REVIEW: https://review.gluster.org/17307 (glusterd: Don't spawn new glusterfsds
on node reboot with brick-mux) posted (#3) for review on master by Samikshan
Bairagya (samikshan at gmail.com)

--- Additional comment from Worker Ant on 2017-05-18 12:45:32 EDT ---

COMMIT: https://review.gluster.org/17307 committed in master by Jeff Darcy
(jeff at pl.atyp.us) 
------
commit 13e7b3b354a252ad4065f7b2f0f805c40a3c5d18
Author: Samikshan Bairagya <samikshan at gmail.com>
Date:   Tue May 16 15:07:21 2017 +0530

    glusterd: Don't spawn new glusterfsds on node reboot with brick-mux

    With brick multiplexing enabled, upon a node reboot new bricks were
    not being attached to the first spawned brick process even though
    there weren't any compatibility issues.

    The reason for this is that upon glusterd restart after a node
    reboot, since brick services aren't running, glusterd starts the
    bricks in a "no-wait" mode. So after a brick process is spawned for
    the first brick, there isn't enough time for the corresponding pid
    file to get populated with a value before the compatibility check is
    made for the next brick.

    This commit solves this by iteratively waiting for the pidfile to be
    populated in the brick compatibility comparison stage before checking
    if the brick process is alive.

    Change-Id: Ibd1f8e54c63e4bb04162143c9d70f09918a44aa4
    BUG: 1451248
    Signed-off-by: Samikshan Bairagya <samikshan at gmail.com>
    Reviewed-on: https://review.gluster.org/17307
    Reviewed-by: Atin Mukherjee <amukherj at redhat.com>
    Smoke: Gluster Build System <jenkins at build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
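
The fix described in the commit message amounts to polling the first brick's
pidfile until it is populated, and only then checking whether that process is
alive before attaching further bricks to it. A shell sketch of that logic
(illustrative only: the real change is in glusterd's C code, the pidfile path
is copied from the listing above, and the retry budget is made up):

    pidfile=/var/lib/glusterd/vols/1/run/10.70.35.45-rhs-brick1-1.pid
    tries=30                            # hypothetical retry budget
    while [ ! -s "$pidfile" ] && [ "$tries" -gt 0 ]; do
        sleep 0.1                       # give the brick process time to
        tries=$((tries - 1))            # write its pid into the pidfile
    done
    if [ -s "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        echo "first brick process is alive; attach this brick to it"
    else
        echo "could not confirm the process; spawn a separate glusterfsd"
    fi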


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1450889
[Bug 1450889] Brick Multiplexing: On reboot of a node Brick multiplexing
feature lost on that node as multiple brick processes get spawned
https://bugzilla.redhat.com/show_bug.cgi?id=1451248
[Bug 1451248] Brick Multiplexing: On reboot of a node Brick multiplexing
feature lost on that node as multiple brick processes get spawned
https://bugzilla.redhat.com/show_bug.cgi?id=1453086
[Bug 1453086] Brick Multiplexing: On reboot of a node Brick multiplexing
feature lost on that node as multiple brick processes get spawned
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.

