[Bugs] [Bug 1457981] client fails to connect to the brick due to an incorrect port reported back by glusterd

bugzilla at redhat.com bugzilla at redhat.com
Tue Jun 6 02:24:31 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1457981



--- Comment #5 from Worker Ant <bugzilla-bot at gluster.org> ---
COMMIT: https://review.gluster.org/17447 committed in master by Jeff Darcy
(jeff at pl.atyp.us) 
------
commit 7b58ec260152bdcf840ac622dbb883ce8b593f63
Author: Atin Mukherjee <amukherj at redhat.com>
Date:   Thu Jun 1 22:05:51 2017 +0530

    glusterd: fix brick start race

    This commit tries to handle a race in which glusterd may end up
    spawning the brick process twice with two different ports. The glusterd
    portmapper then holds the same brick entry under two different ports,
    and clients fail to connect to the brick because glusterd reports an
    incorrect port back to them.

    In glusterd_brick_start (), checking the brickinfo->status flag is not
    sufficient to tell whether glusterd has already started a brick. When
    glusterd restarts, glusterd_restart_bricks () is called through
    glusterd_spawn_daemons () in a synctask, and immediately afterwards
    glusterd_do_volume_quorum_action (), with server-side-quorum set to on,
    tries to start the brick again. If glusterd has not yet processed the
    RPC_CLNT_CONNECT event for that brick, brickinfo->status is still
    GF_BRICK_STOPPED, so the brick start is reattempted with a different
    port. That corrupts the portmap, and clients fetch an incorrect port.
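
The race described above can be sketched as a small model. This is a
simplified illustration, not glusterd code: the types brick_status_t and
brickinfo_t, the helper names, and the starting port number are
hypothetical stand-ins, and only the GF_BRICK_* names and the
RPC_CLNT_CONNECT event come from the commit message.

```c
/* Hypothetical, simplified stand-ins for the glusterd types. */
typedef enum { GF_BRICK_STOPPED, GF_BRICK_STARTED } brick_status_t;

typedef struct {
    brick_status_t status;
    int port;         /* port given to the most recently spawned process */
    int spawn_count;  /* how many times a brick process was spawned */
} brickinfo_t;

/* Pre-fix behaviour: only the status flag decides whether to spawn. */
static int brick_start(brickinfo_t *b, int *next_port)
{
    if (b->status == GF_BRICK_STARTED)
        return 0;                 /* already running, nothing to do */
    b->port = (*next_port)++;     /* a fresh port for every spawn */
    b->spawn_count++;
    /* status stays GF_BRICK_STOPPED until the connect event arrives */
    return 1;
}

/* Models the handling of RPC_CLNT_CONNECT for the brick. */
static void on_rpc_connect(brickinfo_t *b)
{
    b->status = GF_BRICK_STARTED;
}

/* Two start attempts arrive before the connect event is processed. */
static int demo_race_spawn_count(void)
{
    int next_port = 49152;        /* illustrative starting port */
    brickinfo_t b = { GF_BRICK_STOPPED, 0, 0 };

    brick_start(&b, &next_port);  /* restart path */
    brick_start(&b, &next_port);  /* quorum path, status still STOPPED */
    on_rpc_connect(&b);
    return b.spawn_count;
}
```

In this model demo_race_spawn_count () returns 2: both callers see
GF_BRICK_STOPPED, so the brick is spawned twice and ends up with two
different ports.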

    The fix is to introduce another enum value, GF_BRICK_STARTING, for
    brickinfo->status. It is set when glusterd attempts a brick start, and
    the status moves to started on the RPC_CLNT_CONNECT event. For brick
    multiplexing, brickinfo->status is marked as started directly on an
    attach brick request, so this value has no effect there. This patch
    also removes the started_here flag, as it looks redundant with
    brickinfo->status.
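
A minimal sketch of the idea behind the fix, under stated assumptions:
brick_status_t, brickinfo_t, the helper names, the enum ordering, and the
starting port number are hypothetical and do not reproduce the actual
glusterd definitions. Only the GF_BRICK_* names and the RPC_CLNT_CONNECT
event are taken from the commit message.

```c
/* Hypothetical, simplified stand-ins for the glusterd types. */
typedef enum {
    GF_BRICK_STOPPED,
    GF_BRICK_STARTED,
    GF_BRICK_STARTING  /* start attempted, connect event not yet processed */
} brick_status_t;

typedef struct {
    brick_status_t status;
    int port;         /* port given to the spawned brick process */
    int spawn_count;  /* how many times a brick process was spawned */
} brickinfo_t;

/* Spawn only from the STOPPED state and record the attempt at once. */
static int brick_start(brickinfo_t *b, int *next_port)
{
    if (b->status != GF_BRICK_STOPPED)
        return 0;                    /* STARTING or STARTED: skip */
    b->port = (*next_port)++;
    b->spawn_count++;
    b->status = GF_BRICK_STARTING;   /* closes the window for a second spawn */
    return 1;
}

/* Models the handling of RPC_CLNT_CONNECT for the brick. */
static void on_rpc_connect(brickinfo_t *b)
{
    b->status = GF_BRICK_STARTED;
}

/* Same sequence as the race: two start attempts before the connect event. */
static int demo_fixed_spawn_count(void)
{
    int next_port = 49152;           /* illustrative starting port */
    brickinfo_t b = { GF_BRICK_STOPPED, 0, 0 };

    brick_start(&b, &next_port);     /* restart path */
    brick_start(&b, &next_port);     /* quorum path, sees STARTING */
    on_rpc_connect(&b);
    return b.spawn_count;
}
```

In this model demo_fixed_spawn_count () returns 1: the second attempt
sees GF_BRICK_STARTING and does not spawn again, so the brick keeps a
single port.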

    Change-Id: I9dda1a9a531b67734a6e8c7619677867b520dcb2
    BUG: 1457981
    Signed-off-by: Atin Mukherjee <amukherj at redhat.com>
    Reviewed-on: https://review.gluster.org/17447
    Smoke: Gluster Build System <jenkins at build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
    Reviewed-by: Jeff Darcy <jeff at pl.atyp.us>
