[Bugs] [Bug 1330855] A replicated volume takes too much to come online when one server is down

Wed Apr 27 08:40:48 UTC 2016

https://bugzilla.redhat.com/show_bug.cgi?id=1330855


--- Comment #2 from Vijay Bellur <vbellur at redhat.com> ---
COMMIT: http://review.gluster.org/14088 committed in release-3.7 by Pranith
Kumar Karampuri (pkarampu at redhat.com) 
------
commit 17ddeb5cfed4029db65d6432511ddff28c866129
Author: Ravishankar N <ravishankar at redhat.com>
Date:   Wed Dec 23 13:49:14 2015 +0530

    afr: propagate child up event after timeout

    Backport of: http://review.gluster.org/11113

    Problem: During mount, afr waits for a response from all its children before
    notifying the parent xlator. In a 1x2 replica volume, if one of the nodes is
    down, the mount will hang for more than a minute until child down is received
    from the client xlator for that node.

    Fix:
    When parent up is received by afr, start a 10 second timer. In the timer
    callback, if we have received a successful child up from at least one brick,
    propagate the event to the parent xlator.

    Change-Id: I31e57c8802c1a03a4a5d581ee4ab82f3a9c8799d
    BUG: 1330855
    Signed-off-by: Ravishankar N <ravishankar at redhat.com>
    Signed-off-by: Pranith Kumar K <pkarampu at redhat.com>
    Reviewed-on: http://review.gluster.org/14088
    Smoke: Gluster Build System <jenkins at build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.com>
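
The timer behaviour described in the commit message can be illustrated with a
small simulation. This is only a sketch under stated assumptions, not the afr
code from the patch: the class ReplicaNotifier, its method names, and the
explicit "now" arguments are hypothetical, and time is passed in by hand so
that the example stays deterministic. The choice to propagate child down once
every child has answered and none is up is also an assumption. What happens
when the timer fires with no child up is not stated in the commit message, so
the sketch simply keeps waiting in that case.

```python
from dataclasses import dataclass, field
from typing import List, Optional

CHILD_UP = "CHILD_UP"
CHILD_DOWN = "CHILD_DOWN"


@dataclass
class ReplicaNotifier:
    """Hypothetical model of a replica translator with N children (bricks)."""

    child_count: int
    timeout_s: float = 10.0  # the 10 second timer from the commit message
    # Per child: None = no response yet, True = child up, False = child down.
    child_state: List[Optional[bool]] = field(default_factory=list)
    deadline: Optional[float] = None
    propagated: Optional[str] = None  # event already sent to the parent

    def __post_init__(self) -> None:
        self.child_state = [None] * self.child_count

    def on_parent_up(self, now: float) -> None:
        # Start the timer when parent up is received.
        self.deadline = now + self.timeout_s

    def on_child_event(self, idx: int, up: bool, now: float) -> None:
        self.child_state[idx] = up
        # Pre-existing behaviour: once every child has answered, notify.
        if self.propagated is None and all(
            s is not None for s in self.child_state
        ):
            # Assumption: child down is sent only if no child came up.
            self.propagated = CHILD_UP if any(self.child_state) else CHILD_DOWN

    def on_timer(self, now: float) -> None:
        # Timer callback: if at least one child is up when the timer
        # expires, propagate child up without waiting for the rest.
        if self.propagated is not None or self.deadline is None:
            return
        if now >= self.deadline and any(s is True for s in self.child_state):
            self.propagated = CHILD_UP
```

In a 1x2 volume with one brick down, calling on_parent_up(0.0) and then
on_child_event(0, True, 1.0) leaves nothing propagated until on_timer(10.0)
runs, at which point child up is sent to the parent instead of waiting for the
second brick to report child down.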
