[Bugs] [Bug 1142612] files with open fd's getting into split-brain when bricks goes offline and comes back online

bugzilla at redhat.com bugzilla at redhat.com
Tue Sep 30 06:30:35 UTC 2014


https://bugzilla.redhat.com/show_bug.cgi?id=1142612



--- Comment #3 from Anand Avati <aavati at redhat.com> ---
COMMIT: http://review.gluster.org/8756 committed in release-3.6 by Vijay Bellur
(vbellur at redhat.com) 
------
commit 86b4c0319d4275859575720eced3200583942cfb
Author: Pranith Kumar K <pkarampu at redhat.com>
Date:   Wed Sep 17 11:33:23 2014 +0530

    cluster/afr: Launch self-heal only when all the brick status is known

    Problem:
    A file goes into split-brain because its xattrs are erased incorrectly.

    RCA:
    The issue happens because index self-heal is triggered even before all the
    bricks are up. So what ends up happening while erasing the xattrs is that
    the xattrs are erased only on the sink brick, for the brick that it thinks
    is up, leading to split-brain.

    Example:
    Let's say the xattrs before the heal started are:
    brick 2:
    trusted.afr.vol1-client-2=0x000000020000000000000000
    trusted.afr.vol1-client-3=0x000000020000000000000000

    brick 3:
    trusted.afr.vol1-client-2=0x000010040000000000000000
    trusted.afr.vol1-client-3=0x000000000000000000000000

    If only brick-2 had come up when the self-heal was triggered, only
    'trusted.afr.vol1-client-2' is erased, leading to the following xattrs:

    brick 2:
    trusted.afr.vol1-client-2=0x000000000000000000000000
    trusted.afr.vol1-client-3=0x000000020000000000000000

    brick 3:
    trusted.afr.vol1-client-2=0x000010040000000000000000
    trusted.afr.vol1-client-3=0x000000000000000000000000

    So the file goes into split-brain.

    BUG: 1142612
    Change-Id: I0c8b66e154f03b636db052c97745399a7cca265b
    Signed-off-by: Pranith Kumar K <pkarampu at redhat.com>
    Reviewed-on: http://review.gluster.org/8756
    Tested-by: Gluster Build System <jenkins at build.gluster.com>
    Reviewed-by: Krutika Dhananjay <kdhananj at redhat.com>
    Reviewed-by: Vijay Bellur <vbellur at redhat.com>
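
The xattr values in the example above can be decoded to see why the second
state is a split-brain. The following is a minimal Python sketch, not AFR's
actual code. It assumes each trusted.afr.* value is 12 bytes holding three
big-endian 32-bit pending counters (data, metadata, entry), and it uses a
simplified rule: two bricks are in data split-brain when each blames the other
and neither blames itself. The function names and the dictionary layout are
hypothetical and exist only for illustration.

```python
import struct


def decode_afr_xattr(hex_value):
    """Return (data, metadata, entry) pending counts from a changelog value.

    Assumption: the value is 12 bytes, three big-endian unsigned 32-bit ints.
    """
    text = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError("expected a 12-byte changelog value")
    return struct.unpack(">III", raw)


def blames(xattrs, accuser, accused):
    """True if `accuser` records pending data operations against `accused`."""
    data_pending, _, _ = decode_afr_xattr(xattrs[accuser][accused])
    return data_pending > 0


def is_data_split_brain(xattrs, a, b):
    """Simplified rule (an assumption, not AFR's real algorithm):
    both bricks blame each other and neither blames itself."""
    mutual = blames(xattrs, a, b) and blames(xattrs, b, a)
    self_blame = blames(xattrs, a, a) or blames(xattrs, b, b)
    return mutual and not self_blame


# xattrs[brick][target] copied from the commit message example.
BEFORE_HEAL = {
    "client-2": {
        "client-2": "0x000000020000000000000000",
        "client-3": "0x000000020000000000000000",
    },
    "client-3": {
        "client-2": "0x000010040000000000000000",
        "client-3": "0x000000000000000000000000",
    },
}

# Same state after only trusted.afr.vol1-client-2 was erased on brick 2.
AFTER_PARTIAL_HEAL = {
    "client-2": {
        "client-2": "0x000000000000000000000000",
        "client-3": "0x000000020000000000000000",
    },
    "client-3": dict(BEFORE_HEAL["client-3"]),
}

if __name__ == "__main__":
    for label, state in (("before", BEFORE_HEAL), ("after", AFTER_PARTIAL_HEAL)):
        print(label, is_data_split_brain(state, "client-2", "client-3"))
```

Under this simplified rule the first state is not a split-brain, because
brick 2 still blames itself and brick 3 can act as the source. In the second
state brick 2 no longer blames itself but still blames brick 3, while brick 3
blames brick 2, so neither copy can be chosen as the source.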
