[Bugs] [Bug 1142614] files with open fd's getting into split-brain when bricks goes offline and comes back online

bugzilla at redhat.com bugzilla at redhat.com
Wed Oct 1 07:14:31 UTC 2014


https://bugzilla.redhat.com/show_bug.cgi?id=1142614



--- Comment #3 from Anand Avati <aavati at redhat.com> ---
COMMIT: http://review.gluster.org/8757 committed in release-3.5 by Niels de Vos
(ndevos at redhat.com) 
------
commit bee0c740b54669a8be11acea405d021bb50d3c54
Author: Pranith Kumar K <pkarampu at redhat.com>
Date:   Wed Sep 17 11:48:24 2014 +0530

    cluster/afr: Launch self-heal only when all the brick status is known

    Problem:
    A file goes into split-brain because its xattrs are erased incorrectly.

    RCA:
    The issue happens because index self-heal is triggered even before all the
    bricks are up. As a result, when the xattrs are erased, they are erased
    only on the sink brick, for the brick that it thinks is up, leading to
    split-brain.

    Example:
    Let's say the xattrs before the heal started are:
    brick 2:
    trusted.afr.vol1-client-2=0x000000020000000000000000
    trusted.afr.vol1-client-3=0x000000020000000000000000

    brick 3:
    trusted.afr.vol1-client-2=0x000010040000000000000000
    trusted.afr.vol1-client-3=0x000000000000000000000000

    If only brick-2 had come up when the self-heal was triggered, only
    'trusted.afr.vol1-client-2' is erased, leading to the following xattrs:

    brick 2:
    trusted.afr.vol1-client-2=0x000000000000000000000000
    trusted.afr.vol1-client-3=0x000000020000000000000000

    brick 3:
    trusted.afr.vol1-client-2=0x000010040000000000000000
    trusted.afr.vol1-client-3=0x000000000000000000000000

    So the file goes into split-brain.

    Change-Id: I79f9a289d2118a715d262398221037b684a53d2a
    BUG: 1142614
    Signed-off-by: Pranith Kumar K <pkarampu at redhat.com>
    Reviewed-on: http://review.gluster.org/8757
    Tested-by: Gluster Build System <jenkins at build.gluster.com>
    Reviewed-by: Krutika Dhananjay <kdhananj at redhat.com>
    Reviewed-by: Niels de Vos <ndevos at redhat.com>
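
The xattr example in the commit message can be checked with a small sketch. It assumes that each trusted.afr.* value is 12 bytes holding three big-endian 32-bit pending counters (data, metadata, entry), that "brick 2" and "brick 3" correspond to vol1-client-2 and vol1-client-3, and that only the data counter matters here. The innocent/fool/wise classification is a simplified model of the older AFR self-heal logic, not the actual implementation; all function names are hypothetical.

```python
import struct


def decode_changelog(value):
    """Decode a 12-byte AFR changelog xattr into (data, metadata, entry) counts."""
    raw = bytes.fromhex(value[2:] if value.startswith("0x") else value)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    return struct.unpack(">III", raw)  # three big-endian 32-bit counters


def data_pending(xattrs, brick, target):
    """Data-pending count that `brick` records against `target`."""
    return decode_changelog(xattrs[brick][target])[0]


def classify(xattrs, bricks):
    """Label each brick (simplified model, data counter only):
    innocent = accuses nobody, fool = accuses itself, wise = accuses only others."""
    roles = {}
    for brick in bricks:
        pending = {t: data_pending(xattrs, brick, t) for t in bricks}
        if not any(pending.values()):
            roles[brick] = "innocent"
        elif pending[brick]:
            roles[brick] = "fool"
        else:
            roles[brick] = "wise"
    return roles


def is_split_brain(xattrs, bricks):
    """Split-brain in this model: one wise brick accuses another wise brick."""
    roles = classify(xattrs, bricks)
    wise = [b for b in bricks if roles[b] == "wise"]
    return any(
        a != b and data_pending(xattrs, a, b) > 0 for a in wise for b in wise
    )


BRICKS = ["vol1-client-2", "vol1-client-3"]

# xattrs before the heal started (values copied from the commit message)
BEFORE = {
    "vol1-client-2": {
        "vol1-client-2": "0x000000020000000000000000",
        "vol1-client-3": "0x000000020000000000000000",
    },
    "vol1-client-3": {
        "vol1-client-2": "0x000010040000000000000000",
        "vol1-client-3": "0x000000000000000000000000",
    },
}

# xattrs after only 'trusted.afr.vol1-client-2' was erased on brick 2
AFTER = {
    "vol1-client-2": {
        "vol1-client-2": "0x000000000000000000000000",
        "vol1-client-3": "0x000000020000000000000000",
    },
    "vol1-client-3": BEFORE["vol1-client-3"],
}

if __name__ == "__main__":
    print("before:", classify(BEFORE, BRICKS), is_split_brain(BEFORE, BRICKS))
    print("after: ", classify(AFTER, BRICKS), is_split_brain(AFTER, BRICKS))
```

Under these assumptions, brick 2 starts out accusing itself, so brick 3 is the only candidate source. Once brick 2's self-accusation is erased while its accusation of brick 3 remains, each brick blames only the other, which is the split-brain state the commit describes.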
