[Bugs] [Bug 1408786] with granular-entry-self-heal enabled i see that there is a gfid mismatch and vm goes to paused state after migrating to another host

Thu Dec 29 15:09:13 UTC 2016

https://bugzilla.redhat.com/show_bug.cgi?id=1408786


--- Comment #2 from Worker Ant <bugzilla-bot at gluster.org> ---
COMMIT: http://review.gluster.org/16293 committed in release-3.8 by Pranith Kumar Karampuri (pkarampu at redhat.com)
------
commit 8e2eaa6ea495e151adf1eca9cdd17f0a9f1a1bfc
Author: Krutika Dhananjay <kdhananj at redhat.com>
Date:   Mon Dec 26 21:08:03 2016 +0530

    cluster/afr: Fix missing name indices due to EEXIST error

    Backport of: http://review.gluster.org/16286

    PROBLEM:
    Consider a volume with granular-entry-heal and sharding enabled. When
    a replica is down and a shard is created as part of a write, the name
    index is correctly created under indices/entry-changes/<dot-shard-gfid>.
    Now when a read on the same region triggers another MKNOD, the fop
    fails on the online bricks with EEXIST. By virtue of this being a
    symmetric error, the failed_subvols[] array is reset to all zeroes.
    Because of this, before post-op, the GF_XATTROP_ENTRY_OUT_KEY will be
    set, causing the name index, which was created in the previous MKNOD
    operation, to be wrongly deleted in THIS MKNOD operation.

    FIX:
    The ideal fix would have been for a transaction to delete the name
    index ONLY if it knows it is the one that created the index in the first
    place. This would involve gathering information from individual bricks
    as to whether THIS xattrop created the index, aggregating their responses
    and, based on the various possible combinations of responses, deciding
    whether to delete the index or not. This is rather complex. A simpler fix
    is for post-op to examine local->op_ret in the event of no failed_subvols
    to figure out whether to delete the name index or not. This can
    occasionally lead to creation of stale name indices, but they won't affect
    the IO path or mess with pending changelogs in any way, and self-heal, in
    its crawl of the "entry-changes" directory, would take care to delete such
    indices.

    Change-Id: Icc642a987d1b6a5097562315aecf1263ed35ceb6
    BUG: 1408786
    Signed-off-by: Krutika Dhananjay <kdhananj at redhat.com>
    Reviewed-on: http://review.gluster.org/16293
    Smoke: Gluster Build System <jenkins at build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu at redhat.com>
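
The decision described under PROBLEM and FIX can be sketched roughly as below. This is a simplified model, not the actual AFR code: struct txn_result and both helper functions are hypothetical names made up for illustration, and reading "examine local->op_ret" as "delete the index only when op_ret is non-negative" is an assumption drawn from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the transaction state seen by post-op.
 * It only mirrors the two fields the commit message talks about. */
struct txn_result {
    int op_ret;                  /* 0 if the fop succeeded, -1 if it failed */
    size_t child_count;          /* number of replica bricks */
    const bool *failed_subvols;  /* true where the fop is marked as failed */
};

/* True if at least one brick is still marked as failed. */
static bool any_failed(const struct txn_result *r)
{
    for (size_t i = 0; i < r->child_count; i++) {
        if (r->failed_subvols[i])
            return true;
    }
    return false;
}

/* Behaviour described under PROBLEM: a symmetric error such as EEXIST
 * resets failed_subvols[] to all zeroes, so the index gets deleted. */
static bool delete_name_index_before_fix(const struct txn_result *r)
{
    return !any_failed(r);
}

/* Behaviour described under FIX (assumed interpretation): with no
 * failed_subvols, also require that the fop itself succeeded. */
static bool delete_name_index_after_fix(const struct txn_result *r)
{
    return !any_failed(r) && r->op_ret >= 0;
}
```

In this model, the second MKNOD from the PROBLEM scenario (op_ret of -1, failed_subvols all false) makes delete_name_index_before_fix() return true and delete_name_index_after_fix() return false, so the index created by the first MKNOD is kept.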
