[Bugs] [Bug 1473141] cluster/dht: Fix hardlink migration failures

Fri Aug 11 20:03:49 UTC 2017

https://bugzilla.redhat.com/show_bug.cgi?id=1473141


--- Comment #4 from Worker Ant <bugzilla-bot at gluster.org> ---
COMMIT: https://review.gluster.org/17838 committed in release-3.10 by Shyamsundar Ranganathan (srangana at redhat.com)
------
commit e0cd91f14eebee77c8ed332cedfd25547daa01d7
Author: Susant Palai <spalai at redhat.com>
Date:   Wed Jul 12 12:01:40 2017 +0530

    cluster/rebalance: Fix hardlink migration failures

    A brief overview of how hardlink migration works:
      - Different hardlinks (to the same file) may hash to different bricks,
        but their cached subvol will be the same. Rebalance picks up the first
        hardlink, computes its hashed subvolume (call it TARGET), and stores
        that subvolume as an xattr on the data file.
      - All hardlinks that come after this fetch that xattr and create linkto
        files on TARGET (all linkto files for the hardlinks will be hardlinks
        to each other on TARGET).
      - When the number of hardlinks on the source is equal to the number of
        hardlinks on TARGET, the data migration happens.
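
The three steps above can be sketched as a toy in-memory model. This is only
an illustration, not GlusterFS code: DataFile, hashed_subvol and
migrate_hardlink are hypothetical names, and the hash is an arbitrary
stand-in for the DHT hash.

```python
# Toy model of hardlink migration; none of these names exist in GlusterFS.

class DataFile:
    def __init__(self, cached_subvol, hardlinks):
        self.cached_subvol = cached_subvol  # subvol currently holding the data
        self.hardlinks = list(hardlinks)    # hardlink names on the source
        self.target_xattr = None            # TARGET, recorded by the first hardlink
        self.linktos = set()                # hardlinks that have a linkto on TARGET
        self.migrated = False


def hashed_subvol(name, subvols):
    # Stand-in for the DHT hash: deterministic, otherwise arbitrary.
    return subvols[sum(name.encode()) % len(subvols)]


def migrate_hardlink(f, name, subvols):
    # Step 1: the first hardlink decides TARGET and records it on the data file.
    if f.target_xattr is None:
        f.target_xattr = hashed_subvol(name, subvols)
    # Step 2: every hardlink gets a linkto file on TARGET.
    f.linktos.add(name)
    # Step 3: data moves only once the link counts on both sides match.
    if len(f.linktos) == len(f.hardlinks):
        f.cached_subvol = f.target_xattr
        f.migrated = True
    return f.migrated
```

In this model the data moves only when the last hardlink is processed; the
earlier calls just accumulate linkto files on TARGET.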

    RACE:1
      Since rebalance is multi-threaded, the first lookup (which decides where
    the TARGET subvol should be) can be issued by two hardlink migrations in
    parallel, and they may end up creating linkto files on two different TARGET
    subvols. As a result, the hardlinks won't be migrated.

    Fix: Rely on the xattr response of the lookup inside
    gf_defrag_handle_hardlink, since it is executed under synclock.
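
The idea behind this fix can be sketched as follows. This is a simplified
model, not the actual patch: handle_hardlink and sync_lock are hypothetical
stand-ins for gf_defrag_handle_hardlink and its synclock, and the assumption
is that TARGET is read (and, if absent, set) only inside the locked section.

```python
import threading

# Hypothetical stand-in for the synclock that serializes hardlink handling.
sync_lock = threading.Lock()


class DataFile:
    def __init__(self):
        self.target_xattr = None  # TARGET recorded on the data file


def handle_hardlink(f, my_hashed_subvol):
    # The TARGET decision is made from the value seen under the lock, so only
    # the first thread to enter can set it; all later threads reuse it.
    with sync_lock:
        if f.target_xattr is None:
            f.target_xattr = my_hashed_subvol
        return f.target_xattr


def run(n_threads=8):
    f = DataFile()
    chosen = [None] * n_threads

    def worker(i):
        # Each hardlink may hash to a different subvol.
        chosen[i] = handle_hardlink(f, f"subvol-{i % 3}")

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return f, chosen
```

Whichever thread wins, every worker ends up with the same TARGET, so linkto
files are never spread across two subvols.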

    RACE:2
      The linkto files on TARGET can also be created by other clients if they
    are doing lookups on the hardlinks. Consider a scenario with 100
    hardlinks. While rebalance is migrating the 99th hardlink, continuous
    lookups from another client may bring the linkcount on TARGET up to the
    source linkcount. Rebalance will then migrate the data on the 99th hardlink
    itself. When the 100th hardlink is migrated, it will have TARGET as its
    cached subvolume. If its hashed subvolume is also TARGET, a migration will
    be triggered from TARGET to TARGET, leading to data loss.

    Fix: Before the final data migration, make sure the source is not the
    same as the destination.
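
A minimal sketch of such a guard, assuming the decision can be reduced to
comparing subvolume names and link counts. should_migrate_data and its
parameters are hypothetical and do not appear in the patch.

```python
def should_migrate_data(cached_subvol, target_subvol, src_links, dst_links):
    # Guard for RACE:2: the data already lives on TARGET, so a "migration"
    # would copy the file onto itself. Skip it.
    if cached_subvol == target_subvol:
        return False
    # Otherwise migrate only once every hardlink has a linkto on TARGET.
    return src_links == dst_links
```

For the 100th hardlink in the scenario above, cached_subvol and
target_subvol are both TARGET, so the guard returns False and no TARGET to
TARGET migration is attempted.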

    RACE:3
      Since a hardlink can be migrating to a non-hashed subvolume, a lookup
    from another client, or even from rebalance itself, might delete the linkto
    file on TARGET, leading to hardlinks never getting migrated.

    This will be addressed in a separate, future patch.

    > Change-Id: If0f6852f0e662384ee3875a2ac9d19ac4a6cea98
    > BUG: 1469964
    > Signed-off-by: Susant Palai <spalai at redhat.com>
    > Reviewed-on: https://review.gluster.org/17755
    > Smoke: Gluster Build System <jenkins at build.gluster.org>
    > CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
    > Reviewed-by: N Balachandran <nbalacha at redhat.com>
    > Reviewed-by: Raghavendra G <rgowdapp at redhat.com>
    > Signed-off-by: Susant Palai <spalai at redhat.com>

    Change-Id: If0f6852f0e662384ee3875a2ac9d19ac4a6cea98
    BUG: 1473141
    Signed-off-by: Susant Palai <spalai at redhat.com>
    Reviewed-on: https://review.gluster.org/17838
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
    Smoke: Gluster Build System <jenkins at build.gluster.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana at redhat.com>
