[Bugs] [Bug 1191537] With afrv2 + ext4, lookups on directories with large offsets could result in duplicate/missing entries

bugzilla at redhat.com bugzilla at redhat.com
Wed Mar 4 07:38:47 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1191537



--- Comment #2 from Anand Avati <aavati at redhat.com> ---
COMMIT: http://review.gluster.org/9638 committed in release-3.6 by Raghavendra
Bhat (raghavendra at redhat.com) 
------
commit f396e475417aa52daf49e4564c67628cc8f0e598
Author: Anand Avati <avati at redhat.com>
Date:   Tue Dec 23 10:04:00 2014 -0800

    afr: stop encoding subvolume id in readdir d_off

    Backport of http://review.gluster.org/9332

    The purpose of encoding d_off in AFR is to indicate the
    selected subvolume for the first readdir, and continue all
    further readdirs of the session on the same subvolume. This is
    required because, unlike files, dir d_offs are specific to the
    backend and cannot be re-used on another subvolume. The d_off
    transformation encodes the subvolume id and prevents such
    invalid use of d_offs on other servers.
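
To make the bit-space cost concrete, here is a minimal sketch of one way a
subvolume id can be folded into a d_off. It is an illustration only, not the
transformation AFR actually used: the names doff_encode and doff_decode and
the multiply-and-add scheme are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding: encoded = backend_off * child_count + child.
 * Returns false when backend_off is too large to carry the subvolume id
 * without overflowing 64 bits. */
bool doff_encode(uint64_t backend_off, unsigned child, unsigned child_count,
                 uint64_t *encoded)
{
    assert(child_count > 0 && child < child_count);
    if (backend_off > (UINT64_MAX - child) / child_count)
        return false; /* no room left for the id */
    *encoded = backend_off * child_count + child;
    return true;
}

/* Inverse of doff_encode: recover the backend offset and the subvolume. */
void doff_decode(uint64_t encoded, unsigned child_count,
                 uint64_t *backend_off, unsigned *child)
{
    assert(child_count > 0);
    *backend_off = encoded / child_count;
    *child = (unsigned)(encoded % child_count);
}
```

With two subvolumes, any backend offset at or above 2^63 cannot be encoded in
this sketch. That is the kind of pressure a backend with large directory
offsets puts on a scheme that spends d_off bits on the subvolume id.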

    However, this approach could be quite wasteful of precious d_off
    bit-space. Unlike DHT, where server id can change from entry to
    entry and thus encoding the server id in the transformed d_off
    is necessary, we could take a slightly relaxed approach in AFR.
    The approach is to save the subvolume where the last readdir
    request was sent in the fd_ctx. This consumes constant space (i.e.,
    no per-entry cache) and serves the purpose of avoiding d_off
    "misuse" (i.e., using a d_off from one server on another).
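
The fd_ctx approach described above can be sketched as follows. This is a
simplified model, not the AFR code: struct readdir_fd_ctx, NO_SUBVOL and
readdir_pick_subvol are hypothetical names, and the sketch assumes the saved
subvolume is still reachable.

```c
#include <assert.h>
#include <stdint.h>

#define NO_SUBVOL (-1)

/* Hypothetical per-fd context: one integer, no per-entry cache. */
struct readdir_fd_ctx {
    int last_subvol; /* subvolume of the previous readdir, or NO_SUBVOL */
};

void readdir_fd_ctx_init(struct readdir_fd_ctx *ctx)
{
    ctx->last_subvol = NO_SUBVOL;
}

/* offset:     d_off passed by the caller (0 means start of directory).
 * fresh_pick: subvolume the read policy would choose right now.
 * A fresh subvolume is taken only at offset 0 or when nothing was saved;
 * otherwise the session stays on the subvolume that produced the d_off. */
int readdir_pick_subvol(struct readdir_fd_ctx *ctx, uint64_t offset,
                        int fresh_pick)
{
    assert(fresh_pick >= 0);
    if (offset == 0 || ctx->last_subvol == NO_SUBVOL)
        ctx->last_subvol = fresh_pick;
    return ctx->last_subvol;
}
```

A readdir at a non-zero offset therefore never moves to another subvolume
while the context survives, so a d_off is only handed back to the server that
issued it.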

    The trade-off is the case of NFS resuming readdir from a non-zero
    cookie after an extended delay (either the anonymous FD has been
    reclaimed, or the server has restarted). In such cases a subvolume
    is picked afresh. To make this fresh pick more deterministic (i.e.,
    to pick the same subvolume whenever possible, even after reboots),
    the function afr_hash_child (used by afr_read_subvol_select_by_policy)
    is modified to skip all dynamic inputs (i.e., the PID) for
    directories.
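
The idea of skipping dynamic inputs for directories can be sketched as below.
This is not the body of afr_hash_child: the function hash_child, its
parameters, and the use of FNV-1a as the hash are assumptions chosen to keep
the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a, continuing from a running hash value h. */
static uint32_t fnv1a(uint32_t h, const void *data, size_t len)
{
    const uint8_t *p = data;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u; /* FNV prime */
    }
    return h;
}

/* Hypothetical selection: hash the gfid, and mix in the caller's PID only
 * for non-directories. For directories the result depends on nothing but
 * the gfid and the subvolume count. */
unsigned hash_child(const uint8_t gfid[16], int32_t pid, int is_dir,
                    unsigned child_count)
{
    assert(child_count > 0);
    uint32_t h = fnv1a(2166136261u, gfid, 16); /* FNV offset basis */
    if (!is_dir)
        h = fnv1a(h, &pid, sizeof pid);
    return h % child_count;
}
```

For a directory, two callers with different PIDs get the same subvolume, as
does the same caller after a restart, provided the subvolume count is
unchanged.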

    BUG: 1191537
    Change-Id: I7e3bd8dfe346a9a8e428d7ddeada6cfb66e64e54
    Signed-off-by: Anand Avati <avati at redhat.com>
    Reviewed-on: http://review.gluster.org/9638
    Tested-by: Gluster Build System <jenkins at build.gluster.com>
    Reviewed-by: Raghavendra Bhat <raghavendra at redhat.com>


