[Bugs] [Bug 1191030] Use rcu to protect concurrent access to data structures in GlusterD

bugzilla at redhat.com bugzilla at redhat.com
Mon Mar 16 09:19:18 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1191030



--- Comment #27 from Anand Avati <aavati at redhat.com> ---
COMMIT: http://review.gluster.org/9695 committed in master by Krishnan
Parthasarathi (kparthas at redhat.com) 
------
commit c7785f78420c94220954eef538ed4698713ebcdb
Author: Kaushal M <kaushal at redhat.com>
Date:   Thu Jan 8 19:24:59 2015 +0530

    glusterd: Protect the peer list and peerinfos with RCU.

    The peer list and the peerinfo objects are now protected using RCU.
    Design patterns described in Paul McKenney's RCU dissertation [1]
    (sections 5 and 6) have been used to convert existing non-RCU protected
    code to RCU protected code.

    Currently, we only aim to guarantee the existence of the peerinfo
    objects, i.e., we are only looking to protect deletes, not all
    updaters. We chose this because protecting all updates is a much
    more complex task.

    The steps used to accomplish this are:

    1. Remove all long lived direct references to peerinfo objects (apart
    from the peerinfo list). This includes references in glusterd_peerctx_t
    (RPC), glusterd_friend_sm_event_t (friend state machine) and others.
    This way no one holds a reference to a deleted peerinfo object.

    2. Replace the direct references with indirect references, i.e., use
    peer uuid and peer hostname as indirect references to the peerinfo
    object. Any reader or updater now uses the indirect references to get to
    the actual peerinfo object, using glusterd_peerinfo_find. Cases where a
    peerinfo cannot be found are handled gracefully.

    3. The readers get and use the peerinfo object only within an RCU
    read-side critical section. This prevents the object from being
    deleted/freed while it is in use.

    4. The deletion of a peerinfo object is done in an ordered manner
    (glusterd_peerinfo_destroy). The object is first removed from the
    peerinfo list using an atomic list remove, but the list head is not
    reset to allow existing list readers to complete correctly. We wait for
    readers to complete, before resetting the list head. This removes the
    object from the list completely. After this no new readers can get a
    reference to the object, and it can be freed.
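
    Steps 2 to 4 above can be sketched roughly as follows. This is a
    minimal, single-threaded illustration and not the glusterd code: every
    sketch_* name is hypothetical, the peer struct is simplified, and
    sketch_read_lock, sketch_read_unlock and sketch_synchronize are
    no-protection stand-ins for liburcu's rcu_read_lock, rcu_read_unlock
    and synchronize_rcu so that the sketch compiles without liburcu.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for rcu_read_lock()/rcu_read_unlock()/synchronize_rcu().
 * They only track nesting depth and give NO real concurrency protection. */
int sketch_read_depth = 0;
void sketch_read_lock(void)   { sketch_read_depth++; }
void sketch_read_unlock(void) { assert(sketch_read_depth > 0); sketch_read_depth--; }
/* Real code would block here until all pre-existing readers have finished.
 * Calling it from inside a read section would deadlock, hence the assert. */
void sketch_synchronize(void) { assert(sketch_read_depth == 0); }

/* Hypothetical, simplified peer object on a singly linked list. */
typedef struct sketch_peer {
    struct sketch_peer *next;
    char uuid[37];
    int connected;
} sketch_peer_t;

sketch_peer_t *sketch_peers = NULL;

/* Updater: add a peer (updaters are assumed to be serialized elsewhere). */
int sketch_peer_add(const char *uuid, int connected)
{
    sketch_peer_t *p = calloc(1, sizeof(*p));
    if (p == NULL)
        return -1;
    strncpy(p->uuid, uuid, sizeof(p->uuid) - 1); /* calloc keeps it NUL-terminated */
    p->connected = connected;
    p->next = sketch_peers;
    sketch_peers = p;
    return 0;
}

/* Step 2: resolve an indirect reference (the uuid) to the object.
 * Must be called from inside a read section. */
sketch_peer_t *sketch_peer_find(const char *uuid)
{
    assert(sketch_read_depth > 0);
    for (sketch_peer_t *p = sketch_peers; p != NULL; p = p->next)
        if (strcmp(p->uuid, uuid) == 0)
            return p;
    return NULL;
}

/* Step 3: a reader looks the peer up and uses it only inside the read
 * section. Returns the connected flag, or -1 if the peer no longer exists. */
int sketch_peer_is_connected(const char *uuid)
{
    int ret = -1;
    sketch_read_lock();
    sketch_peer_t *p = sketch_peer_find(uuid);
    if (p != NULL)
        ret = p->connected; /* copy the value out; do not keep the pointer */
    sketch_read_unlock();
    return ret;
}

/* Step 4: ordered delete. Unlink first, wait for readers, then reset and free. */
int sketch_peer_destroy(const char *uuid)
{
    sketch_peer_t **link = &sketch_peers;
    while (*link != NULL && strcmp((*link)->uuid, uuid) != 0)
        link = &(*link)->next;
    if (*link == NULL)
        return -1;                 /* already gone: handled gracefully */
    sketch_peer_t *victim = *link;
    *link = victim->next;          /* unlink; victim->next stays valid for readers */
    sketch_synchronize();          /* wait for existing readers to finish */
    victim->next = NULL;           /* only now reset the link */
    free(victim);
    return 0;
}
```

    A caller holding only a uuid would call sketch_peer_is_connected(uuid)
    and treat -1 as "peer was removed", rather than dereferencing a cached
    pointer that a concurrent delete may have freed.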

    This change was developed on the git branch at [2]. This commit is a
    combination of the following commits on the development branch.
      d7999b9 Protect the glusterd_conf_t->peers_list with RCU.
      0da85c4 Synchronize before INITing peerinfo list head after removing
              from list.
      32ec28a Add missing rcu_read_unlock
      8fed0b8 Correctly exit read critical section once peer is found.
      63db857 Free peerctx only on rpc destruction
      56eff26 Cleanup style issues
      e5f38b0 Indirection for events and friend_sm
      3c84ac4 In __glusterd_probe_cbk goto unlock only if peer already
              exists
      141d855 Address review comments on 9695/1
      aaeefed Protection during peer updates
      6eda33d Revert "Synchronize before INITing peerinfo list head after
              removing from list."
      f69db96 Remove unneeded line
      b43d2ec Address review comments on 9695/4
      7781921 Address review comments on 9695/5
      eb6467b Add some missing semi-colons
      328a47f Remove synchronize_rcu from
              glusterd_friend_sm_transition_state
      186e429 Run part of glusterd_friend_remove in critical section
      55c0a2e Fix gluster (peer status/ pool list) with no peers
      93f8dcf Use call_rcu to free peerinfo
      c36178c Introduce composite struct, gd_rcu_head
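
    The last two commits in this list mention freeing the peerinfo through
    call_rcu and a composite struct, gd_rcu_head. The general shape of that
    pattern might look like the sketch below. It is an illustration under
    assumptions, not the code from the commit: all defer_* names and the
    field layout are hypothetical, and defer_call/defer_run are a
    single-threaded stand-in for liburcu's call_rcu and the end of a grace
    period.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for struct rcu_head: a queued callback. */
struct defer_head {
    struct defer_head *next;
    void (*func)(struct defer_head *);
};

struct defer_head *defer_pending = NULL;

/* Stand-in for call_rcu(): queue the callback instead of running it now. */
void defer_call(struct defer_head *head, void (*func)(struct defer_head *))
{
    head->func = func;
    head->next = defer_pending;
    defer_pending = head;
}

/* Stand-in for "the grace period has ended": run every queued callback.
 * Returns how many callbacks ran. */
int defer_run(void)
{
    int ran = 0;
    while (defer_pending != NULL) {
        struct defer_head *h = defer_pending;
        defer_pending = h->next; /* detach before the callback frees h's owner */
        h->func(h);
        ran++;
    }
    return ran;
}

/* Hypothetical composite head: the callback head plus extra context that
 * the callback needs but that a bare head cannot carry. */
struct defer_composite_head {
    struct defer_head head;
    void *ctx;
};

/* Hypothetical object that embeds the composite head. */
struct defer_peer {
    char hostname[64];
    struct defer_composite_head rcu;
};

/* Recover the enclosing struct from a pointer to one of its members. */
#define defer_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

int defer_freed_count = 0;

/* Callback: walk from the bare head back to the peer, then free it. */
void defer_peer_free_cb(struct defer_head *head)
{
    struct defer_composite_head *c =
        defer_container_of(head, struct defer_composite_head, head);
    struct defer_peer *peer = defer_container_of(c, struct defer_peer, rcu);
    int *counter = c->ctx;       /* context carried by the composite head */
    if (counter != NULL)
        (*counter)++;
    free(peer);
}

/* Updater side: schedule the free instead of blocking for readers. */
void defer_peer_retire(struct defer_peer *peer, int *counter)
{
    peer->rcu.ctx = counter;
    defer_call(&peer->rcu.head, defer_peer_free_cb);
}
```

    Compared with waiting synchronously for readers, deferring the free
    lets the updater return immediately; the cost is that the callback
    must be able to find everything it needs from the head alone, which is
    what the composite struct is for.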

    [1]: http://www.rdrop.com/~paulmck/RCU/RCUdissertation.2004.07.14e1.pdf
    [2]: https://github.com/kshlm/glusterfs/tree/urcu

    Change-Id: Ic1480e59c86d41d25a6a3d159aa3e11fbb3cbc7b
    BUG: 1191030
    Signed-off-by: Kaushal M <kaushal at redhat.com>
    Reviewed-on: http://review.gluster.org/9695
    Tested-by: Gluster Build System <jenkins at build.gluster.com>
    Reviewed-by: Atin Mukherjee <amukherj at redhat.com>
    Reviewed-by: Anand Nekkunti <anekkunt at redhat.com>
    Reviewed-by: Krishnan Parthasarathi <kparthas at redhat.com>
    Tested-by: Krishnan Parthasarathi <kparthas at redhat.com>
