[Gluster-devel] Regarding doing away with refkeeper in locks xlator
Pranith Kumar Karampuri
pkarampu at redhat.com
Fri Jun 6 06:08:24 UTC 2014
On 06/06/2014 11:37 AM, Anand Avati wrote:
>
>
>
> On Thu, Jun 5, 2014 at 10:56 PM, Pranith Kumar Karampuri
> <pkarampu at redhat.com> wrote:
>
>
> On 06/06/2014 10:47 AM, Pranith Kumar Karampuri wrote:
>>
>> On 06/06/2014 10:43 AM, Anand Avati wrote:
>>>
>>>
>>>
>>> On Thu, Jun 5, 2014 at 9:54 PM, Pranith Kumar Karampuri
>>> <pkarampu at redhat.com> wrote:
>>>
>>>
>>> On 06/06/2014 10:02 AM, Anand Avati wrote:
>>>> On Thu, Jun 5, 2014 at 7:52 PM, Pranith Kumar Karampuri
>>>> <pkarampu at redhat.com> wrote:
>>>>
>>>>>
>>>>> This sounds a bit complicated. I think there is a much
>>>>> simpler solution:
>>>>>
>>>>> - First, make update_refkeeper() check for blocked
>>>>> locks (which I mentioned as "optional" previously)
>>>>>
>>>>> - Make grant_blocked_locks() double up and do the job
>>>>> of update_refkeeper() internally.
>>>>>
>>>>> Something which looks like this:
>>>>>
>>>>> diff --git a/xlators/features/locks/src/common.c
>>>>> b/xlators/features/locks/src/common.c
>>>>> index f6c71c1..38df385 100644
>>>>> --- a/xlators/features/locks/src/common.c
>>>>> +++ b/xlators/features/locks/src/common.c
>>>>> @@ -126,8 +126,14 @@ __pl_inode_is_empty (pl_inode_t *pl_inode)
>>>>> if (!list_empty (&dom->entrylk_list))
>>>>> is_empty = 0;
>>>>> + if (!list_empty (&dom->blocked_entrylks))
>>>>> + is_empty = 0;
>>>>> +
>>>>> if (!list_empty (&dom->inodelk_list))
>>>>> is_empty = 0;
>>>>> +
>>>>> + if (!list_empty (&dom->blocked_inodelks))
>>>>> + is_empty = 0;
>>>>> }
>>>>> return is_empty;
>>>>> @@ -944,12 +950,18 @@ grant_blocked_locks (xlator_t *this, pl_inode_t *pl_inode)
>>>>> struct list_head granted_list;
>>>>> posix_lock_t *tmp = NULL;
>>>>> posix_lock_t *lock = NULL;
>>>>> + inode_t *unref = NULL;
>>>>> INIT_LIST_HEAD (&granted_list);
>>>>> pthread_mutex_lock (&pl_inode->mutex);
>>>>> {
>>>>> __grant_blocked_locks (this, pl_inode, &granted_list);
>>>>> +
>>>>> + if (__pl_inode_is_empty (pl_inode) && pl_inode->refkeeper) {
>>>>> + unref = pl_inode->refkeeper;
>>>>> + pl_inode->refkeeper = NULL;
>>>>> + }
>>>>> }
>>>>> pthread_mutex_unlock (&pl_inode->mutex);
>>>>> @@ -965,6 +977,9 @@ grant_blocked_locks (xlator_t *this, pl_inode_t *pl_inode)
>>>>> GF_FREE (lock);
>>>>> }
>>>>> + if (unref)
>>>>> + inode_unref (unref);
>>>>> +
>>>>> return;
>>>>> }
>>>>>
>>>>>
>>>>> This should make pl_disconnect_cbk() pretty much race-free
>>>>> w.r.t. refkeeper. Thoughts?
>>>> Let's say C1 is doing pl_inodelk_client_cleanup. After
>>>> the second for-loop (all granted and blocked locks are
>>>> out of the domain), if an unlock on the final granted
>>>> lock on that inode from client C2 completes, refkeeper
>>>> would be set to NULL and unrefed, leading to zero refs
>>>> on that inode, i.e. pl_forget will also happen. In the 3rd
>>>> for-loop pl_inode is already freed, so we get a
>>>> use-after-free and a crash.
>>>>
>>>>
>>>>
>>>> We also need:
>>>>
>>>> diff --git a/xlators/features/locks/src/inodelk.c b/xlators/features/locks/src/inodelk.c
>>>> index c76cb7f..2aceb8a 100644
>>>> --- a/xlators/features/locks/src/inodelk.c
>>>> +++ b/xlators/features/locks/src/inodelk.c
>>>> @@ -494,13 +494,13 @@ pl_inodelk_client_cleanup (xlator_t *this, pl_ctx_t *ctx)
>>>>
>>>> dom = get_domain (pl_inode, l->volume);
>>>>
>>>> - grant_blocked_inode_locks (this, pl_inode, dom);
>>>> -
>>>> pthread_mutex_lock (&pl_inode->mutex);
>>>> {
>>>> __pl_inodelk_unref (l);
>>>> }
>>>> pthread_mutex_unlock (&pl_inode->mutex);
>>>> +
>>>> + grant_blocked_inode_locks (this, pl_inode, dom);
>>>> }
>>>>
>>>> return 0;
>>>>
>>>> Missed this in the last patch.
>>> It still doesn't solve the problem I described earlier. By
>>> the time the third loop executes, refkeeper has already been
>>> unreffed by C2's unlock.
>>>
>>>
>>>
>>> Right, we will need to introduce an "in_cleanup" counter; if it
>>> is set, pl_update_refkeeper() should not unref. Increment
>>> in_cleanup in the first lookup, and decrement it in the last
>>> loop, just before calling grant_blocked_locks() (along with the
>>> patches in my last 2 mails)
>> s/first lookup/first loop/ ?
> Consider the following scenario:
> There are two granted locks L1, L2 from clients C1, C2
> respectively on the same inode.
> C1 gets disconnected.
> C2 issues an unlock.
>
> This is the sequence of steps:
> 1) C1 executes the first loop, increments in_cleanup to 1
> 2) C2 executes pl_inode_setlk and removes L2 from the granted list.
> It is now just before grant_blocked_inode_locks()
> 3) C1 starts the 3rd for-loop and unrefs L1, decrements in_cleanup to 0
> 4) C2 executes grant_blocked_inode_locks(), unrefs the
> refkeeper, sets it to NULL and unwinds. This destroys the inode, so
> pl_inode is freed.
> 5) C1 calls grant_blocked_inode_locks() with the already-freed pl_inode
>
>
> Yeah, we need a version of grant_blocked_inode_locks() which
> decrements in_cleanup in its locked region.
I was just thinking the same. I will update you if it works.
Pranith