[Gluster-devel] Support to reclaim locks (posix) provided lkowner & range matches

Soumya Koduri skoduri at redhat.com
Wed Aug 10 11:58:23 UTC 2016


We (CC'ed) had a brief discussion on the use-cases and semantics of the 
lock reclamation support. A few of the changes suggested to the existing 
proposal are:

1) Allow reclamation of locks only if there is an existing lock present 
on the file with the same owner. This is needed to prevent applications 
from misusing this feature and to maintain data integrity. That is, if 
the lock has already been cleaned up, or if there is a lock owned by 
another client, the server cannot guarantee the file state; it should 
reject the reclamation request and let the application know that the 
previous lock no longer exists.
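
A rough sketch of this check, in terms of the existing structures of the
posix-locks translator (pl_inode_t, posix_lock_t, is_same_lkowner()); this
is illustrative only, with pl_inode->mutex locking and error handling
omitted:

    /* Grant a reclaim request only if a lock with a matching
     * lk-owner and byte range is still held on the inode. */
    static int
    pl_can_reclaim (pl_inode_t *pl_inode, posix_lock_t *reqlock)
    {
            posix_lock_t *l = NULL;

            list_for_each_entry (l, &pl_inode->ext_list, list) {
                    if (is_same_lkowner (&l->owner, &reqlock->owner) &&
                        (l->fl_start == reqlock->fl_start) &&
                        (l->fl_end == reqlock->fl_end))
                            return 1;  /* previous lock still present */
            }

            /* lock already cleaned up, or held by another owner:
             * reject and let the application know */
            return 0;
    }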

2) With (1) above, we shall need the server to hold on to the locks for 
a longer time, instead of cleaning them up immediately when a disconnect 
event happens for the older client. This could be achieved by enabling 
grace-timer support (as mentioned by Vijay earlier) on the server side.
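
A rough sketch of that idea, using the generic gf_timer_call_after()
helper from libglusterfs; the disconnect hook and the cleanup helper
below are hypothetical names, not existing code:

    /* Grace period expired without a reclaim; only now flush
     * the locks held by the disconnected client. */
    static void
    pl_grace_timeout_cbk (void *data)
    {
            client_t *client = data;

            pl_flush_client_locks (client);  /* hypothetical helper */
    }

    /* On disconnect, defer the lock cleanup instead of doing it
     * inline; a successful reclaim within the grace window would
     * cancel this timer. */
    void
    pl_on_disconnect (xlator_t *this, client_t *client)
    {
            struct timespec grace = { .tv_sec = 60, };

            gf_timer_call_after (this->ctx, grace,
                                 pl_grace_timeout_cbk, client);
    }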

I have updated the feature-spec[1] with the details. Comments are welcome.

Thanks,
Soumya

[1] http://review.gluster.org/#/c/15053/3/under_review/reclaim-locks.md

On 07/28/2016 07:29 PM, Soumya Koduri wrote:
>
>
> On 07/27/2016 02:38 AM, Vijay Bellur wrote:
>>
>> On 07/26/2016 05:56 AM, Soumya Koduri wrote:
>>> Hi Vijay,
>>>
>>> On 07/26/2016 12:13 AM, Vijay Bellur wrote:
>>>> On 07/22/2016 08:44 AM, Soumya Koduri wrote:
>>>>> Hi,
>>>>>
>>>>> In certain scenarios (especially in highly available environments),
>>>>> the application may have to fail over/connect to a different
>>>>> glusterFS client while the I/O is happening. In such cases, until
>>>>> the ping timer expires and the glusterFS server cleans up the locks
>>>>> held by the older glusterFS client, the application will not be able
>>>>> to reclaim its lost locks. To avoid that, we need support in Gluster
>>>>> to let clients reclaim the existing locks, provided the lkowner and
>>>>> the lock range match.
>>>>
>>>>
>>>> If the server detects a disconnection, it goes about cleaning up the
>>>> locks held by the disconnected client. The outlined scheme would work
>>>> only if the failover connection happens before this server cleanup.
>>>> Since there is no ping timer on the server, do you propose to have a
>>>> grace timer on the server?
>>>
>>> But we are looking for a solution which can work in an active-active
>>> configuration as well. We need to handle cases wherein the connection
>>> between the server and the old client is still in use, which can
>>> happen during load-balancing or failback.
>>>
>>> The different cases which I can outline are:
>>>
>>> Application Client - AC
>>> Application/GlusterClient 1 - GC1
>>> Application/GlusterClient 2 - GC2
>>> Gluster Server - GS
>>>
>>> 1) Active-Passive config  (service gone down)
>>>
>>> AC ----> GC1  ----> GS (GC2 is not active)
>>>
>>>     | (failover)
>>>     v
>>>
>>> AC ----> GC2  ----> GS (GC1 connection gets dropped and GC2 establishes
>>> connection)
>>>
>>> In this case, we can have a grace timer to allow reclaims only for a
>>> certain time after GC2 (or any client) establishes its rpc connection.
>>>
>>> 2) Active-Active config  (service gone down)
>>>
>>> AC ----> GC1  ----> GS
>>>              ^
>>>              |
>>>          GC2  -------
>>>
>>>     | (failover)
>>>     v
>>>
>>> AC ----> GC2  ----> GS (GC1 connection gets dropped)
>>>
>>> The grace timer shall not get triggered in this case. But at least
>>> the locks from GC1 get cleaned up once its connection is torn down.
>>>
>>
>> grace timer is not required if lock reclamation can happen before the
>> old connection between GC1 & GS gets dropped. Is this guaranteed to
>> happen every time?
>
> Not all the time, but it is likely, since the failover time is usually
> less than the ping-timer / rpc connection expiry time.
>
>>
>>>
>>> 3) Active-Active config  (both the services active/load-balancing)
>>> This is the tricky one.
>>>
>>> AC ----> GC1  ----> GS
>>>              ^
>>>              |
>>>          GC2  -------
>>>
>>>     | (load-balancing/failback)
>>>     v
>>>
>>>      GC1  ----> GS
>>>              ^
>>>              |
>>> AC ----> GC2  -------
>>>
>>> The locks taken by GC1 shall remain on the server forever unless we
>>> restart either GC1 or the server.
>>>
>>
>> Yes, this is trickier. The behavior is dependent on how the application
>> performs a failback. How do we handle this with Ganesha today? Since the
>> connection between nfs client and Ganesha/GC1 is broken, would it not
>> send cleanup requests on locks it held on behalf of that client?
>>
> Yes. I checked within the NFS-Ganesha community too. There seems to be
> a provision in NFS-Ganesha to trigger an event, upon receiving which it
> can flush the locks associated with an IP. We could send this event to
> the active servers (in this case GC1) while triggering the failback. So
> from the NFS-Ganesha perspective, this seems to be taken care of.
> Unless some other application (SMB3 handles?) has this use-case, we may
> ignore it for now.
>
>>
>>> Considering the above cases, it looks like we may need to allow
>>> reclaim of the locks all the time. Please suggest if I have missed
>>> out any details.
>>>
>>
>> I agree that lock reclamation is needed. Grace timeout behavior does
>> need more thought for all these cases. Given the involved nature of this
>> problem, it might be better to write down a more detailed spec that
>> discusses all these cases for a more thorough review.
>
> Sure. I will open up a spec.
>
> Thanks,
> Soumya
>
>>
>>>>
>>>>>
>>>>> For client-side support, I am wondering if we can integrate with
>>>>> the new lock API being introduced as part of mandatory lock support
>>>>> in gfapi [2]
>>>>>
>>>>
>>>> Is glfs_file_lock() planned to be used here? If so, how do we specify
>>>> that it is a reclaim lock in this api?
>>>
>>> Yes. We have been discussing on that patch-set whether we can use the
>>> same API. We should either have a separate field to pass the reclaim
>>> flag, or, if we choose not to change its definition, we can probably
>>> have additional lock types (a usage sketch follows the list below) -
>>>
>>> GLFS_LK_ADVISORY
>>> GLFS_LK_MANDATORY
>>>
>>> New lock-types
>>> GLFS_LK_RECLAIM_ADVISORY
>>> GLFS_LK_RECLAIM_MANDATORY
>>>
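>>> For instance, an application reclaiming a lock after failover could
>>> then do something like the following (assuming glfs_file_lock() keeps
>>> the signature proposed in that patch-set, and with fd being a
>>> glfs_fd_t opened on the same file):
>>>
>>>     struct flock fl = {0, };
>>>     int ret = -1;
>>>
>>>     fl.l_type   = F_WRLCK;
>>>     fl.l_whence = SEEK_SET;
>>>     fl.l_start  = 0;    /* must match the range of the lost lock */
>>>     fl.l_len    = 100;
>>>
>>>     /* the lk-owner must match the one used for the original lock */
>>>     ret = glfs_file_lock (fd, F_SETLK, &fl, GLFS_LK_RECLAIM_MANDATORY);
>>>     if (ret < 0) {
>>>             /* reclaim rejected: the previous lock no longer
>>>              * exists on the server */
>>>     }
>>>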
>>
>> Either approach seems reasonable to me.
>>
>>>>
>>>> We also would need to pass the reclaim_lock flag over rpc.
>>>
>>> To avoid new fop/rpc changes, I was considering taking the xdata
>>> approach (similar to the way the lock mode is passed in xdata for
>>> mandatory lock support), since the processing of a reclamation
>>> request doesn't differ much from the existing lk fop except for the
>>> conflicting-lock checks.
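>>>
>>> For example, on the client side it could look something like this
>>> (the xdata key name below is just indicative, not finalized):
>>>
>>>     dict_t *xdata = dict_new ();
>>>
>>>     /* mark this lk fop as a reclaim request */
>>>     ret = dict_set_int32 (xdata, "glusterfs.lk-reclaim", 1);
>>>
>>>     /* the lk fop is then wound with this xdata; the locks
>>>      * translator grants it only if a matching lock (lkowner +
>>>      * range) is still present on the inode */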
>>>
>>
>> This looks ok to me.
>>
>> Thanks,
>> Vijay
>>
>>
>>

