[Gluster-devel] Handling locks in NSR

Avra Sengupta asengupt at redhat.com
Wed Mar 2 10:10:54 UTC 2016


On 03/02/2016 02:55 PM, Venky Shankar wrote:
> On Wed, Mar 02, 2016 at 02:29:26PM +0530, Avra Sengupta wrote:
>> On 03/02/2016 02:02 PM, Venky Shankar wrote:
>>> On Wed, Mar 02, 2016 at 01:40:08PM +0530, Avra Sengupta wrote:
>>>> Hi,
>>>>
>>>> All fops in NSR follow a specific workflow, as described in this UML diagram (https://docs.google.com/presentation/d/1lxwox72n6ovfOwzmdlNCZBJ5vQcCaONvZva0aLWKUqk/edit?usp=sharing).
>>>> However, all locking fops will follow a slightly different workflow, as
>>>> described below. This is a first proposed draft for handling locks, and we
>>>> would like to hear your concerns and queries regarding it.
>>>>
>>>> 1. On receiving the lock, the leader will journal the lock itself, and then
>>>> try to actually acquire the lock. If at this point it fails to
>>>> acquire the lock, it will invalidate the journal entry and return a
>>>> -ve ack to the client. However, if it is successful in acquiring the
>>>> lock, it will mark the journal entry as complete and forward the fop to the
>>>> followers.
>>> So, does a contending non-blocking lock operation check only on the leader
>>> since the followers might have not yet ack'd an earlier lock operation?
>> A non-blocking lock follows the same workflow, and thereby checks on the
>> leader first. In this case, it would be blocked on the leader until the
>> leader releases the lock, and would then follow the same workflow.
> A non-blocking lock should ideally return EAGAIN if the region is already locked.
> Checking just on the leader (posix/locks on the leader server stack) and returning
> EAGAIN is kind of incomplete as the earlier lock request might not have been granted
> (due to failure on followers).
>
> or does it even matter if we return EAGAIN during the transient state?
>
> We could block the lock on the leader until an earlier lock request is satisfied
> (in which case return EAGAIN) or in case of failure try to satisfy the lock request.
That is what I said: it will be blocked on the leader till the leader 
releases the already held lock.
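The transient state being discussed here can be made concrete with a toy model (all names, such as LeaderLockTable and try_lock, are made up for illustration and are not GlusterFS code): a non-blocking lock request first waits out any in-flight lock fop on the same region, and only then either returns EAGAIN (earlier lock was granted) or takes the region itself.

```python
import threading

EAGAIN = "EAGAIN"

class LeaderLockTable:
    """Toy sketch of per-region lock state on the NSR leader."""

    def __init__(self):
        self._cond = threading.Condition()
        self._held = set()       # regions with a granted lock
        self._in_flight = set()  # regions with an unresolved lock fop

    def begin(self, region):
        """An earlier lock fop for this region has started its workflow."""
        with self._cond:
            self._in_flight.add(region)

    def settle(self, region, granted):
        """Quorum has decided the earlier fop's fate; wake any waiters."""
        with self._cond:
            self._in_flight.discard(region)
            if granted:
                self._held.add(region)
            self._cond.notify_all()

    def try_lock(self, region):
        """Non-blocking lock: wait out the transient state, then decide."""
        with self._cond:
            while region in self._in_flight:
                self._cond.wait()
            if region in self._held:
                return EAGAIN
            self._held.add(region)
            return "GRANTED"
```

In this sketch, EAGAIN is never returned on the basis of an unresolved earlier request; the caller only sees it once quorum has actually granted that request, which is the behavior described above.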
>
>>>> 2. The followers on receiving the fop, will journal it, and then try to
>>>> actually acquire the lock. If it fails to acquire the lock, then it will
>>>> invalidate the journal entry, and return a -ve ack back to the leader. If it
>>>> is successful in acquiring the lock, it will mark the journal entry as
>>>> complete, and send a +ve ack to the leader.
>>>>
>>>> 3. The leader, on receiving all acks, will perform a quorum check. If quorum
>>>> is met, it will send a +ve ack to the client. If quorum fails, it will
>>>> send a rollback to the followers.
>>>>
>>>> 4. The followers on receiving the rollback, will journal it first, and then
>>>> release the acquired lock. It will update the rollback entry in the journal
>>>> as complete and send an ack to the leader.
>>> What happens if the rollback fails for whatever reason?
>> The leader receives a -ve rollback ack, but there's little it can do about
>> it. Depending on the failure, it will be resolved during reconciliation.
>>>> 5. The leader, on receiving the rollback acks, will journal its own
>>>> rollback and then release the acquired lock. It will update the rollback
>>>> entry in the journal and send a -ve ack to the client.
>>>>
>>>> Few things to be noted in the above workflow are:
>>>> 1. It will be a synchronous operation across the replica volume.
Atin, I am not sure how AFR handles it.
>>>> 2. Reconciliation will take care of nodes that have missed out on the locks.
>>>> 3. On a client disconnect, there will be a lock-timeout on whose expiration
>>>> all locks held by that particular client will be released.
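The five-step workflow above can be sketched end to end as a toy model (journaling is reduced to comments, and every name here is hypothetical, not an actual GlusterFS/NSR symbol):

```python
class Follower:
    """Toy follower: journals (elided), then acquires its local lock."""
    def __init__(self, can_lock=True):
        self.can_lock = can_lock
        self.locked = False

    def lock(self):
        # Step 2: journal the fop, try to acquire the local lock,
        # then return a +ve or -ve ack to the leader.
        if self.can_lock:
            self.locked = True
        return self.locked

    def rollback(self):
        # Step 4: journal the rollback, release the lock, ack the leader.
        self.locked = False
        return True

def leader_handle_lock(followers, quorum, leader_can_lock=True):
    """Sketch of the leader side of steps 1, 3 and 5."""
    # Step 1: journal the lock, then try to acquire it locally.
    if not leader_can_lock:
        # Invalidate the journal entry; -ve ack to the client.
        return False
    # Mark the journal entry complete; forward the fop to followers.
    acks = 1 + sum(1 for f in followers if f.lock())

    # Step 3: quorum check over all acks (the leader counts itself).
    if acks >= quorum:
        return True          # +ve ack to the client

    # Steps 4-5: quorum failed; followers roll back, then the leader
    # journals its own rollback and releases its lock.
    for f in followers:
        f.rollback()
    return False             # -ve ack to the client
```

One design point this makes visible: the leader acquires before forwarding, so a local failure never generates any follower traffic, while a quorum failure always generates a rollback round before the client sees the -ve ack.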
>>>>
>>>> Regards,
>>>> Avra
>>>> _______________________________________________
>>>> Gluster-devel mailing list
>>>> Gluster-devel at gluster.org
>>>> http://www.gluster.org/mailman/listinfo/gluster-devel


