[Gluster-devel] Handling locks in NSR

Venky Shankar vshankar at redhat.com
Wed Mar 2 09:25:00 UTC 2016


On Wed, Mar 02, 2016 at 02:29:26PM +0530, Avra Sengupta wrote:
> On 03/02/2016 02:02 PM, Venky Shankar wrote:
> >On Wed, Mar 02, 2016 at 01:40:08PM +0530, Avra Sengupta wrote:
> >>Hi,
> >>
> >>All fops in NSR follow a specific workflow, as described in this UML (https://docs.google.com/presentation/d/1lxwox72n6ovfOwzmdlNCZBJ5vQcCaONvZva0aLWKUqk/edit?usp=sharing).
> >>However, all locking fops will follow a slightly different workflow, as
> >>described below. This is a first proposed draft for handling locks, and we
> >>would like to hear your concerns and queries about it.
> >>
> >>1. On receiving the lock request, the leader will journal the lock itself, and
> >>then try to actually acquire the lock. If at this point it fails to acquire
> >>the lock, it will invalidate the journal entry and return a -ve ack to the
> >>client. However, if it succeeds in acquiring the lock, it will mark the
> >>journal entry as complete and forward the fop to the followers.
> >So, does a contending non-blocking lock operation check only on the leader,
> >since the followers might not yet have ack'd an earlier lock operation?
> A non-blocking lock follows the same workflow, and thereby checks on the
> leader first. In this case, it would be blocked on the leader till the
> leader releases the lock. Then it will follow the same workflow.

A non-blocking lock should ideally return EAGAIN if the region is already locked.
Checking just on the leader (posix/locks on the leader's server stack) and returning
EAGAIN is somewhat incomplete, as the earlier lock request might not have been granted
(due to a failure on the followers).

Or does it even matter if we return EAGAIN during the transient state?

We could block the lock on the leader until an earlier lock request is satisfied
(in which case we return EAGAIN) or, in case of its failure, try to satisfy the new
lock request.
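
To make the transient state concrete, here is a minimal sketch (plain C
with pthreads; names like nsr_lock and nsr_try_lock are hypothetical, not
the actual NSR code) of a leader-side non-blocking check. LK_PENDING
models the window from step 1 where the leader has journaled and locally
acquired the lock but the followers have not all ack'd yet:

/* Minimal sketch, not the actual NSR code; all names are hypothetical. */
#include <errno.h>
#include <pthread.h>

enum lk_state { LK_FREE, LK_PENDING, LK_GRANTED };

struct nsr_lock {
    pthread_mutex_t mtx;
    enum lk_state   state;
};

static struct nsr_lock g_lock = { PTHREAD_MUTEX_INITIALIZER, LK_FREE };

/* Non-blocking attempt, contending only on the leader's view. */
static int
nsr_try_lock(struct nsr_lock *lk)
{
    int ret = 0;

    pthread_mutex_lock(&lk->mtx);
    if (lk->state != LK_FREE)
        ret = -EAGAIN;          /* earlier request still PENDING, or GRANTED */
    else
        lk->state = LK_PENDING; /* journal + local acquire would go here */
    pthread_mutex_unlock(&lk->mtx);

    return ret;
}

The blocking alternative described above would wait on a condition
variable while state == LK_PENDING instead of failing immediately, and
return EAGAIN only once the earlier request is actually granted.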

> >
> >>2. Each follower, on receiving the fop, will journal it and then try to
> >>actually acquire the lock. If it fails to acquire the lock, it will
> >>invalidate the journal entry and return a -ve ack to the leader. If it
> >>succeeds in acquiring the lock, it will mark the journal entry as
> >>complete and send a +ve ack to the leader.
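
As a sketch, the follower-side path in step 2 could look like this
(hypothetical helper names, with trivial stubs standing in for the real
journal and posix/locks calls; not the actual NSR translator code):

#include <stdbool.h>
#include <stdio.h>

/* Stubs for illustration only. */
static int  journal_write(const char *e)      { printf("journal: %s\n", e); return 0; }
static void journal_invalidate(const char *e) { printf("invalidate: %s\n", e); }
static void journal_mark_done(const char *e)  { printf("complete: %s\n", e); }
static bool lock_acquire(void)                { return true; /* pretend success */ }

/* Returns +1 (+ve ack) or -1 (-ve ack) to send back to the leader. */
static int
follower_handle_lock_fop(const char *entry)
{
    if (journal_write(entry) != 0)
        return -1;

    if (!lock_acquire()) {
        journal_invalidate(entry);  /* failed: undo the journal entry */
        return -1;
    }

    journal_mark_done(entry);
    return +1;
}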
> >>
> >>3. The leader, on receiving all acks, will perform a quorum check. If quorum
> >>is met, it will send a +ve ack to the client. If quorum fails, it will
> >>send a rollback to the followers.
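
The quorum check itself could be as simple as the sketch below (strict
majority is an assumption here; the actual NSR quorum policy may differ):

#include <stdbool.h>

/* acks[] holds +1/-1 per replica member, the leader's own ack included. */
static bool
quorum_met(const int *acks, int n_members)
{
    int positive = 0;

    for (int i = 0; i < n_members; i++)
        if (acks[i] > 0)
            positive++;

    return positive * 2 > n_members; /* strict majority */
}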
> >>
> >>4. Each follower, on receiving the rollback, will journal it first and then
> >>release the acquired lock. It will mark the rollback entry in the journal
> >>as complete and send an ack to the leader.
> >What happens if the rollback fails for whatever reason?
> The leader receives a -ve rollback ack, but there's little it can do about
> it. Depending on the failure, it will be resolved during reconciliation.
> >
> >>5. The leader, on receiving the rollback acks, will journal its own
> >>rollback and then release the acquired lock. It will mark the rollback
> >>entry in the journal as complete, and send a -ve ack to the client.
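
Putting steps 4 and 5 together, the rollback path as driven from the
leader might look roughly like this (hypothetical helpers again; as
noted above, a -ve rollback ack is simply left to reconciliation):

#include <stdio.h>

/* Stubs for illustration; not real NSR calls. */
static void send_rollback_to_followers(void) { printf("rollback -> followers\n"); }
static void wait_for_rollback_acks(void)     { /* -ve acks left to reconciliation */ }
static void journal_rollback(void)           { printf("journal rollback\n"); }
static void lock_release(void)               { printf("release lock\n"); }
static void rollback_mark_done(void)         { printf("rollback complete\n"); }

/* Steps 4-5 from the leader's side; returns -1, i.e. a -ve ack to the client. */
static int
leader_rollback(void)
{
    send_rollback_to_followers();   /* each follower runs step 4 */
    wait_for_rollback_acks();

    journal_rollback();             /* step 5: leader's own rollback */
    lock_release();
    rollback_mark_done();

    return -1;
}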
> >>
> >>A few things to be noted in the above workflow:
> >>1. It will be a synchronous operation across the replica volume.
> >>2. Reconciliation will take care of nodes that have missed the locks.
> >>3. On a client disconnect, a lock-timeout will be started, on whose expiration
> >>all locks held by that particular client will be released.
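
The lock-timeout in point 3 could be modelled minimally like this (a
sketch; the timeout value and all names are assumptions, not part of
the proposal):

#include <stdbool.h>
#include <time.h>

#define LOCK_TIMEOUT_SECS 30  /* assumed value, not from the proposal */

struct client_state {
    bool   connected;
    time_t disconnected_at;   /* set when the disconnect is noticed */
};

/* Polled periodically; once this returns true, the server releases
 * every lock held by this client. */
static bool
lock_timeout_expired(const struct client_state *c, time_t now)
{
    return !c->connected &&
           difftime(now, c->disconnected_at) >= LOCK_TIMEOUT_SECS;
}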
> >>
> >>Regards,
> >>Avra