[Gluster-devel] Handling locks in NSR

Avra Sengupta asengupt at redhat.com
Wed Mar 2 10:43:11 UTC 2016


On 03/02/2016 04:03 PM, Atin Mukherjee wrote:
>
> -Atin
> Sent from one plus one
> On 02-Mar-2016 3:41 pm, "Avra Sengupta" <asengupt at redhat.com> wrote:
> >
> > On 03/02/2016 02:55 PM, Venky Shankar wrote:
> >>
> >> On Wed, Mar 02, 2016 at 02:29:26PM +0530, Avra Sengupta wrote:
> >>>
> >>> On 03/02/2016 02:02 PM, Venky Shankar wrote:
> >>>>
> >>>> On Wed, Mar 02, 2016 at 01:40:08PM +0530, Avra Sengupta wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> All fops in NSR follow a specific workflow, as described in this UML
> >>>>> (https://docs.google.com/presentation/d/1lxwox72n6ovfOwzmdlNCZBJ5vQcCaONvZva0aLWKUqk/edit?usp=sharing).
> >>>>> However, all locking fops will follow a slightly different workflow, as
> >>>>> described below. This is a first proposed draft for handling locks, and we
> >>>>> would like to hear your concerns and queries regarding the same.
> >>>>>
> >>>>> 1. On receiving the lock, the leader will journal the lock itself, and then
> >>>>> try to actually acquire the lock. At this point, if it fails to acquire the
> >>>>> lock, it will invalidate the journal entry and return a -ve ack to the
> >>>>> client. However, if it succeeds in acquiring the lock, it will mark the
> >>>>> journal entry as complete and forward the fop to the followers.
> >>>>
> >>>> So, does a contending non-blocking lock operation check only on the leader,
> >>>> since the followers might not yet have ack'd an earlier lock operation?
> >>>
> >>> A non-blocking lock follows the same workflow, and thereby checks on the
> >>> leader first. In this case, it would be blocked on the leader till the
> >>> leader releases the lock. Then it will follow the same workflow.
> >>
> >> A non-blocking lock should ideally return EAGAIN if the region is already locked.
> >> Checking just on the leader (posix/locks on the leader server stack) and returning
> >> EAGAIN is kind of incomplete, as the earlier lock request might not have been granted
> >> (due to failure on followers).
> >>
> >> Or does it even matter if we return EAGAIN during the transient state?
> >>
> >> We could block the lock on the leader until an earlier lock request is satisfied
> >> (in which case return EAGAIN) or, in case of failure, try to satisfy the lock request.
> >
> > That is what I said: it will be blocked on the leader till the leader
> > releases the already held lock.
> >
> >>
> >>>>> 2. Each follower, on receiving the fop, will journal it, and then try to
> >>>>> actually acquire the lock. If it fails to acquire the lock, it will
> >>>>> invalidate the journal entry and return a -ve ack to the leader. If it
> >>>>> succeeds in acquiring the lock, it will mark the journal entry as
> >>>>> complete and send a +ve ack to the leader.
> >>>>>
> >>>>> 3. The leader, on receiving all acks, will perform a quorum check. If
> >>>>> quorum is met, it will send a +ve ack to the client. If quorum is not
> >>>>> met, it will send a rollback to the followers.
> >>>>>
> >>>>> 4. Each follower, on receiving the rollback, will journal it first, and
> >>>>> then release the acquired lock. It will then mark the rollback entry in
> >>>>> the journal as complete and send an ack to the leader.
> >>>>
> >>>> What happens if the rollback fails for whatever reason?
> >>>
> >>> The leader receives a -ve rollback ack, but there's little it can do about
> >>> it. Depending on the failure, it will be resolved during reconciliation.
> >>>>>
> >>>>> 5. The leader, on receiving the rollback acks, will journal its own
> >>>>> rollback, and then release the acquired lock. It will update the rollback
> >>>>> entry in the journal, and send a -ve ack to the client.
> >>>>>
> >>>>> A few things to note in the above workflow:
> >>>>> 1. It will be a synchronous operation, across the replica volume.
> >
> > Atin, I am not sure how AFR handles it.
> If AFR/EC handle them asynchronously, do you see any performance
> bottleneck with NSR for this case?
>
Well, it's not synchronous to the point that the followers would perform
it one after the other. AFR/EC clients would also have to wait for acks
from a quorum of servers before they can ack the client. The same is true
of the NSR leader, which will have to wait till it gets acks from a
quorum of followers.
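
To illustrate what I mean, here is a rough Python sketch (not NSR code) of a
leader sending the lock to all followers in parallel and then doing the
quorum check on the acks. The name wait_for_quorum, the callables standing in
for the RPC to each follower, and the way quorum is counted are all
assumptions made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def wait_for_quorum(send_fns, quorum):
    """Call every follower in parallel and report whether quorum was met.

    send_fns: hypothetical callables, one per follower, each returning
              True for a +ve ack and False for a -ve ack.
    quorum:   number of +ve acks required (how the leader itself is
              counted is left to the caller; that is an assumption here).
    """
    positive = 0
    with ThreadPoolExecutor(max_workers=max(1, len(send_fns))) as pool:
        # All followers are contacted at once, not one after the other.
        futures = [pool.submit(fn) for fn in send_fns]
        for fut in as_completed(futures):
            try:
                ok = fut.result()
            except Exception:
                ok = False  # treat an unreachable follower as a -ve ack
            if ok:
                positive += 1
    # Quorum check once all acks are in.
    return positive >= quorum
```

So the cost seen by the client is roughly one round trip to the slowest
follower, not the sum of the round trips to all of them.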
>
> >
> >>>>> 2. Reconciliation will take care of nodes that have missed the locks.
> >>>>> 3. On a client disconnect, there will be a lock-timeout, on whose
> >>>>> expiration all locks held by that particular client will be released.
> >>>>>
> >>>>> Regards,
> >>>>> Avra
> >>>>> _______________________________________________
> >>>>> Gluster-devel mailing list
> >>>>> Gluster-devel at gluster.org <mailto:Gluster-devel at gluster.org>
> >>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
> >
> >
>
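
For reference, the five steps of the proposed workflow can be sketched as a
small in-memory simulation in Python. This is only an illustration of the
draft, not the actual implementation: the Node class, the journal layout, the
fail_lock test hook, and the choice to count the leader towards quorum are
all assumptions made for the example, and the followers are called in a
plain loop just to keep it short.

```python
class Node:
    """A replica with a hypothetical in-memory journal and lock table."""

    def __init__(self, name, fail_lock=False):
        self.name = name
        self.fail_lock = fail_lock  # test hook: simulate a failed acquire
        self.journal = []           # entries are [op, region, state]
        self.locks = {}             # region -> owner

    def lock(self, region, owner):
        entry = ["lock", region, "pending"]
        self.journal.append(entry)        # journal first
        if self.fail_lock or region in self.locks:
            entry[2] = "invalid"          # invalidate the journal entry
            return False                  # -ve ack
        self.locks[region] = owner        # actually acquire the lock
        entry[2] = "complete"
        return True                       # +ve ack

    def rollback(self, region, owner):
        entry = ["rollback", region, "pending"]
        self.journal.append(entry)        # journal the rollback first
        if self.locks.get(region) == owner:
            del self.locks[region]        # release the acquired lock
        entry[2] = "complete"
        return True


def leader_lock(leader, followers, region, owner, quorum):
    """Return True for a +ve ack to the client, False for a -ve ack."""
    # Step 1: the leader journals and acquires the lock itself.
    if not leader.lock(region, owner):
        return False
    # Step 2: forward the fop to the followers and collect their acks.
    acks = [f.lock(region, owner) for f in followers]
    # Step 3: quorum check (assumption: the leader counts as one vote).
    if 1 + sum(acks) >= quorum:
        return True
    # Step 4: quorum failed, so send a rollback to the followers.
    for f in followers:
        f.rollback(region, owner)
    # Step 5: the leader journals its own rollback and releases the lock.
    leader.rollback(region, owner)
    return False
```

A rollback that fails on a follower is not modelled here; as discussed in the
thread, that case would be left to reconciliation.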
