[Gluster-users] POSIX locks and disconnections between clients and bricks

Raghavendra Gowdappa rgowdapp at redhat.com
Wed Mar 27 01:48:07 UTC 2019


All,

Glusterfs cleans up POSIX locks held on an fd when the client/mount through
which those locks were acquired disconnects from the bricks/server. This
keeps Glusterfs from running into stale-lock problems later (for example,
if the application unlocks while the connection is still down). However, it
also means the lock is no longer exclusive, as other applications/clients
can acquire the same lock in the meantime. To communicate that the locks
are no longer valid, we are planning to mark the fd (which has POSIX locks)
bad on a disconnect, so that any future operation on that fd fails, forcing
the application to re-open the fd and re-acquire the locks it needs [1].
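
As a rough illustration of what that recovery could look like on the
application side, here is a minimal sketch in C. It assumes EBADFD is the
errno returned for the bad fd (as discussed here); acquire_lock and
write_with_recovery are hypothetical helpers for this example, not
Glusterfs APIs:

#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Take a whole-file write lock, blocking until it is granted. */
static int acquire_lock(int fd)
{
    struct flock fl = {
        .l_type   = F_WRLCK,
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 0,   /* 0 means "to end of file", i.e. whole file */
    };
    return fcntl(fd, F_SETLKW, &fl);
}

/* Hypothetical recovery loop: on EBADFD, re-open the file and re-acquire
 * the lock before retrying the write. */
static ssize_t write_with_recovery(const char *path, int *fdp,
                                   const void *buf, size_t len)
{
    for (;;) {
        ssize_t ret = write(*fdp, buf, len);
        if (ret >= 0 || errno != EBADFD)
            return ret;             /* success, or an unrelated error */

        close(*fdp);                /* fd was marked bad after a disconnect */
        if ((*fdp = open(path, O_RDWR)) < 0)
            return -1;
        if (acquire_lock(*fdp) < 0)
            return -1;
        /* Any state read under the old lock must be re-validated here,
         * since another client may have held the lock in between. */
    }
}

The important step is the re-validation at the end of the loop: because the
locks were released on disconnect, anything read or computed under the old
lock may be stale by the time the lock is re-acquired.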

Note that with AFR/replicate in the picture, we can shield the application
from these errors as long as a quorum of children "never ever" lost their
connection to the bricks after the locks were acquired. I use the term
"never ever" because locks are not healed back after a re-connection: the
first disconnect marks the fd bad, and it remains bad even after the
connection is restored. So it's not just a quorum of children "currently
online", but a quorum of children "never having disconnected from the
bricks after the locks were acquired". For example, on a replica-3 volume
the fd stays usable only if at least two of the three children have
remained continuously connected since the locks were taken.

However, this change does not affect applications that don't acquire any
POSIX locks. So, I am interested in knowing:
* Do your use cases rely on POSIX locks?
* Is it feasible for your application to re-open fds and re-acquire locks
on seeing EBADFD errors (along the lines of the sketch above)?

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1689375#c7

regards,
Raghavendra