<div dir="ltr"><div>All,<br><br>Glusterfs cleans up POSIX locks held on an fd when the client/mount
through which those locks are held disconnects from bricks/server. This
helps Glusterfs to not run into a stale lock problem later (For eg., if
application unlocks while the connection was still down). However, this
means the lock is no longer exclusive as other applications/clients can
acquire the same lock. To communicate that locks are no longer valid, we
are planning to mark the fd (which has POSIX locks) bad on a disconnect
so that any future operations on that fd will fail, forcing the
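For concreteness, the locks in question are ordinary advisory POSIX (fcntl) locks taken on an fd through the mount. A minimal sketch of such a lock; the path and the whole-file byte range are just illustrative, not anything Gluster-specific:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* hypothetical file on a glusterfs mount */
    int fd = open("/mnt/glusterfs/data.db", O_RDWR);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    struct flock fl = {
        .l_type   = F_WRLCK,   /* exclusive write lock */
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 0,         /* 0 = lock the whole file */
    };

    /* blocks until the lock is granted; this is the lock that gets
       cleaned up on the bricks if the client disconnects */
    if (fcntl(fd, F_SETLKW, &fl) < 0) {
        perror("fcntl(F_SETLKW)");
        close(fd);
        return 1;
    }

    /* ... work done under the lock ... */

    close(fd);   /* closing the fd releases the lock */
    return 0;
}

It is these locks that get cleaned up on a disconnect, and the fd holding them that the proposal would mark bad.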
Note that with AFR/replicate in the picture, we can prevent errors to the application as long as a quorum number of children have "never ever" lost their connection to the bricks after the locks were acquired. I use the term "never ever" because locks are not healed back after re-connection; the first disconnect would have marked the fd bad, and the fd remains bad even after the re-connection happens. So it is not just a quorum number of children "currently online", but a quorum number of children "never having disconnected from the bricks after the locks were acquired".

However, this use case is not affected if the application doesn't acquire any POSIX locks. So, I am interested in knowing:
* Do your use cases use POSIX locks?
* Is it feasible for your application to re-open fds and re-acquire locks on seeing EBADFD errors? (See the sketch below for one possible shape of such handling.)
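The following is only an illustration of the second question, not a prescription: it assumes the failure surfaces to the application as EBADF/EBADFD on operations against the bad fd, and the path, helper names and retry policy are all hypothetical.

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* open the file and (re-)acquire an exclusive whole-file lock */
static int open_and_lock(const char *path)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct flock fl = {
        .l_type = F_WRLCK, .l_whence = SEEK_SET, .l_start = 0, .l_len = 0,
    };
    if (fcntl(fd, F_SETLKW, &fl) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* write, and on a bad-fd error re-open, re-lock and retry once */
static int write_with_recovery(int *fd, const char *path,
                               const void *buf, size_t len)
{
    for (int attempt = 0; attempt < 2; attempt++) {
        ssize_t n = write(*fd, buf, len);
        if (n == (ssize_t)len)
            return 0;
        if (n >= 0)
            return -1;           /* short write: treated as failure here */

        if (errno == EBADF || errno == EBADFD) {
            /* the old lock is gone and another client may have held it
               in the meantime, so state protected by the lock has to be
               revalidated before retrying */
            close(*fd);
            *fd = open_and_lock(path);
            if (*fd < 0)
                return -1;
            continue;
        }
        return -1;               /* some other error */
    }
    return -1;
}

int main(void)
{
    const char *path = "/mnt/glusterfs/data.db";   /* hypothetical */
    int fd = open_and_lock(path);
    if (fd < 0) {
        perror("open_and_lock");
        return 1;
    }
    if (write_with_recovery(&fd, path, "hello\n", 6) < 0)
        perror("write_with_recovery");
    close(fd);
    return 0;
}

The awkward part, as noted above, is that between the disconnect and the re-acquisition another client may have taken the same lock, so a simple retry is only safe if the application can revalidate whatever the lock was protecting.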
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1689375#c7

regards,
Raghavendra