[Gluster-users] POSIX locks and disconnections between clients and bricks
Soumya Koduri
skoduri at redhat.com
Wed Mar 27 09:53:35 UTC 2019
On 3/27/19 12:55 PM, Xavi Hernandez wrote:
> Hi Raghavendra,
>
> On Wed, Mar 27, 2019 at 2:49 AM Raghavendra Gowdappa
> <rgowdapp at redhat.com> wrote:
>
> All,
>
> Glusterfs cleans up POSIX locks held on an fd when the client/mount
> through which those locks were acquired disconnects from the
> bricks/server. This helps Glusterfs avoid stale-lock problems later
> (for example, if the application unlocks while the connection is
> still down). However, it also means the lock is no longer exclusive,
> since other applications/clients can acquire the same lock. To
> communicate that the locks are no longer valid, we are planning to
> mark the fd (which has POSIX locks) bad on a disconnect, so that any
> future operations on that fd fail, forcing the application to re-open
> the fd and re-acquire the locks it needs [1].
>
>
> Wouldn't it be better to retake the locks when the brick is
> reconnected, if the lock is still in use?
>
> BTW, the referenced bug is not public. Should we open another bug to
> track this?
>
>
> Note that with AFR/replicate in the picture, we can prevent errors
> from reaching the application as long as a Quorum number of children
> "never ever" lost their connection to the bricks after the locks were
> acquired. I am using the term "never ever" because locks are not
> healed back after re-connection, so the first disconnect would have
> marked the fd bad, and the fd remains bad even after re-connection.
> So it's not just a Quorum number of children "currently online", but
> a Quorum number of children "never having disconnected from the
> bricks after the locks were acquired".
>
>
> I think this requirement is not feasible. In a distributed file
> system, sooner or later all bricks will be disconnected. It could be
> because of failures or because an upgrade is done, but it will happen.
>
> The difference here is how long fds are kept open. If applications
> open and close files frequently enough (i.e. an fd is not kept open
> longer than it takes for more than Quorum bricks to disconnect), then
> there's no problem. The problem can only appear with applications
> that keep files open for a long time and also use POSIX locks. In
> that case, the only good solution I see is to retake the locks on
> brick reconnection.
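To make that idea concrete, here is a conceptual sketch, not actual
Gluster code; it assumes the client keeps a record of the locks it was
granted so they can be replayed when a brick connection comes back, and
all names and types are made up:

#include <fcntl.h>
#include <stddef.h>

struct held_lock {
    int               fd;    /* fd the lock was taken on      */
    struct flock      fl;    /* lock type/range as granted    */
    struct held_lock *next;
};

/* Replay every recorded lock after the brick reconnects. fcntl() stands
 * in here for whatever protocol request the client would actually send. */
static int reacquire_locks(struct held_lock *locks)
{
    for (struct held_lock *l = locks; l != NULL; l = l->next) {
        if (fcntl(l->fd, F_SETLK, &l->fl) < 0)
            return -1;   /* lock lost in the meantime: mark fd bad */
    }
    return 0;
}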
>
>
> However, this use case is not affected if the application doesn't
> acquire any POSIX locks. So, I am interested in knowing:
> * whether your use cases use POSIX locks?
> * whether it is feasible for your application to re-open fds and
> re-acquire locks on seeing EBADFD errors?
>
>
> I think that many applications are not prepared to handle that.
+1 to all the points mentioned by Xavi. This has been a day-1 issue for
all the applications using locks (like NFS-Ganesha and Samba). Not many
applications re-open fds and re-acquire locks (the kind of handling
sketched below); on receiving EBADFD, the error is most likely
propagated to the application clients.
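For illustration, the handling an application would need looks roughly
like this; it is only a sketch, and the error code to check and the
retry policy are assumptions rather than a final design:

#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Write to a locked file; if the fd was marked bad after a disconnect,
 * re-open the file, re-acquire the lock and retry once. A real
 * application would also have to restore the file offset and decide
 * whether retrying is even safe for its data. */
static int write_with_relock(const char *path, int *fdp,
                             const void *buf, size_t len)
{
    if (write(*fdp, buf, len) >= 0)
        return 0;

    if (errno != EBADFD && errno != EBADF)
        return -1;                          /* unrelated failure */

    close(*fdp);
    *fdp = open(path, O_RDWR);              /* re-open the fd */
    if (*fdp < 0)
        return -1;

    struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
    if (fcntl(*fdp, F_SETLKW, &fl) < 0)     /* re-acquire the lock */
        return -1;

    return write(*fdp, buf, len) >= 0 ? 0 : -1;
}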
Agree with Xavi that it's better to heal/re-acquire the locks on brick
reconnect, before the brick accepts any fresh requests. I also suggest
making this healing mechanism generic enough (if possible) to heal any
server-side state (like upcall, leases, etc.).
Thanks,
Soumya
>
> Xavi
>
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1689375#c7
>
> regards,
> Raghavendra
>