[Gluster-devel] Locking behavior vs rmdir/unlink of a directory/file
Raghavendra Gowdappa
rgowdapp at redhat.com
Thu Aug 20 05:20:05 UTC 2015
To put the problem in simple words: a lock is granted by the posix-locks xlator even after the directory has been deleted on the backend.
----- Original Message -----
> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> To: "Gluster Devel" <gluster-devel at gluster.org>
> Cc: "Sakshi Bansal" <sabansal at redhat.com>
> Sent: Thursday, August 20, 2015 10:31:55 AM
> Subject: Re: [Gluster-devel] Locking behavior vs rmdir/unlink of a directory/file
>
>
>
> ----- Original Message -----
> > From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> > To: "Gluster Devel" <gluster-devel at gluster.org>
> > Cc: "Sakshi Bansal" <sabansal at redhat.com>
> > Sent: Thursday, August 20, 2015 10:24:46 AM
> > Subject: [Gluster-devel] Locking behavior vs rmdir/unlink of a
> > directory/file
> >
> > Hi all,
> >
> > Most of the code currently treats the inode table (and the dentry structure
> > associated with it) as the correct representation of the underlying backend
> > file-system. While this is correct in most cases, the representation might
> > be out of sync for small time windows (for example, a file is deleted on
> > disk, but its dentry and inode are not yet removed from our inode table).
> > While working on locking directories in dht for better consistency, we ran
> > into one such issue. The issue is basically to make rmdir and directory
> > creation during dht-selfheal mutually exclusive. The idea is to take a
> > blocking inodelk on the inode before proceeding with rmdir or directory
> > self-heal. However, consider the following scenario:
> >
> > 1. (dht_)rmdir acquires a lock.
> > 2. lookup-selfheal tries to acquire a lock, but is blocked on the lock
> > acquired by rmdir.
> > 3. rmdir deletes the directory and releases the lock. It's possible for the
> > inode to remain in the inode table, searchable through gfid, as long as
> > there is a positive reference count on it. In this case the lock request
> > (by lookup) and the granted lock (to rmdir) keep the inode in the inode
> > table even after rmdir.
>
> as both of them have a refcount each on inode.
>
> > 4. lock request issued by lookup is granted.
> >
> > Note that at step 4, it's still possible that rmdir is in progress from
> > dht's perspective (it has just completed on one node). However, this is
> > precisely the situation we wanted to avoid, i.e., we wanted to block and
> > fail dht-selfheal instead of allowing it to proceed.
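
The four steps above can be replayed in a toy model of a single brick. This is only a sketch for illustration: the class, its fields and the "gfid-1" value are all made up here and are not GlusterFS APIs. It assumes that a queued lock request and a granted lock each hold one ref on the inode, as described in the scenario.

```python
# Toy model of one brick. All names are hypothetical, not GlusterFS code.
from collections import deque


class Brick:
    def __init__(self):
        self.backend = set()     # gfids that exist on the backend file-system
        self.inode_table = {}    # gfid -> refcount (in-memory representation)
        self.holder = {}         # gfid -> owner of the granted lock
        self.waiters = {}        # gfid -> queue of blocked lock requests

    def mkdir(self, gfid):
        self.backend.add(gfid)
        self.inode_table[gfid] = 0

    def inodelk(self, gfid, owner):
        """Blocking request: granted at once or queued; takes one inode ref."""
        if gfid not in self.inode_table:
            return "ESTALE"
        self.inode_table[gfid] += 1
        if gfid not in self.holder:
            self.holder[gfid] = owner
            return "granted"
        self.waiters.setdefault(gfid, deque()).append(owner)
        return "blocked"

    def rmdir(self, gfid):
        """Remove on the backend; the inode survives while it has refs."""
        self.backend.discard(gfid)
        if self.inode_table.get(gfid) == 0:
            del self.inode_table[gfid]

    def unlock(self, gfid, owner):
        """Release the lock and hand it to the next waiter, if any."""
        assert self.holder.get(gfid) == owner
        del self.holder[gfid]
        self.inode_table[gfid] -= 1
        if self.inode_table[gfid] == 0 and gfid not in self.backend:
            del self.inode_table[gfid]
        queue = self.waiters.get(gfid)
        if queue:
            self.holder[gfid] = queue.popleft()
            return self.holder[gfid]
        return None


brick = Brick()
brick.mkdir("gfid-1")
step1 = brick.inodelk("gfid-1", "rmdir")            # step 1
step2 = brick.inodelk("gfid-1", "lookup-selfheal")  # step 2
brick.rmdir("gfid-1")                               # step 3: gone on backend
step4 = brick.unlock("gfid-1", "rmdir")             # step 4: next waiter wins
print(step1, step2, step4, "gfid-1" in brick.backend)
# prints: granted blocked lookup-selfheal False
```

The last line shows the problem: lookup-selfheal ends up holding a lock on an inode whose directory no longer exists on the backend, because the grant path only consults the in-memory table.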
> >
> > In this scenario, at step 4 the directory has been removed on the backend
> > file-system, but its representation is still present in the inode table. We
> > tried to solve this by doing a lookup on gfid before granting a lock [1].
> > However, [1] has two costs:
> >
> > 1. We no longer treat the inode table as the source of truth, unlike other
> > non-lookup code.
> > 2. There is a performance hit: a lookup on the backend file-system for
> > _every_ granted lock. This may not be significant, considering that no
> > network call is involved.
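
A minimal sketch of the idea behind [1], under stated assumptions: the class and method names are hypothetical, the "ESTALE"/"EAGAIN" strings are an arbitrary choice of return value for this illustration, and the real patch may look quite different. It only shows the shape of the check and where the per-grant cost comes from.

```python
# Hypothetical sketch: validate against the backend before granting a lock.
class LockTable:
    def __init__(self, backend_gfids):
        self.backend = set(backend_gfids)  # stand-in for the backend file-system
        self.holder = {}                   # gfid -> owner of the granted lock
        self.backend_lookups = 0           # counts the extra lookup per grant

    def exists_on_backend(self, gfid):
        self.backend_lookups += 1          # local call, no network involved
        return gfid in self.backend

    def grant(self, gfid, owner):
        if gfid in self.holder:
            return "EAGAIN"                # someone else holds the lock
        if not self.exists_on_backend(gfid):
            return "ESTALE"                # in-memory inode outlived the directory
        self.holder[gfid] = owner
        return "granted"

    def release(self, gfid):
        del self.holder[gfid]


table = LockTable({"gfid-1"})
first = table.grant("gfid-1", "rmdir")             # directory exists: granted
table.backend.discard("gfid-1")                    # rmdir removes it on disk
table.release("gfid-1")
second = table.grant("gfid-1", "lookup-selfheal")  # refused after backend check
print(first, second, table.backend_lookups)
# prints: granted ESTALE 2
```

Every grant attempt that reaches the check pays one backend lookup, which is the performance cost in point 2, and the answer now comes from the backend rather than the inode table, which is point 1.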
> >
> > There are other ways dht could have avoided the above scenario altogether,
> > with different trade-offs that we didn't want to make. A few alternatives:
> > 1. Use entrylk during lookup-selfheal and rmdir. This fits naturally, as
> > both are entry operations. However, dht-selfheal also sets layouts, which
> > should be synchronized with other operations where we don't have name
> > information. tl;dr: we wanted to avoid using entrylk for reasons that are
> > out of scope for this problem.
> > 2. Use non-blocking inodelk in dht during lookup-selfheal. This solves the
> > problem for most practical cases, but theoretically a race can still
> > exist.
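
To illustrate alternative 2, here is a small sketch of non-blocking semantics. Everything in it is hypothetical (names, return strings, the single-inode model). The second half shows one interleaving where a race could remain; that interleaving is an assumption made for this example, not something spelled out above.

```python
# Hypothetical single-inode model of a non-blocking inodelk.
class Inode:
    def __init__(self, extra_refs=0):
        self.on_backend = True
        self.refs = extra_refs   # inode stays in the inode table while refs > 0
        self.holder = None

    def try_inodelk(self, owner):
        """Non-blocking: fail at once instead of queueing (no waiter ref)."""
        if self.refs == 0 and not self.on_backend:
            return "ESTALE"      # inode already dropped from the table
        if self.holder is not None:
            return "EAGAIN"
        self.holder = owner
        self.refs += 1
        return "granted"

    def rmdir_and_unlock(self):
        self.on_backend = False
        self.holder = None
        self.refs -= 1


# Common case: lookup-selfheal fails fast while rmdir holds the lock.
inode = Inode()
inode.try_inodelk("rmdir")
during = inode.try_inodelk("lookup-selfheal")   # "EAGAIN"
inode.rmdir_and_unlock()
after = inode.try_inodelk("lookup-selfheal")    # "ESTALE"

# Assumed residual window: some other ref keeps the inode in the table.
pinned = Inode(extra_refs=1)
pinned.try_inodelk("rmdir")
pinned.rmdir_and_unlock()
late = pinned.try_inodelk("lookup-selfheal")    # "granted" on a removed dir
print(during, after, late)
# prints: EAGAIN ESTALE granted
```

Because a non-blocking request never waits, it does not itself pin the inode across rmdir, which is why the common case is handled; the last call shows that a grant is still possible whenever something else keeps the stale inode alive.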
> >
> > To summarize, the problem of granted locks versus unlink/rmdir still
> > remains, and I am not sure what exactly the behavior of posix-locks should
> > be in that scenario. Input by way of review on [1] is greatly appreciated.
> >
> > [1] http://review.gluster.org/#/c/11916/
> >
> > regards,
> > Raghavendra.
> > _______________________________________________
> > Gluster-devel mailing list
> > Gluster-devel at gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel
> >