[Gluster-devel] Report ESTALE as ENOENT

Raghavendra G raghavendra at gluster.com
Mon Feb 26 05:50:49 UTC 2018


On Fri, Feb 23, 2018 at 6:33 AM, J. Bruce Fields <bfields at fieldses.org>
wrote:

> On Thu, Feb 22, 2018 at 01:17:58PM +0530, Raghavendra G wrote:
> > On Wed, Oct 11, 2017 at 7:32 PM, J. Bruce Fields <bfields at fieldses.org>
> > wrote:
> >
> > > On Wed, Oct 11, 2017 at 04:11:51PM +0530, Raghavendra G wrote:
> > > > On Thu, Mar 31, 2016 at 1:22 AM, J. Bruce Fields <bfields at fieldses.org>
> > > > wrote:
> > > >
> > > > > On Mon, Mar 28, 2016 at 04:21:00PM -0400, Vijay Bellur wrote:
> > > > > > I would prefer to:
> > > > > >
> > > > > > 1. Return ENOENT for all system calls that operate on a path.
> > > > > >
> > > > > > 2. ESTALE might be ok for file descriptor based operations.
> > > > >
> > > > > Note that operations which operate on paths can fail with ESTALE when
> > > > > they attempt to look up a component within a directory that no longer
> > > > > exists.
> > > > >
> > > >
> > > > But "man 2 rmdir" and "man 2 unlink" don't list ESTALE as a valid
> > > > error.
> > >
> > > In fact, almost no man pages list ESTALE as a valid error:
> > >
> > >         [bfields at patate man-pages]$ git grep ESTALE
> > >         Changes.old:        Change description for ESTALE
> > >         man2/open_by_handle_at.2:.B ESTALE
> > >         man2/open_by_handle_at.2:.B ESTALE
> > >         man3/errno.3:.B ESTALE
> > >
> > > Cc'ing Michael Kerrisk for advice.  Is there some reason for that, or
> > > can we fix those man pages?
> > >
> > > > Also, rm doesn't seem to handle ESTALE either [3]
> > > >
> > > > [4] https://github.com/coreutils/coreutils/blob/master/src/remove.c#L305
> > >
> > > I *think* that code is just deciding whether a given error should be
> > > silently ignored in the rm -f case.  I don't think -ESTALE (indicating
> > > the directory is bad) is such an error, so I think this code is correct.
> > > But my understanding may be wrong.
> > >
> >
> > For a local filesystem, we may not end up with ESTALE errors. But when
> > rmdir is executed from multiple clients of a network fs (like NFS,
> > Glusterfs), unlink or rmdir can easily fail with ESTALE because another rm
> > invocation could already have deleted the target. I think this is what
> > has happened in bugs like:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1546717
> > https://bugzilla.redhat.com/show_bug.cgi?id=1245065
> >
> > This was in fact the earlier motivation for converting ESTALE into ENOENT,
> > so that rm would ignore it. Now that I have reverted the fix, it looks like
> > the bug has promptly resurfaced :)
> >
> > There is one glitch, though. Bug 1245065 mentions that some parts of the
> > directory structure remain undeleted. From my understanding, at least one
> > instance of rm (the one racing ahead of all the others and causing them to
> > fail) should have deleted the directory structure completely. However, I
> > need to understand the directory traversal done by rm to find out whether
> > there is a cyclic dependency between two rms that causes both of them to
> > fail.
>
> I don't see how you could avoid that.  The clients are each caching
> multiple subdirectories of the tree, and there's no guarantee that 1
> client has fresher caches of every subdirectory.  There's also no
> guarantee that the client that's ahead stays ahead--another client that
> sees which objects the first client has already deleted can leapfrog
> ahead.
>

What are the drawbacks of applications (like rm) treating ESTALE as
equivalent to ENOENT? It seems to me that, from the application's
perspective, they both convey similar information. If rm could ignore ESTALE
just as it does ENOENT, we probably wouldn't run into this issue.
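
To make the question concrete, here is a minimal Python sketch of the
behaviour I have in mind. It is only an illustration: the helper name
remove_ignoring_gone and the IGNORABLE set are made up for this mail and are
not what coreutils does today. It also deliberately glosses over the point
raised earlier in the thread, that ESTALE on a path operation can mean the
parent directory is bad rather than that the target itself is gone.

```python
import errno
import os

# Hypothetical policy for illustration only: errno values that a
# forced removal would treat as "the target is already gone".
IGNORABLE = {errno.ENOENT, errno.ESTALE}


def remove_ignoring_gone(path):
    """Unlink path.

    Returns True if this call removed the file, False if the unlink
    failed with an errno in IGNORABLE. Any other error is re-raised.
    """
    try:
        os.unlink(path)
        return True
    except OSError as exc:
        if exc.errno in IGNORABLE:
            return False
        raise
```

With this policy, a second rm that loses the race sees False instead of an
error, whichever of the two errno values the filesystem reports.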


> I think the solution is just not to do that--NFS clients aren't really
> equipped to handle directory operations on directories that are deleted
> out from under them, and there probably aren't any hacks on the server
> side that will fix that.  If there's a real need for this kind of case,
> we may need to work on the protocol itself.  For now all we may be able
> to do is educate users about what NFS can and can't do.
>
> --b.



-- 
Raghavendra G