[Gluster-devel] Report ESTALE as ENOENT

Wed Oct 11 14:02:53 UTC 2017

On Wed, Oct 11, 2017 at 04:11:51PM +0530, Raghavendra G wrote:
> On Thu, Mar 31, 2016 at 1:22 AM, J. Bruce Fields <bfields at fieldses.org>
> wrote:
> 
> > On Mon, Mar 28, 2016 at 04:21:00PM -0400, Vijay Bellur wrote:
> > > I would prefer to:
> > >
> > > 1. Return ENOENT for all system calls that operate on a path.
> > >
> > > 2. ESTALE might be ok for file descriptor based operations.
> >
> > Note that operations which operate on paths can fail with ESTALE when
> > they attempt to look up a component within a directory that no longer
> > exists.
> >
> 
> But, "man 2 rmdir"  or "man 2 unlink" doesn't list ESTALE as a valid error.

In fact, almost no man pages list ESTALE as a valid error:

	[bfields at patate man-pages]$ git grep ESTALE
	Changes.old:        Change description for ESTALE
	man2/open_by_handle_at.2:.B ESTALE
	man2/open_by_handle_at.2:.B ESTALE
	man3/errno.3:.B ESTALE

Cc'ing Michael Kerrisk for advice.  Is there some reason for that, or
can we fix those man pages?

> Also rm doesn't seem to handle ESTALE too [3]
> 
> [4] https://github.com/coreutils/coreutils/blob/master/src/remove.c#L305

I *think* that code is just deciding whether a given error should be
silently ignored in the rm -f case.  I don't think -ESTALE (indicating
the directory is bad) is such an error, so I think this code is correct.
But my understanding may be wrong.

> > Maybe non-creating open("./foo") returning ENOENT would be reasonable in
> > this case since that's what you'd get in the local filesystem case, but
> > creat("./foo") returning ENOENT, for example, isn't something
> > applications will be written to handle.
> >
> > The Linux VFS will retry ESTALE on path-based systemcalls *one* time, to
> > reduce the chance of ESTALE in those cases.
> 
> 
> I should've anticipated bug [2] due to this comment. My mistake. Bug [2] is
> indeed due to kernel not retrying open on receiving an ENOENT error.
> Glusterfs sent ENOENT because file's inode-number/nodeid changed but same
> path exists. The correct error would've been ESTALE, but due to our
> conversion of ESTALE to ENOENT, the latter was sent back to kernel.
> 
> Looking through kernel VFS code, only open *seems* to retry
> (do_filep_open). I couldn't find similar logic to other path based syscalls
> like rmdir, unlink, stat, chmod etc

I believe there is a retry in those cases, but I'm not sure exactly
where it is.  Looking around.... See the retry_estale() checks sprinkled
around namei.c, which were added by Jeff Layton a few years ago.

--b.