[Gluster-devel] Report ESTALE as ENOENT

J. Bruce Fields bfields at fieldses.org
Wed Mar 30 19:52:46 UTC 2016


On Mon, Mar 28, 2016 at 04:21:00PM -0400, Vijay Bellur wrote:
> On 03/28/2016 09:34 AM, FNU Raghavendra Manjunath wrote:
> >
> >I can understand the concern. But I think instead of generally
> >converting all the ESTALE errors ENOENT, probably we should try to
> >analyze the errors that are generated by lower layers (like posix).
> >
> >Even fuse kernel module some times returns ESTALE. (Well, I can see it
> >returning ESTALE errors in some cases in the code. Someone please
> >correct me if thats not the case).  And aso I am not sure if converting
> >all the ESTALE errors to ENOENT is ok or not.
> 
> ESTALE in fuse is returned only for export_operations. fuse
> implements this for providing support to export fuse mounts as nfs
> exports. A cursory reading of the source seems to indicate that fuse
> returns ESTALE only in cases where filehandle resolution fails.
> 
> >
> >For fd based operations, I am not sure if ENOENT can be sent or not (as
> >though the file is unlinked, it can be accessed if there were open fds
> >on it before unlinking the file).
> 
> ESTALE should be fine for fd based operations. It would be analogous
> to a filehandle resolution failing and should not be a common
> occurrence.
> 
> >
> >I feel, we have to look into some parts to check if they generating
> >ESTALE is a proper error or not. Also, if there is any bug in below
> >layers fixing which can avoid ESTALE errors, then I feel that would be
> >the better option.
> >
> 
> I would prefer to:
> 
> 1. Return ENOENT for all system calls that operate on a path.
> 
> 2. ESTALE might be ok for file descriptor based operations.

Note that operations which operate on paths can fail with ESTALE when
they attempt to look up a component within a directory that no longer
exists.

Maybe non-creating open("./foo") returning ENOENT would be reasonable in
this case since that's what you'd get in the local filesystem case, but
creat("./foo") returning ENOENT, for example, isn't something
applications will be written to handle.

The Linux VFS will retry ESTALE on path-based systemcalls *one* time, to
reduce the chance of ESTALE in those cases.  The bugzilla entry that
tracked those patches might be interesting:

	https://bugzilla.redhat.com/show_bug.cgi?id=678544

> NFS recommends that applications add special code for handling
> ESTALE [1]. Unfortunately changing application code is not easy and
> hence it does not come as a surprise that coreutils also does not
> accommodate ESTALE.

We also need to consider whether the application's handling of the
ENOENT case could be incorrect for the ESTALE case, with consequences
possibly as bad as or worse than consequences of seeing an unexpected
error.

My first intuition is that translating ESTALE to ENOENT is less safe
than not doing so, because:

	- once an ESTALE-unaware application his the ESTALE case, we
	  risk a bug regardless of which we return, but if we return
	  ESTALE at least the problem should be more obvious to the
	  person debugging.
	- for ESTALE-aware applications, the ESTALE/ENOENT distinction
	  is useful.

But I haven't really thought through examples.

--b.


More information about the Gluster-devel mailing list