[Gluster-devel] Report ESTALE as ENOENT
rwheeler at redhat.com
Wed Oct 18 06:39:37 UTC 2017
Adding in Eric and Steve.
On Oct 18, 2017 9:35 AM, "Raghavendra G" <raghavendra at gluster.com> wrote:
On Wed, Oct 11, 2017 at 4:11 PM, Raghavendra G <raghavendra at gluster.com>
> We ran into a regression . Hence reviving this thread.
>  https://bugzilla.redhat.com/show_bug.cgi?id=1500269
>  https://review.gluster.org/18463
> On Thu, Mar 31, 2016 at 1:22 AM, J. Bruce Fields <bfields at fieldses.org>
>> On Mon, Mar 28, 2016 at 04:21:00PM -0400, Vijay Bellur wrote:
>> > On 03/28/2016 09:34 AM, FNU Raghavendra Manjunath wrote:
>> > >
>> > >I can understand the concern. But I think instead of generally
>> > >converting all the ESTALE errors ENOENT, probably we should try to
>> > >analyze the errors that are generated by lower layers (like posix).
>> > >
>> > >Even fuse kernel module some times returns ESTALE. (Well, I can see it
>> > >returning ESTALE errors in some cases in the code. Someone please
>> > >correct me if thats not the case). And aso I am not sure if converting
>> > >all the ESTALE errors to ENOENT is ok or not.
>> > ESTALE in fuse is returned only for export_operations. fuse
>> > implements this for providing support to export fuse mounts as nfs
>> > exports. A cursory reading of the source seems to indicate that fuse
>> > returns ESTALE only in cases where filehandle resolution fails.
>> > >
>> > >For fd based operations, I am not sure if ENOENT can be sent or not (as
>> > >though the file is unlinked, it can be accessed if there were open fds
>> > >on it before unlinking the file).
>> > ESTALE should be fine for fd based operations. It would be analogous
>> > to a filehandle resolution failing and should not be a common
>> > occurrence.
>> > >
>> > >I feel, we have to look into some parts to check if they generating
>> > >ESTALE is a proper error or not. Also, if there is any bug in below
>> > >layers fixing which can avoid ESTALE errors, then I feel that would be
>> > >the better option.
>> > >
>> > I would prefer to:
>> > 1. Return ENOENT for all system calls that operate on a path.
>> > 2. ESTALE might be ok for file descriptor based operations.
>> Note that operations which operate on paths can fail with ESTALE when
>> they attempt to look up a component within a directory that no longer
> But, "man 2 rmdir" or "man 2 unlink" doesn't list ESTALE as a valid
> error. Also rm doesn't seem to handle ESTALE too 
>  https://github.com/coreutils/coreutils/blob/master/src/remove.c#L305
>> Maybe non-creating open("./foo") returning ENOENT would be reasonable in
>> this case since that's what you'd get in the local filesystem case, but
>> creat("./foo") returning ENOENT, for example, isn't something
>> applications will be written to handle.
>> The Linux VFS will retry ESTALE on path-based systemcalls *one* time, to
>> reduce the chance of ESTALE in those cases.
> I should've anticipated bug  due to this comment. My mistake. Bug 
> is indeed due to kernel not retrying open on receiving an ENOENT error.
> Glusterfs sent ENOENT because file's inode-number/nodeid changed but same
> path exists. The correct error would've been ESTALE, but due to our
> conversion of ESTALE to ENOENT, the latter was sent back to kernel.
We've an application which does very frequent renames (around 10-15 per
second). So, single retry by kernel of an open failed with ESTALE is not
Do you know why VFS chose an approach of retrying instead of a stricter
synchronization mechanism using locking? For eg., rename and open could've
been synchronized using a lock.
For eg., a rough psuedocode for open and rename could've been (please
ignore ordering src,dst locks in rename)
rename (src, dst);
With the current retry model, any suggestions on how to handle applications
that do frequent renames?
> Looking through kernel VFS code, only open *seems* to retry
> (do_filep_open). I couldn't find similar logic to other path based syscalls
> like rmdir, unlink, stat, chmod etc
> The bugzilla entry that
>> tracked those patches might be interesting:
>> > NFS recommends that applications add special code for handling
>> > ESTALE . Unfortunately changing application code is not easy and
>> > hence it does not come as a surprise that coreutils also does not
>> > accommodate ESTALE.
>> We also need to consider whether the application's handling of the
>> ENOENT case could be incorrect for the ESTALE case, with consequences
>> possibly as bad as or worse than consequences of seeing an unexpected
>> My first intuition is that translating ESTALE to ENOENT is less safe
>> than not doing so, because:
>> - once an ESTALE-unaware application his the ESTALE case, we
>> risk a bug regardless of which we return, but if we return
>> ESTALE at least the problem should be more obvious to the
>> person debugging.
>> - for ESTALE-aware applications, the ESTALE/ENOENT distinction
>> is useful.
> Another place to not convert is for those cases where kernel retries the
> operation on seeing an ESTALE.
> I guess we need to think through each operation and we cannot ESTALE to
> ENOENT always.
>> But I haven't really thought through examples.
>> Gluster-devel mailing list
>> Gluster-devel at gluster.org
> Raghavendra G
Gluster-devel mailing list
Gluster-devel at gluster.org
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Gluster-devel