[Gluster-devel] Report ESTALE as ENOENT

Mon Mar 28 20:21:00 UTC 2016

On 03/28/2016 09:34 AM, FNU Raghavendra Manjunath wrote:
>
> I can understand the concern. But I think instead of generally
> converting all the ESTALE errors to ENOENT, probably we should try to
> analyze the errors that are generated by lower layers (like posix).
>
> Even the fuse kernel module sometimes returns ESTALE. (Well, I can see
> it returning ESTALE errors in some cases in the code. Someone please
> correct me if that's not the case.) And also I am not sure if
> converting all the ESTALE errors to ENOENT is ok or not.

ESTALE in fuse is returned only for export_operations. fuse implements 
this for providing support to export fuse mounts as nfs exports. A 
cursory reading of the source seems to indicate that fuse returns ESTALE 
only in cases where filehandle resolution fails.

>
> For fd based operations, I am not sure if ENOENT can be sent or not
> (even though the file is unlinked, it can still be accessed if there
> were open fds on it before the unlink).

ESTALE should be fine for fd based operations. It would be analogous to 
a filehandle resolution failing and should not be a common occurrence.

>
> I feel we have to look into some parts to check whether the ESTALE
> they generate is a proper error or not. Also, if there is any bug in
> the lower layers whose fix would avoid ESTALE errors, then I feel that
> would be the better option.
>

I would prefer to:

1. Return ENOENT for all system calls that operate on a path.

2. ESTALE might be ok for file descriptor based operations.
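
To make the two points above concrete, here is a minimal sketch (in Python, for illustration only) of the translation applied just before an errno is handed back to the application. The function name and the is_path_op flag are hypothetical and do not correspond to any existing glusterfs code.

```python
import errno


def errno_for_application(op_errno, is_path_op):
    """Hypothetical helper: choose the errno handed back to the application.

    is_path_op is True for syscalls that take a path (stat, unlink, ...)
    and False for syscalls that take an open file descriptor (fstat, ...).
    """
    if op_errno == errno.ESTALE and is_path_op:
        # The name could not be resolved; report it the way a local
        # filesystem would.
        return errno.ENOENT
    # fd based operations and every other errno pass through unchanged.
    return op_errno
```

With this, a path based call that hit ESTALE would surface ENOENT, while the same error on an fd based call would still surface as ESTALE.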

NFS recommends that applications add special code for handling ESTALE 
[1]. Unfortunately changing application code is not easy and hence it 
does not come as a surprise that coreutils also does not accommodate 
ESTALE. I would not like to use NFS as a precedent for us to be commonly 
returning ESTALE back to applications.
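
For reference, one plausible form of such special handling is sketched below (Python, illustration only): on ESTALE, the application repeats the path based call so that the name gets resolved afresh. The helper name and the retry count are my own assumptions and are not taken from the NFS FAQ.

```python
import errno
import os


def open_retrying_on_estale(path, flags=os.O_RDONLY, retries=1):
    """Hypothetical helper: open path, retrying if the call fails with ESTALE."""
    attempt = 0
    while True:
        try:
            return os.open(path, flags)
        except OSError as err:
            # Give up on any other error, or once the retries are used up.
            if err.errno != errno.ESTALE or attempt >= retries:
                raise
            # A fresh call makes the filesystem resolve the path again.
            attempt += 1
```

Every call site that can see ESTALE would need a wrapper of this kind, which is why existing applications rarely have one.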

Regards,
Vijay

[1] A10 of http://nfs.sourceforge.net/

>
>
> On Mon, Mar 28, 2016 at 1:39 AM, Prashanth Pai <ppai at redhat.com>
> wrote:
>
>     TL;DR: +1 to report ESTALE as ENOENT
>
>     While ESTALE is an acceptable errno for NFS clients, it's not so
>     much for FUSE clients. Many applications that talk to a FUSE mount
>     do not handle ESTALE and expect the behavior to be analogous to
>     that of local filesystems such as XFS. While it's okay for a brick
>     to send ESTALE to the glusterfs client stack, one has to be very
>     careful about the errno returned by FUSE back to applications.
>
>     For example, syscalls such as fgetxattr are not expected (at least
>     per the manpage) to throw ESTALE, but with glusterfs, fgetxattr
>     does [1]. Further, POSIX guarantees that once an application has a
>     valid fd, operations like fgetxattr() on the fd should succeed even
>     after another application (client) issues an unlink().
>
>     [1]:http://paste.openstack.org/show/335506/
>
>     Regards,
>       -Prashanth Pai
>
>     ----- Original Message -----
>      > From: "FNU Raghavendra Manjunath" <rabhat at redhat.com>
>      > To: "Soumya Koduri" <skoduri at redhat.com>
>      > Cc: "Ira Cooper" <icooper at redhat.com>, "Gluster Devel" <gluster-devel at gluster.org>
>      > Sent: Thursday, March 24, 2016 8:11:19 PM
>      > Subject: Re: [Gluster-devel] Report ESTALE as ENOENT
>      >
>      >
>      > I would still prefer not converting all the ESTALE to ENOENT. I
>      > think we need to understand this specific case of parallel rm -rfs
>      > getting ESTALE errors and handle it accordingly.
>      >
>      > Regarding gfapi not honoring the ESTALE errors, I think it would
>      > be better to do revalidates upon getting ESTALE.
>      >
>      > Just my 2 cents.
>      >
>      > Regards,
>      > Raghavendra
>      >
>      >
>      > On Thu, Mar 24, 2016 at 10:31 AM, Soumya Koduri
>      > <skoduri at redhat.com> wrote:
>      >
>      >
>      > Thanks for the information.
>      >
>      > On 03/24/2016 07:34 PM, FNU Raghavendra Manjunath wrote:
>      >
>      >
>      >
>      > Yes. I think the caching example mentioned by Shyam is a good
>      > example of an ESTALE error. Also, User Serviceable Snapshots (USS)
>      > relies heavily on ESTALE errors, because the files/directories
>      > from the snapshots are assigned a virtual gfid on the fly when
>      > being looked up. If those inodes are purged out of the inode table
>      > due to the lru list becoming full, then an access to that gfid
>      > from the client will make snapview-server send ESTALE, and either
>      > fuse (I think our fuse xlator does a revalidate upon getting
>      > ESTALE) or the NFS client can revalidate via path based
>      > resolution.
>      >
>      > So wouldn't it be wrong not to send ESTALE to NFS clients and to
>      > map it to ENOENT instead, as was intended in the original mail?
>      >
>      > The NFSv3 RFC [1] mentions that NFS3ERR_STALE is a valid error for
>      > the REMOVE fop.
>      >
>      > Also (at least in gfapi) the resolve code path doesn't seem to be
>      > honoring ESTALE errors - glfs_resolve_component(..),
>      > glfs_refresh_inode_safe(..) etc. Do we need to fix them?
>      >
>      >
>      > Thanks,
>      > Soumya
>      >
>      > [1] https://www.ietf.org/rfc/rfc1813.txt (section# 3.3.12)
>      >
>      >
>      >
>      >
>      > Regards,
>      > Raghavendra
>      >
>      >
>      > On Thu, Mar 24, 2016 at 9:51 AM, Shyam <srangana at redhat.com>
>      > wrote:
>      >
>      > On 03/23/2016 12:07 PM, Ravishankar N wrote:
>      >
>      > On 03/23/2016 09:16 PM, Soumya Koduri wrote:
>      >
>      > If it occurs only when the file/dir is not actually present at
>      > the back-end, shouldn't we fix the server to send ENOENT then?
>      >
>      > I never fully understood it; here is the answer:
>      > http://review.gluster.org/#/c/6318/
>      >
>      >
>      > The intention of ESTALE is to state that the inode#/GFID is stale
>      > when it is used for any operation. IOW, we did not find the GFID
>      > in the backend, but that does not mean the name is not present.
>      > Hence, if you have a pGFID+bname, try resolving with that.
>      >
>      > For example, a client side cache can hang onto a GFID for a bname,
>      > but another client could have, in the interim, unlinked the bname
>      > and created a new file there.
>      >
>      > A presence test using the GFID by the client that cached the
>      > result the first time yields ESTALE. But a resolution based on
>      > pGFID+bname again by the same client would be a success.
>      >
>      > By extension, a GFID based resolution will return ESTALE when the
>      > GFID is not really present in the backend. It could very well mean
>      > ENOENT, but that has to be determined by the client again, if
>      > possible.
>      >
>      > See "A10. What does it mean when my application fails because of an
>      > ESTALE error?" for NFS here [1] and [2] (there may be better sources
>      > for this information).
>      >
>      > [1] http://nfs.sourceforge.net/
>      > [2] https://lwn.net/Articles/272684/
>      >
>      >
>      >
>      > _______________________________________________
>      > Gluster-devel mailing list
>      > Gluster-devel at gluster.org
>      > http://www.gluster.org/mailman/listinfo/gluster-devel
>