[Gluster-devel] Stale NFS file handle, then EINVAL
Emmanuel Dreyfus
manu at netbsd.org
Fri Jul 22 01:55:05 UTC 2011
Pavan T C <tcp at gluster.com> wrote:
> When lookup is sent, the details of the ondisk file (particularly gfid)
> is fetched. In the lookup callback, a comparison with this gfid is made
> with one in the local inode cache. If this inode is found in the local
> cache and the local gfid differs from the one obtained from the disk,
> ESTALE is returned to FUSE. FUSE will consider this to be a revalidate
> case, and should send a revalidate lookup to update its cache, and
> should *not* pass this error back to VFS. If the error is passed back to
> VFS, the application can report "Stale NFS file handle".
The problem here is not with LOOKUP, but with READDIR. I thought that there was
some race condition and that the offending files were changing, but this is not
the case: it always happens on the same files, and I see no bug if they are
found by LOOKUP. Only READDIR can trigger the problem.
Discovering Makefile.am by READDIR gets ESTALE:
client# umount /gfs && mount /gfs
client# cd /gfs/usr/src/gnu/dist/binutils/bfd/ && ls -l
ls: ChangeLog-9193: Stale NFS file handle
ls: ChangeLog-9697: Stale NFS file handle
ls: Makefile.am: Stale NFS file handle
(...)
-rw-r--r-- 1 root wheel 18009 Nov 26 2003 COPYING
drwxr-xr-x 2 root wheel 1024 Nov 6 2010 CVS
-rw-r--r-- 1 root wheel 256777 Feb 2 2006 ChangeLog
-rw-r--r-- 1 root wheel 350400 Nov 26 2003 ChangeLog-0001
-rw-r--r-- 1 root wheel 442601 Dec 8 2004 ChangeLog-0203
-rw-r--r-- 1 root wheel 411353 Nov 26 2003 ChangeLog-9495
-rw-r--r-- 1 root wheel 206494 Nov 26 2003 ChangeLog-9899
(...)
Discovering Makefile.am by LOOKUP is fine:
client# umount /gfs && mount /gfs
client# cd /gfs/usr/src/gnu/dist/binutils/bfd/ && ls -l Makefile.am
-rw-r--r-- 1 root wheel 66102 Feb 2 2006 Makefile.am
On the backend, I can tell the difference between files that trigger the bug
and the others: the bad ones have a gfid out of sync with their linkto
counterpart on the other replica. The difference never heals. Here is a file
that triggers an ESTALE when I discover it through READDIR:
server# ls -l /export/*/usr/src/gnu/dist/binutils/bfd/Makefile.am
---------T 1 root wheel 0 Jul 22 03:33
/export/wd1a/usr/src/gnu/dist/binutils/bfd/Makefile.am
-rw-r--r-- 1 root wheel 66102 Feb 2 2006
/export/wd3a/usr/src/gnu/dist/binutils/bfd/Makefile.am
server# getextattr -x trusted.gfid \
/export/*/usr/src/gnu/dist/binutils/bfd/Makefile.am
/export/wd1a/usr/src/gnu/dist/binutils/bfd/Makefile.am
000 3a 4c 41 78 86 59 4f ab 97 65 d5 a6 f8 c5 5f 4d :LAx.YO..e...._M
/export/wd3a/usr/src/gnu/dist/binutils/bfd/Makefile.am
000 61 37 c5 59 90 83 42 88 a7 b8 ee 86 58 66 1f 13 a7.Y..B.....Xf..
And here is a file that has no problem:
server# ls -l /export/*/usr/src/gnu/dist/binutils/bfd/COPYING
---------T 1 root wheel 0 Jul 19 09:45
/export/wd1a/usr/src/gnu/dist/binutils/bfd/COPYING
-rw-r--r-- 1 root wheel 18009 Nov 26 2003
/export/wd3a/usr/src/gnu/dist/binutils/bfd/COPYING
server# getextattr -x trusted.gfid \
/export/*/usr/src/gnu/dist/binutils/bfd/COPYING
/export/wd1a/usr/src/gnu/dist/binutils/bfd/COPYING
000 0b fb b5 6a 7d c0 4d 22 98 f8 9d 6c 64 5b ab b7 ...j}.M"...ld[..
/export/wd3a/usr/src/gnu/dist/binutils/bfd/COPYING
000 0b fb b5 6a 7d c0 4d 22 98 f8 9d 6c 64 5b ab b7 ...j}.M"...ld[..
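To make the comparison above less error-prone than reading hex by eye, here is a minimal sketch that turns one dump line into a UUID. It assumes the line layout shown in this message (an offset, 16 hex bytes, then an ASCII column); the helper name gfid_from_hexdump is hypothetical and not part of any GlusterFS tool.

```python
import uuid

def gfid_from_hexdump(line: str) -> uuid.UUID:
    """Parse one 'getextattr -x' dump line into a UUID.

    Assumed layout (taken from the output quoted in this message):
    offset, 16 hex bytes, ASCII rendering.
    """
    fields = line.split()
    # fields[0] is the offset; the next 16 fields are the gfid bytes.
    raw = bytes(int(byte, 16) for byte in fields[1:17])
    return uuid.UUID(bytes=raw)

# The two Makefile.am dump lines quoted above.
wd1a = gfid_from_hexdump(
    "000 3a 4c 41 78 86 59 4f ab 97 65 d5 a6 f8 c5 5f 4d :LAx.YO..e...._M")
wd3a = gfid_from_hexdump(
    "000 61 37 c5 59 90 83 42 88 a7 b8 ee 86 58 66 1f 13 a7.Y..B.....Xf..")

print(wd1a, wd3a, "in sync" if wd1a == wd3a else "OUT OF SYNC")
```

Feeding it the two COPYING lines instead yields equal UUIDs, which matches the observation that COPYING never triggers ESTALE.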
As I understand, on READDIR the glusterfs server reports the gfid of the
shadow linkto file to the client, and subsequent file usage will report the
correct gfid, leading to the mismatch.
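A toy model of that hypothesis, to check that it explains both transcripts. This is only a sketch of my understanding, not glusterfs code: the InodeCache class and the simulate function are hypothetical names, and the real client may well behave differently.

```python
import errno

LINKTO_GFID = "3a4c4178-8659-4fab-9765-d5a6f8c55f4d"  # wd1a, the linkto file
DATA_GFID = "6137c559-9083-4288-a7b8-ee8658661f13"    # wd3a, the real file

class InodeCache:
    """Hypothetical client-side cache mapping path to gfid."""

    def __init__(self):
        self._gfid_by_path = {}

    def remember(self, path, gfid):
        self._gfid_by_path[path] = gfid

    def check_lookup(self, path, disk_gfid):
        # Mirrors the comparison described in the quoted text: a cached
        # gfid that differs from the on-disk one yields ESTALE.
        cached = self._gfid_by_path.get(path)
        if cached is not None and cached != disk_gfid:
            return errno.ESTALE
        self._gfid_by_path[path] = disk_gfid
        return 0

def simulate(discover_via_readdir):
    cache = InodeCache()
    if discover_via_readdir:
        # Assumption: READDIR hands the client the linkto file's gfid.
        cache.remember("Makefile.am", LINKTO_GFID)
    # The later LOOKUP reports the gfid of the real file.
    return cache.check_lookup("Makefile.am", DATA_GFID)

print(simulate(True) == errno.ESTALE, simulate(False) == 0)
```

In this model the READDIR-first path fails and the LOOKUP-first path succeeds, which is the behaviour seen with "ls -l" versus "ls -l Makefile.am".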
You suggest that the FUSE implementation should filter out ESTALE by inssuing
another LOOKUP? I can implement this, but I have trouble to understand why you
have a bug report on this problem if Linux FUSE does that.
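For reference, this is roughly what I understand the suggested filtering to be, as a sketch only. The lookup and invalidate callables are hypothetical stand-ins for whatever the FUSE layer provides; nothing here is taken from the Linux or NetBSD FUSE sources.

```python
import errno

def lookup_with_revalidate(lookup, invalidate, path):
    """Retry a lookup once when the first attempt reports ESTALE.

    lookup(path) returns (error_number, attributes); invalidate(path)
    drops the cached entry. Both are assumed interfaces for illustration.
    """
    err, attrs = lookup(path)
    if err == errno.ESTALE:
        # Treat ESTALE as a revalidate case: forget the cached entry and
        # ask again instead of passing the error up to the VFS.
        invalidate(path)
        err, attrs = lookup(path)
    return err, attrs
```

A single retry keeps a persistent ESTALE visible to the caller instead of looping forever, which matters here since the gfid difference never heals on its own.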
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu at netbsd.org