[Gluster-devel] inode linking in GlusterFS NFS server
Raghavendra Bhat
rabhat at redhat.com
Tue Jul 8 05:46:31 UTC 2014
On Tuesday 08 July 2014 01:21 AM, Anand Avati wrote:
> On Mon, Jul 7, 2014 at 12:48 PM, Raghavendra Bhat <rabhat at redhat.com> wrote:
>
>
> Hi,
>
>     As per my understanding, the nfs server is not doing inode linking
>     in the readdirp callback. Because of this there can be errors
>     while dealing with virtual inodes (or gfids). As of now the meta,
>     gfid-access and snapview-server (used for user serviceable
>     snapshots) xlators make use of virtual inodes with random gfids.
>     The situation is this:
>
>     Say the user serviceable snapshots feature has been enabled and
>     there are 2 snapshots ("snap1" and "snap2"), and let /mnt/nfs be
>     the nfs mount. The snapshots can be accessed by entering the
>     .snaps directory. If the snap1 directory is entered and *ls -l*
>     is done (i.e. "cd /mnt/nfs/.snaps/snap1" followed by "ls -l"),
>     the readdirp fop is sent to the snapview-server xlator (which is
>     part of a daemon running for the volume). It talks to the
>     corresponding snapshot volume and gets the dentry list, and
>     before unwinding it generates random gfids for those dentries.
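(To make that concrete, here is a simplified sketch of what the
readdirp callback in snapview-server does before unwinding; the
variable names are illustrative, not the exact code:)

    /* simplified sketch: fabricate a random virtual gfid for every
     * entry in the readdirp reply.  uuid_generate() is from
     * <uuid/uuid.h>; gf_dirent_t is the dirent type carried in
     * readdirp replies. */
    gf_dirent_t *entry = NULL;

    list_for_each_entry (entry, &entries->list, list) {
            if (!strcmp(entry->d_name, ".") ||
                !strcmp(entry->d_name, ".."))
                    continue;
            /* random gfid: nothing with this gfid exists on disk */
            uuid_generate(entry->d_stat.ia_gfid);
    }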
>
>     Upon getting the readdirp reply, the nfs server associates each
>     gfid with the filehandle created for that entry, but it sends
>     the readdirp reply back to the nfs client without linking the
>     inodes. The next time the nfs client issues an operation on one
>     of those filehandles, the nfs server tries to resolve the handle
>     by finding the inode for the gfid present in it. Since the inode
>     was not linked in readdirp, the inode_find operation fails, and
>     the server falls back to hard resolution by sending a nameless
>     lookup on that gfid to the normal main graph. (Whether a call
>     should be sent to the main graph or to snapview-server is
>     recorded in the inode context; but here the lookup comes on a
>     gfid with a newly created inode that has no context yet, so the
>     call goes to the main graph.) Since the gfid is a randomly
>     generated virtual gfid, not present on disk, the lookup
>     operation fails with an error.
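(Sketched in code, that resolution path looks roughly like this; the
filehandle field and helper names are assumptions, not the exact
nfs3 code:)

    /* rough sketch of filehandle resolution in the nfs server */
    inode_t *inode = NULL;
    loc_t    loc   = {0, };

    /* soft resolution: look the gfid up in the inode table */
    inode = inode_find(itable, fh->gfid);
    if (inode) {
            loc.inode = inode;          /* found: resolution succeeds */
            uuid_copy(loc.gfid, fh->gfid);
            return 0;
    }

    /* hard resolution: nameless lookup on the bare gfid, sent down
     * the main graph.  For a random virtual gfid this fails, since
     * nothing with that gfid exists on disk. */
    loc.inode = inode_new(itable);
    uuid_copy(loc.gfid, fh->gfid);
    /* ... wind lookup(&loc) to the main graph ... */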
>
> As per my understanding this can happen with any xlator that deals
> with virtual inodes (by generating random gfids).
>
>     I can think of these 2 methods to handle this:
>     1) Do the inode linking for readdirp in the nfs server as well.
>     2) If the lookup operation fails, the snapview-client xlator
>     (which redirects fops aimed at the snapshot world to
>     snapview-server by looking into the inode context) should check
>     whether the failed lookup was a nameless lookup. If so, AND the
>     gfid of the inode is NULL, AND the lookup came from the main
>     graph, then instead of unwinding the lookup with failure it
>     should send the lookup to snapview-server, which should be able
>     to find the inode for that gfid (since it generated the gfid
>     itself, it can find the inode unless the inode has been purged
>     from its inode table). A rough sketch of this check follows
>     below.
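(A rough sketch of that check in snapview-client's lookup callback;
the local/field names here are illustrative assumptions:)

    /* sketch of method 2, inside snapview-client's lookup callback,
     * on seeing a failed reply that came from the main graph */
    if (op_ret < 0 &&
        local->loc.name == NULL &&              /* nameless lookup    */
        uuid_is_null(local->loc.inode->gfid)) { /* fresh inode, no ctx */
            /* instead of unwinding the failure, retry the lookup on
             * the snapview-server subvolume, which generated the gfid
             * and can still find the inode for it */
            xlator_t *virt = this->children->next->xlator;
            STACK_WIND(frame, svc_lookup_cbk, virt,
                       virt->fops->lookup, &local->loc, NULL);
            return 0;
    }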
>
>
> Please let me know if I have missed anything. Please provide feedback.
>
>
>
> That's right. NFS server should be linking readdirp_cbk inodes just
> like FUSE or protocol/server. It has been OK without virtual gfids
> thus far.
I did the changes to link inodes in readdirp_cbk in the nfs server, and
it seems to work fine. Do we also need the second change (i.e. the
change in snapview-client to redirect fresh nameless lookups to
snapview-server)? With the nfs server linking the inodes in readdirp, I
think the second change might not be needed.
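For reference, the change is essentially the same linking pattern that
FUSE and protocol/server already use in their readdirp callbacks (a
simplified sketch, not the exact patch; parent_inode stands for the
directory's inode):

    /* in the nfs server's readdirp callback: link every entry's
     * inode into the inode table before replying, so that later
     * filehandle resolution can find it with inode_find() */
    gf_dirent_t *entry  = NULL;
    inode_t     *linked = NULL;

    list_for_each_entry (entry, &entries->list, list) {
            if (!entry->inode)
                    continue;
            linked = inode_link(entry->inode, parent_inode,
                                entry->d_name, &entry->d_stat);
            if (linked)
                    inode_unref(linked); /* inode_link returns a ref */
    }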
Regards,
Raghavendra Bhat