[Gluster-devel] What functionality is expected from persistent NFS-client tracking?

J. Bruce Fields bfields at fieldses.org
Wed Feb 6 20:24:38 UTC 2013


On Wed, Feb 06, 2013 at 12:06:09PM -0800, Anand Avati wrote:
> On Wed, Feb 6, 2013 at 11:33 AM, J. Bruce Fields <bfields at fieldses.org> wrote:
> 
> > On Wed, Feb 06, 2013 at 08:25:10PM +0100, Niels de Vos wrote:
> > > On Wed, Feb 06, 2013 at 01:54:28PM -0500, J. Bruce Fields wrote:
> > > > On Wed, Feb 06, 2013 at 06:19:56PM +0100, Niels de Vos wrote:
> > > > > On Thu, Jan 31, 2013 at 03:19:28PM -0500, J. Bruce Fields wrote:
> > > > > > On Thu, Jan 31, 2013 at 10:20:27AM +0100, Niels de Vos wrote:
> > > > > > > Well, the NFS-server dynamically gets exports (GlusterFS volumes)
> > > > > > > added when these are started or newly created. There is no hard
> > > > > > > requirement that a specific volume is available for the NFS-server
> > > > > > > to place a shared file with a list of NFS-clients.
> > > > > >
> > > > > > I'm not sure what you mean by "there is no hard requirement ...".
> > > > > >
> > > > > > Surely it's a requirement that an NFS server have available at
> > > > > > startup, at a minimum:
> > > > > >
> > > > > >         - all exported volumes
> > > > > >         - whichever volume contains /var/lib/nfs/statd/, if that's
> > > > > >           on glusterfs.
> > > > > >
> > > > > > otherwise reboot recovery won't work.  (And failover definitely
> > > > > > won't work.)
> > > > >
> > > > > Well, with the current state of things, the GlusterFS NFS-server
> > > > > (gNFS) does not enforce that there are any volumes available to
> > > > > export. These can be added dynamically (similar to calling exportfs
> > > > > for Linux nfsd).
> > > >
> > > > Sure, it's fine to allow that, just as long as we make sure that
> > > > anything already exported is available before server start.
> > > >
> > > > > When an NFS-client tries to mount an export immediately after gNFS
> > > > > has been started, the MOUNT will return ENOENT :-/
> > > >
> > > > It's not new mounts that are the problem, it's preexisting mounts after
> > > > a server reboot:
> > > >
> > > > An application already has a file open.  The server reboots, and as
> > > > soon as it's back up the client sends an operation using that
> > > > filehandle.  If the server fails to recognize the filehandle and
> > > > returns ESTALE, that ESTALE gets returned to the
> > > > application--definitely a bug.
> > >
> > > Yes, I understand that now. Currently the NFS server starts listening,
> > > and volumes to export will be added a little later...
> > >
> > > > So for correct reboot recovery support, any export in use on the
> > > > previous boot has to be back up before the NFS server starts listening
> > > > for rpc's.
> > >
> > > Which is not the case at the moment.
> > >
> > > > (Alternatively the server could look at the filehandle, recognize that
> > > > it's for a volume that hasn't come up yet, and return EJUKEBOX.  I
> > > > don't think gluster does that.)
> > >
> > > I very much doubt that as well. The error is defined in the sources, but
> > > I do not see any usages. IMHO it is easier to make the exports available
> > > in the NFS-server before listening for incoming RPC connections. Trying
> > > to return EJUKEBOX would probably require knowledge of the volumes
> > > anyway.
> >
> > Yes, agreed.
> >
> > > Thanks for these detailed explanations, at least I think I understand
> > > what needs to be done before GlusterFS can offer a truly highly
> > > available NFS-server. (And hopefully I find some time to extend/improve
> > > the behaviour bit by bit.)
> >
> > Note that reboot recovery is something that users generally expect to
> > Just Work on any NFS server, so this is a bug to fix even before making
> > HA work.
> >
> >
> This already works. Every filehandle has a volume-id associated with it,
> and if an NFS request arrives with a filehandle whose volume has not (yet)
> "started", we just drop the request (as though it was lost on the wire).

Oh, OK, good, sorry for the noise in that case.

> The code path which follows the RPCSVC_ACTOR_IGNORE error value does this.
> However, it looks like we can return an EJUKEBOX error to the client
> instead. I did not know about that. How critical is it to change the
> current behavior of dropping the request to returning EJUKEBOX?

EJUKEBOX is the better response, but it's probably not critical.  Note
it was introduced with NFSv3, so you'd still need to drop in the case of
NFSv2 clients.  (But I don't think gluster supports NFSv2?)
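
To make the choice concrete, here is a rough Python sketch of the decision
being discussed. It is not gluster's code: decide_reply, the Action enum, and
the idea of passing in sets of volume ids are all hypothetical names made up
for illustration. Only the status values NFS3ERR_JUKEBOX (10008) and
NFS3ERR_STALE (70) come from the NFSv3 spec (RFC 1813). Returning ESTALE for
a volume id the server has never heard of is an assumption, not something
stated above.

```python
from enum import Enum

# Status codes as defined for NFSv3 in RFC 1813.
NFS3ERR_STALE = 70
NFS3ERR_JUKEBOX = 10008


class Action(Enum):
    """What the server does with an incoming request (hypothetical)."""
    PROCESS = "process"          # handle the request normally
    DROP = "drop"                # send no reply, as if lost on the wire
    REPLY_ERROR = "reply_error"  # send a reply carrying an error status


def decide_reply(volume_id, started_volumes, known_volumes, nfs_version):
    """Return (action, status) for a request whose filehandle names volume_id.

    started_volumes: ids of volumes that are up and exportable right now.
    known_volumes:   ids of all configured volumes, started or not.
    """
    if volume_id in started_volumes:
        return Action.PROCESS, None
    if volume_id not in known_volumes:
        # Assumption: a filehandle for a volume that does not exist at all
        # really is stale.
        return Action.REPLY_ERROR, NFS3ERR_STALE
    if nfs_version >= 3:
        # Volume exists but has not started yet: tell the client to retry.
        return Action.REPLY_ERROR, NFS3ERR_JUKEBOX
    # NFSv2 has no JUKEBOX status, so fall back to dropping the request and
    # letting the client's RPC retransmit handle it.
    return Action.DROP, None
```

With volumes {"vol-a", "vol-b"} configured and only "vol-a" started, an NFSv3
request for "vol-b" would get NFS3ERR_JUKEBOX, while an NFSv2 request for it
would be dropped as it is today.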

--b.



