[Gluster-devel] What functionality is expected from persistent NFS-client tracking?
J. Bruce Fields
bfields at fieldses.org
Wed Jan 30 20:09:38 UTC 2013
On Wed, Jan 30, 2013 at 02:31:09PM -0500, bfields wrote:
> On Fri, Jan 25, 2013 at 03:23:28PM +0100, Niels de Vos wrote:
> > Hi all,
> >
> > the last few days I have been looking into making the tracking of
> > NFS-clients more persistent. As it is today, the NFS-clients are kept in
> > a list in memory on the NFS-server. When the NFS-server restarts, the
> > list is recreated from scratch and does not contain the NFS-clients that
> > still have the export mounted (Bug 904065).
> >
> > NFSv3 depends on the MOUNT protocol. When an NFS-client mounts an
> > export, the MOUNT protocol is used to get the initial file-handle. With
> > this handle, the NFS-service can be contacted. The actual services
> > providing the MOUNT and NFSv3 protocol can be separate (Linux kernel
> > NFSd) or implemented closely together (Gluster NFS-server).
> >
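(As an aside, a minimal sketch of why the two services are separable:
MOUNT and NFS are distinct RPC programs, 100005 and 100003 respectively,
so a client addresses them independently. The libtirpc snippet below
only pings the MOUNT service's NULL procedure; "server" is a placeholder
host, and this is an illustration rather than anything Gluster ships.)

    #include <stdio.h>
    #include <sys/time.h>
    #include <rpc/rpc.h>

    /* RPC program numbers from RFC 1813: MOUNT (100005) is a separate
     * service from NFS (100003), even when one daemon implements both. */
    #define MOUNTPROG  100005
    #define MOUNTVERS3 3

    int main(void)
    {
        struct timeval tv = { 5, 0 };
        /* "server" is a placeholder; this contacts only the MOUNT
         * service, independently of the NFS service proper. */
        CLIENT *clnt = clnt_create("server", MOUNTPROG, MOUNTVERS3, "tcp");

        if (!clnt) {
            clnt_pcreateerror("mount service");
            return 1;
        }
        if (clnt_call(clnt, NULLPROC, (xdrproc_t)xdr_void, NULL,
                      (xdrproc_t)xdr_void, NULL, tv) != RPC_SUCCESS)
            clnt_perror(clnt, "MOUNT NULL");
        clnt_destroy(clnt);
        return 0;
    }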
> > Now, when the Linux kernel NFS-server is used, the NFS-clients are saved
> > by the rpc.mountd process (which handles the MOUNT protocol) in
> > /var/lib/nfs/rmtab. This file is modified on mounting and unmounting.
> > Implementing a persistent cache similar to this is straightforward
> > and is available for testing and review in [1].
> >
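(For illustration, a minimal sketch of such an rmtab-style cache; the
file location and function names are hypothetical, not the actual patch
in [1]. On MNT a record is appended, on UMNT the file is rewritten
without that record, and at startup the file would be read back line by
line to repopulate the in-memory client list.)

    #include <stdio.h>
    #include <string.h>

    #define CACHE_FILE "/var/lib/glusterd/nfs/rmtab"  /* hypothetical path */

    /* Append one "client:export" record on MOUNTPROC3_MNT. */
    static int cache_add(const char *client, const char *export)
    {
        FILE *fp = fopen(CACHE_FILE, "a");

        if (!fp)
            return -1;
        fprintf(fp, "%s:%s\n", client, export);
        return fclose(fp);
    }

    /* Rewrite the cache without the given record on MOUNTPROC3_UMNT. */
    static int cache_del(const char *client, const char *export)
    {
        char line[1024], match[1024];
        FILE *in = fopen(CACHE_FILE, "r");
        FILE *out = fopen(CACHE_FILE ".tmp", "w");

        if (!in || !out) {
            if (in)
                fclose(in);
            if (out)
                fclose(out);
            return -1;
        }
        snprintf(match, sizeof(match), "%s:%s\n", client, export);
        while (fgets(line, sizeof(line), in))
            if (strcmp(line, match) != 0)
                fputs(line, out);
        fclose(in);
        fclose(out);
        /* rename() so a crash never leaves a half-written cache file */
        return rename(CACHE_FILE ".tmp", CACHE_FILE);
    }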
> > There are however some use-cases that may require different
> > handling. When an NFS-client starts to mount an export, the MOUNT
> > protocol is handled on a specific server. After getting the initial
> > file-handle for the export, any Gluster NFS-server can be used to talk
> > NFSv3 and do I/O. When the NFS-clients are kept only on the NFS-server
> > that handled the initial MOUNT request, and fail-over (think CTDB
> > and similar here) moves clients to another NFS-server, the persistent
> > cache of 'connected' NFS-clients becomes inaccurate.
> >
> > The easiest way I can think of to remedy this issue is to place the
> > persistent NFS-client cache on a GlusterFS volume. When CTDB is used,
> > the locking-file is placed on shared storage as well, so the same
This is the statd data? That's the more important thing to get right.
> > volume can be used for the NFS-client cache. Providing an option to set
> > the volume/path of the NFS-client cache would be needed for this.
> > I guess that this could result in a chicken-and-egg situation
> > (NFS-server is started, but no volume mounted yet)?
I don't think there should be any problem here: the exported filesystems
need to be available before the server starts anyway. (Otherwise the
only response the server could give to operations on filehandles would
be ESTALE.)
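To make that configurable, something like the sketch below could pick
the cache location at startup (the nfs.rmtab-path option name and the
NFS_RMTAB_PATH lookup are purely hypothetical stand-ins for however the
option would actually be wired up):

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define DEFAULT_RMTAB "/var/lib/glusterd/nfs/rmtab"  /* hypothetical */

    /* Pick the NFS-client cache location at server startup.  The
     * environment variable stands in for a hypothetical volume option
     * such as nfs.rmtab-path; the shared path would live on a GlusterFS
     * volume which, as noted above, must be available before the server
     * starts anyway. */
    static const char *rmtab_path(void)
    {
        const char *path = getenv("NFS_RMTAB_PATH");

        if (path && access(path, R_OK | W_OK) == 0)
            return path;          /* shared cache on the GlusterFS volume */
        return DEFAULT_RMTAB;     /* local fallback */
    }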
--b.
> >
> > Any ideas or recommendations are welcome. The patch in [1] is not final
> > yet and I'd like some feedback before I proceed any further.
>
> My only comment is that this doesn't need to be perfect or even all that
> good.
>
> The list can already get out of sync in other ways: clients can just
> fail to unmount, for example.
>
> I don't think it's used by anything other than showmount.