[Gluster-users] Passing noforget option to glusterfs native client mounts

Anand Avati avati at gluster.org
Tue Dec 24 08:28:11 UTC 2013


Hi,
Allowing the noforget option to FUSE will not help your cause. Gluster
presents the address of the inode_t as the nodeid to FUSE. In turn, FUSE
builds the file handle that knfsd exports to the NFS client from this
nodeid. When knfsd fails over to another server, FUSE will decode a handle
that was encoded by the other NFS server and try to use the other server's
nodeid - which obviously cannot work, since a virtual address from the
glusterfs process on the other server is not valid here.
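
Roughly speaking - and this is only an illustrative sketch, not the actual
fuse-bridge source - the mapping is nothing more than a pointer cast:

    #include <stdint.h>

    typedef struct inode inode_t;   /* opaque in-memory gluster inode */

    /* Illustrative only: the nodeid handed to the kernel is derived from
     * the address of the inode_t, so it is meaningful only inside this
     * one glusterfs process, for this one run. */
    static uint64_t inode_to_nodeid(inode_t *inode)
    {
            return (uint64_t)(uintptr_t)inode;
    }

Any file handle built on top of such a nodeid therefore dies with the
process (and the address space) that produced it.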

Short version: the file handle generated through FUSE is not durable. The
"noforget" option in FUSE is a hack to avoid ESTALE errors caused by dcache
pruning: the filesystem simply never releases an inode it has looked up. If
you have enough inodes in your volume, your system will go OOM at some
point. "noforget" is NOT a solution for providing NFS failover to a
different server.
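
For anyone wondering what "noforget" actually suppresses: in a
libfuse-based filesystem the FORGET path looks roughly like this (types
follow the libfuse 3 lowlevel API; example_inode_unref is a made-up
helper), and the noforget hack amounts to never doing the release below:

    #define FUSE_USE_VERSION 34
    #include <fuse_lowlevel.h>

    /* Made-up helper for this sketch: drop 'nlookup' references on the
     * in-memory inode and free it once the count hits zero. */
    static void example_inode_unref(fuse_ino_t ino, uint64_t nlookup)
    {
            (void)ino;
            (void)nlookup;
    }

    /* The kernel sends FORGET when it prunes its dentry/inode caches;
     * the filesystem may then reclaim its in-memory inode, after which
     * the old nodeid can no longer be resolved. Skipping this release
     * keeps nodeids valid forever, at the cost of ever-growing memory. */
    static void example_forget(fuse_req_t req, fuse_ino_t ino,
                               uint64_t nlookup)
    {
            example_inode_unref(ino, nlookup);
            fuse_reply_none(req);
    }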

For reasons such as these, we ended up implementing our own NFS server,
which encodes the file handle using the GFID (which is durable across
reboots and server failovers). I would strongly recommend NOT using knfsd
with any FUSE-based filesystem (not just glusterfs) for serious production
use, and it will simply not work if you are designing for NFS high
availability/fail-over.
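
For comparison, a durable handle only needs to carry identifiers that
survive a process restart. A hypothetical layout (not the actual gluster
NFS handle format) would be something like:

    #include <stdint.h>
    #include <string.h>

    /* Hypothetical layout: both fields are on-disk identifiers, so any
     * server in the cluster can decode the handle after a fail-over,
     * with no reference to process-local memory. */
    struct durable_fh {
            uint8_t volume_uuid[16];   /* which exported volume */
            uint8_t gfid[16];          /* which file; stable across reboots */
    };

    static void encode_fh(struct durable_fh *fh,
                          const uint8_t volume_uuid[16],
                          const uint8_t gfid[16])
    {
            memcpy(fh->volume_uuid, volume_uuid, 16);
            memcpy(fh->gfid, gfid, 16);
    }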

Thanks,
Avati


On Sat, Dec 21, 2013 at 8:52 PM, Anirban Ghoshal <
chalcogen_eg_oxygen at yahoo.com> wrote:

> If somebody has an idea on how this could be done, could you please help
> out? I am still stuck on this, apparently...
>
> Thanks,
> Anirban
>
>
>   On Thursday, 19 December 2013 1:40 AM, Chalcogen <
> chalcogen_eg_oxygen at yahoo.com> wrote:
>   P.s. I think I need to clarify this:
>
> I am only reading from the mounts, and not modifying anything on the
> server, so the most common causes of stale file handles do not apply.
>
> Anirban
>
> On Thursday 19 December 2013 01:16 AM, Chalcogen wrote:
>
> Hi everybody,
>
> A few months back I joined a project where people want to replace their
> legacy fuse-based (twin-server) replicated file-system with GlusterFS.
> They also have high-availability NFS server code built around the kernel
> NFSD (the nfs-kernel-server, I mean) that they wish to retain. The reason
> they prefer the kernel NFS server over the NFS server that comes with
> GlusterFS is mainly a bit of code that migrates NFS IPs from one host
> server to the other when one happens to go down, plus tweaks to the
> export server configuration that keep the file handles identical on the
> new host server.
>
> The solution was to mount gluster volumes using the mount.glusterfs native
> client program and then export the directories over the kernel NFS server.
> This seems to work most of the time, but on rare occasions 'stale file
> handle' is reported by certain clients, which really puts a damper on the
> 'high-availability' story. After suitably instrumenting the nfsd/fuse code
> in the kernel, it seems that decoding of the file handle fails on the
> server because the inode record corresponding to the nodeid in the handle
> cannot be looked up. Combine this with the fact that a second attempt by
> the client to look up the same file succeeds, and one might suspect that
> the problem is identical to the one many people exporting fuse mounts over
> the kernel's NFS server are facing; viz., fuse 'forgets' the inode
> records, thereby causing ilookup5() to fail. Miklos and other fuse
> developers/hackers would point towards '-o noforget' when mounting their
> fuse file-systems.
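>
> To make the failure mode concrete, the decode path on the server looks
> conceptually like this (a sketch, not the real fs/fuse or nfsd source;
> nodeid_hash() and inode_matches_nodeid() are made-up helpers):
>
>     #include <linux/fs.h>
>     #include <linux/dcache.h>
>     #include <linux/err.h>
>     #include <linux/errno.h>
>
>     /* Made-up helpers, declared only to complete the sketch. */
>     static unsigned long nodeid_hash(u64 nodeid);
>     static int inode_matches_nodeid(struct inode *inode, void *data);
>
>     /* The NFS file handle carries the FUSE nodeid; the server can only
>      * turn it back into a dentry if that inode is still in its cache. */
>     static struct dentry *example_fh_to_dentry(struct super_block *sb,
>                                                u64 nodeid)
>     {
>             struct inode *inode = ilookup5(sb, nodeid_hash(nodeid),
>                                            inode_matches_nodeid, &nodeid);
>             if (!inode)
>                     return ERR_PTR(-ESTALE);   /* -> stale file handle */
>             return d_obtain_alias(inode);
>     }
>
> Once fuse has processed a FORGET for that nodeid, ilookup5() returns NULL
> and the client sees a stale file handle, even though the file itself is
> perfectly fine.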
>
> I tried passing '-o noforget' to mount.glusterfs, but it does not seem to
> recognize it. Could somebody help me out with the correct syntax to pass
> noforget to gluster volumes? Or is there something we could pass to
> glusterfs that would instruct fuse to keep a bigger cache of our inodes?
>
> Additionally, should you think that something else might be behind our
> problems, please do let me know.
>
> Here's my configuration:
>
> Linux kernel version: 2.6.34.12
> GlusterFS version: 3.4.0
> nfs.disable option for volumes: OFF on all volumes
>
> Thanks a lot for your time!
> Anirban
>
> P.s. I found quite a few pages on the web that admonish users that
> GlusterFS is not compatible with the kernel NFS server, but do not really
> give much detail. Is this one of the reasons for saying so?
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>