[Gluster-users] Gluster (3.6.3) NFS READDIR failing intermittently from Finder on Mac OS X (10.10 and 10.11)

Niels de Vos ndevos at redhat.com
Mon Mar 14 04:00:34 UTC 2016


On Mon, Mar 14, 2016 at 01:54:13PM +1100, Brett Randall wrote:
> Just an FYI for all, we found that by adding "rdirplus" on our Macs as an
> option on the NFS mount, the problem went away. Hope this is helpful to
> someone in the future! Should really be a "Mac" wiki page on Gluster to
> summarise how best to get macs to work with Gluster :)
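
For anyone finding this thread later, here is a minimal sketch of the mount with that option added. It assumes the same server address, export and mount point as in the original report quoted below, and it only prints the command so it can be reviewed before being run as root:

```shell
# Values taken from the original report in this thread; adjust for your site.
SERVER=10.0.19.31
EXPORT=/glusvol
MOUNTPOINT=./glusvol

# Original options plus "rdirplus", which asks the Mac OS X NFSv3 client
# to use READDIRPLUS when reading directories.
OPTS="rw,intr,nolock,tcp,rdirplus"

# Build and print the command instead of running it.
MOUNT_CMD="mount -t nfs -o ${OPTS} ${SERVER}:${EXPORT} ${MOUNTPOINT}"
echo "${MOUNT_CMD}"
```

Unmount the export first if it is already mounted without the option.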

Thanks for the update! Could you file a bug with a .pcap file
nevertheless? Maybe we can figure out what is wrong and improve it in
the future.
  https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS&component=nfs

The docs are on http://gluster.readthedocs.org/en/latest/ ; the 'wiki'
is old and deprecated. I'm not sure what the best chapter/page would be
for client-specific configurations. Maybe the Administrator Guide is
suitable:
  http://gluster.readthedocs.org/en/latest/Administrator%20Guide/

The documentation is maintained on
https://github.com/gluster/glusterdocs/ . Anyone should be able to fork
the repository, make changes and send a pull request.

Cheers,
Niels


> 
> Brett.
> 
> > -----Original Message-----
> > From: Niels de Vos [mailto:ndevos at redhat.com]
> > Sent: Thursday, 10 March 2016 7:55 PM
> > To: Brett Randall <brett.randall at gmail.com>
> > Cc: gluster-users at gluster.org
> > Subject: Re: [Gluster-users] Gluster (3.6.3) NFS READDIR failing intermittently
> > from Finder on Mac OS X (10.10 and 10.11)
> > 
> > On Thu, Mar 10, 2016 at 06:18:44PM +1100, Brett Randall wrote:
> > > Hi all
> > >
> > > I have a problem which is doing my head in.
> > >
> > > We are running Gluster 3.6.3 with the in-built NFS server, across 8
> > > servers. We share our volume out with SMB, AFP and Gluster's NFS
> > > server.
> > >
> > > In most cases, NFS works fine. Everything is visible and accessible
> > > from the terminal. But from Finder on our Macs, we are having a
> > > consistent problem.
> > >
> > > Firstly, we are mounting the share from the command line:
> > >
> > > $ mount -t nfs -o rw,intr,nolock,tcp 10.0.19.31:/glusvol ./glusvol
> > >
> > > We then open Finder and traverse to the folder in question (about 7
> > > levels deep). I see about 20-30 items, but I know there are 100+
> > > items in there. This is the case on multiple folders. If I open a
> > > terminal, go to that folder, and create a new empty file, the folder
> > > refreshes in Finder and I can see everything. However, dismount and
> > > remount and everything is gone again (although sometimes it displays
> > > all files for a few seconds before most of them disappear). I've
> > > repeated this on three different Macs of varying origin and OS
> > > version.
> > >
> > > I've started Wireshark on my Mac and monitored what is happening. It
> > > appears that there is an initial NFS READDIR Call to the NFS server
> > > with cookie set to 0. The READDIR Reply contains the filename of
> > > every file in the folder. Then there is another READDIR call with
> > > cookie set to 4096, which happens to be the last cookie listed in the
> > > previous reply. Curiously, the reply to this call lists all the files
> > > that I *cannot* see in Finder. But doesn't include the ones I can
> > > see. Then there are a whole lot of LOOKUP Calls while it looks at all
> > > the files that I *can* see. Then it stops at the 24th file, the last
> > > file I can see in Finder. It then issues another READDIR Call with a
> > > Cookie of 680. The Reply is "NFS3ERR_BAD_COOKIE". Looking through the
> > > previous replies, the only time that cookie was issued was in the
> > > FIRST reply. And again, the file in question with that cookie number
> > > is the LAST file that I can see in Finder.
> > >
> > > Surely, Finder cannot be THIS broken? I can see all files in that
> > > folder fine when I mount via AFP or SMB but not via NFS. But it all
> > > works fine from Terminal. We're experimenting with updating Gluster to
> > > 3.7.8 and moving to NFS Ganesha in the hope that moving to NFSv4 fixes
> > > it, but does anyone have any idea what's happening? I'm happy to send
> > > the .pcapng file to someone if it's helpful. I also have a .pcapng of
> > > when we create a file in the folder and Finder refreshes to show
> > > everything in there. The only interesting thing that I noticed in that
> > > file is that the cookie number at the end of the READDIR is much
> > > larger than anything I was seeing in the failed listings
> > > (17179869176). I tried forcing 32-bit inode sizes in Gluster NFS
> > > options (the closest thing I could find to NFS's native 32-bit cookie
> > > size restriction) with no joy, just in case that was part of it, which
> > > wouldn't make sense but tried anyway and no difference.
> > 
> > It is possible that Finder does not follow the NFSv3 specification
> > correctly. I have seen that some other OS's expect the cookie or inode
> > to be 32-bit. This is the case for most filesystems, but Gluster uses
> > 64-bit values. A subsequent READDIR(P) would use a partial cookie for
> > continuation, and that can result in very strange behaviour.
> > 
> > Only exposing 32-bit inodes over Gluster/NFS might be the solution for
> > you. You can enable this with
> > 
> >     # gluster volume set ${VOLUME} nfs.enable-ino32 on
> > 
> > Unmount and re-mount the NFS-export after changing this option.
> > 
> > It is possible that the NFSv4 client on Mac OS X handles things
> > better, but it could have the same issues too.
> > 
> > HTH,
> > Niels
> 
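
The "partial cookie" effect described in the quoted reply above can be illustrated with plain shell arithmetic. This is only a sketch: it takes the large cookie value from the report (17179869176) and assumes a hypothetical client that keeps just the low 32 bits of the 64-bit NFSv3 cookie. Nothing in this thread confirms that Finder actually does this.

```shell
# Cookie value reported in the successful listing (larger than 32 bits).
cookie=17179869176

# What a hypothetical 32-bit-only client would keep: the low 32 bits.
low32=$(( cookie & 0xFFFFFFFF ))

echo "cookie=${cookie} low32=${low32}"
# prints: cookie=17179869176 low32=4294967288
```

A server that receives the truncated value in a later READDIR call has never issued that cookie, so a bad-cookie error would be a plausible response.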