[Gluster-users] gluster NFS hang observed mounting or umounting at scale

Sun Jan 26 18:04:00 UTC 2020

> One last reply to myself.

One of the test cases my test scripts triggered turned out to actually
be due to my NFS RW mount options.

OLD RW NFS mount options:
"rw,noatime,nocto,actimeo=3600,lookupcache=all,nolock,tcp,vers=3"

NEW options that work better:
"rw,noatime,nolock,tcp,vers=3"

I had copied the RO NFS options we use, which try to be aggressive about
caching. The RO root image doesn't change much and we want it as fast
as possible. The options are not appropriate for RW areas that change
(even though it's a single image file we care about).
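
To make the difference between the two option strings explicit, here is a
minimal sketch that lists what was dropped from the RW mount. The helper
names (split_opts, dropped_opts) are made up for illustration and are not
part of the attached test scripts; the comments on each option reflect my
reading of nfs(5), so treat them as a summary rather than a reference.

```python
# Option strings copied from the message above.
OLD_RW = "rw,noatime,nocto,actimeo=3600,lookupcache=all,nolock,tcp,vers=3"
NEW_RW = "rw,noatime,nolock,tcp,vers=3"


def split_opts(opts):
    """Split a comma-separated mount option string into a list."""
    return [o for o in opts.split(",") if o]


def dropped_opts(old, new):
    """Return the options present in `old` but absent from `new`, in order."""
    new_set = set(split_opts(new))
    return [o for o in split_opts(old) if o not in new_set]


# Options removed for the RW mount (as I understand nfs(5)):
#   nocto           - turns off close-to-open cache coherence checks
#   actimeo=3600    - caches file and directory attributes for up to an hour
#   lookupcache=all - trusts cached positive and negative directory lookups
print(dropped_opts(OLD_RW, NEW_RW))
```

All three dropped options are caching related, which would be consistent
with a client trusting stale cached state on an area that changes.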

So now my test scripts run clean. But since what we see on larger systems
happens right after reboot, the caching shouldn't matter there. In the
real problem case, the RW stuff is done once after reboot.

FWIW I attached my current test scripts; my last batch had some errors.

The search continues for the actual problem, which I'm struggling to
reproduce at 366 NFS clients.

I believe the actual HANGS I posted about yesterday are the real problem
we're tracking. I hit that once in my test scripts - only once.
Otherwise my script was hitting a "file doesn't really exist even though
cached" issue, and that was tricking my scripts.

In any case, I'm changing the RW NFS options we use regardless.

Erik
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nfs-issues.tar.xz
Type: application/x-xz
Size: 20212 bytes
Desc: not available
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20200126/890949c7/attachment.xz>