[Gluster-users] Ganesha+Gluster strange issue: incomplete directory reads

Ivan Rossi rouge2507 at gmail.com
Thu Aug 26 17:00:27 UTC 2021


Hello list,

Ganesha (but not Gluster) newbie here.
This is the first time I have had to set up Ganesha to serve a Gluster volume,
but it seems I have stumbled on a weird issue. I hope it is due to my
inexperience with Ganesha.

I need to serve a Gluster volume to some old production VMs that cannot
install a recent Gluster client, so they need to access the Gluster data
using the standard Linux NFS client. Since the native Gluster NFS server is
gone, I had to go the Ganesha route.

The volume served is used for bioinformatics analysis, each subdirectory
containing on the order of a thousand files, a few of them large (think
50 GB each).

Now the issue:

When the volume is mounted on the client (using *NFSv3*), directory reads
SOMETIMES return an INCOMPLETE list of files. The problem goes away if you
redo the read in a different way, as if the first directory metadata read did
not complete successfully but was cached anyway.

The problem does not manifest if there are few files in the directory or if
they are all small (think < 1 GB).

Direct access to the files is OK even if they did not show up in the ls.
E.g.:

mount -t nfs ganesha:/srv/glfs/work /mnt/
# wrong result
ls /mnt/47194616IMS187mm10 | wc -l
ls: reading directory /mnt/47194616IMS187mm10: Input/output error
304

# right (NB: ls -l returns one line more than plain ls)
ls -l /mnt/47194616IMS187mm10 | wc -l
668

# after 'ls -l', even plain ls now returns the expected number of files
ls /mnt/47194616IMS187mm10 | wc -l
667

Furthermore, I see the Input/output error message only because of the pipe to
wc; if I just run plain ls in a terminal, it fails silently, returning a
partial list.
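
Since the exit status of an `ls DIR | wc -l` pipeline is that of wc, a script
cannot tell from it whether ls itself hit a read error. A minimal sketch of a
check that keeps ls's own exit status is below. The helper name count_entries
is hypothetical, and it assumes that ls exits non-zero when the directory read
fails and that no file name contains a newline.

```shell
#!/bin/sh
# count_entries DIR  (hypothetical helper, not part of any package)
# Prints how many entries ls returned for DIR and returns ls's own exit
# status, which a plain `ls DIR | wc -l` pipeline would hide.
# Assumption: file names contain no newlines.
count_entries() {
    if listing=$(ls -- "$1"); then
        status=0
    else
        status=$?          # exit status of the failed ls
    fi
    if [ -z "$listing" ]; then
        echo 0
    else
        printf '%s\n' "$listing" | wc -l | tr -d ' '
    fi
    return "$status"
}
```

For example, `count_entries /mnt/47194616IMS187mm10 || echo "listing may be
incomplete"` prints the count and then the warning whenever ls reported an
error, so a partial list of 304 entries would not pass unnoticed.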

If the client mounts the volume using *NFSv4*, everything looks as expected:

mount -t nfs -o vers=4.0 ganesha:/work /mnt/
ls /mnt/47194616IMS187mm10 | wc -l
667

As you can guess, though, my confidence in using Ganesha in production is
somewhat shaken ATM.
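
Because the comparison above hinges on one mount being NFSv3 and the other
NFSv4, and the first mount command does not set the version explicitly, it may
be worth confirming what each mount actually negotiated. A small sketch,
assuming a Linux client where /proc/mounts lists the mount options including
vers=; the helper name nfs_version_of is hypothetical.

```shell
#!/bin/sh
# nfs_version_of MOUNTPOINT [MOUNTS_FILE]  (hypothetical helper)
# Prints the vers= option recorded for an NFS mount. Reads /proc/mounts
# unless another file in the same format is given.
nfs_version_of() {
    awk -v mp="$1" '
        # field 2 is the mount point, field 3 the fs type, field 4 the options
        $2 == mp && $3 ~ /^nfs/ {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] ~ /^vers=/) {
                    sub(/^vers=/, "", opts[i])
                    print opts[i]
                }
        }
    ' "${2:-/proc/mounts}"
}
```

Called as `nfs_version_of /mnt` (mount point without the trailing slash, as it
appears in /proc/mounts), it prints the version string recorded for that
mount, or nothing if /mnt is not an NFS mount.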

My feeling is that it is a Ganesha problem, or something lacking in the
Ganesha configuration for Gluster. My Ganesha configuration is basically just
defaults. No failover conf either.

My Gluster setup has nothing strange: I am just serving an R3 volume, and the
defaults are just fine to get a fast volume given the hardware. Furthermore,
the volume looks fine from the Gluster clients.

I am using Gluster 8.4 and Ganesha 3.4 on Debian 10 (buster). The packages
come from the Gluster and Ganesha repos, not the Debian ones.

Has anyone seen anything similar before?
Did I stumble on a bug?
Any advice or common wisdom to share?

Ivan Rossi