[Gluster-users] incomplete listing of a directory, sometimes getdents loops until out of memory

John Brunelle john_brunelle at harvard.edu
Fri Jun 14 18:36:01 UTC 2013

Ah, I did not know that about 0x7fffffff.  Is it of note that the
clients do *not* get this?

This is on an NFS mount, and the volume has nfs.enable-ino32 On.  (I
should've pointed that out again when Jeff mentioned FUSE.)

Side note -- we do have a couple FUSE mounts, too, and I had not seen
this issue on any of them before, but when I checked now, zero
subdirectories were listed on some.  Since I had only seen this on NFS
clients after setting cluster.readdir-optimize On, I have now set that
back Off.  FUSE mounts are now behaving fine again.



On Fri, Jun 14, 2013 at 2:17 PM, Anand Avati <anand.avati at gmail.com> wrote:
> Are the ls commands (which list partially, or loop and die of ENOMEM
> eventually) executed on an NFS mount or FUSE mount? Or does it happen on
> both?
> Avati
> On Fri, Jun 14, 2013 at 11:14 AM, Anand Avati <anand.avati at gmail.com> wrote:
>> On Fri, Jun 14, 2013 at 10:04 AM, John Brunelle
>> <john_brunelle at harvard.edu> wrote:
>>> Thanks, Jeff!  I ran readdir.c on all 23 bricks on the gluster nfs
>>> server to which my test clients are connected (one client that's
>>> working, and one that's not; and I ran on those, too).  The results
>>> are attached.
>>> The values it prints are all well within 32 bits, *except* for one
>>> that's suspiciously the max 32-bit signed int:
>>> $ cat readdir.out.* | awk '{print $1}' | sort | uniq | tail
>>> 0x000000000000fd59
>>> 0x000000000000fd6b
>>> 0x000000000000fd7d
>>> 0x000000000000fd8f
>>> 0x000000000000fda1
>>> 0x000000000000fdb3
>>> 0x000000000000fdc5
>>> 0x000000000000fdd7
>>> 0x000000000000fde8
>>> 0x000000007fffffff
>>> That outlier is the same subdirectory on all 23 bricks.  Could this be
>>> the issue?
>>> Thanks,
>>> John
>> 0x7fffffff is the EOF marker. You should find that as the last entry in
>> _every_ directory.
>> Avati
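
For anyone repeating the check above, here is a minimal sketch of the same
test done in Python instead of awk. It assumes, as in the awk pipeline, that
the first whitespace-separated column of each readdir.out line is the hex
offset; the rest of the line format is not shown in this thread, so it is
ignored. The names EOF_MARKER and classify_offsets are made up for
illustration and are not part of Gluster or readdir.c.

```python
# Value identified in this thread as the end-of-directory offset
# (the maximum 32-bit signed integer).
EOF_MARKER = 0x7FFFFFFF


def classify_offsets(lines):
    """Sort the first-column hex offsets into three buckets.

    Returns (normal, eof, too_wide):
      normal   - offsets below the EOF marker
      eof      - offsets equal to the EOF marker
      too_wide - offsets above the EOF marker, i.e. not representable
                 as a non-negative 32-bit signed value
    Assumption: column one is a hex number such as 0x000000000000fd59.
    Blank lines and lines whose first column is not hex are skipped.
    """
    normal, eof, too_wide = [], [], []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        try:
            value = int(fields[0], 16)  # accepts an optional 0x prefix
        except ValueError:
            continue
        if value == EOF_MARKER:
            eof.append(value)
        elif value > EOF_MARKER:
            too_wide.append(value)
        else:
            normal.append(value)
    return normal, eof, too_wide


if __name__ == "__main__":
    import sys

    # Usage (file names as in the thread): cat readdir.out.* | python check.py
    normal, eof, too_wide = classify_offsets(sys.stdin)
    print("normal: %d, eof markers: %d, wider than 31 bits: %d"
          % (len(normal), len(eof), len(too_wide)))
```

If the EOF explanation holds, the expected picture is one EOF marker per
directory listing and nothing in the last bucket when nfs.enable-ino32 style
32-bit values are in play.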

