[Gluster-devel] Readdir d_off encoding

Jeff Darcy jdarcy at redhat.com
Mon Dec 22 19:06:42 UTC 2014


> On Mon, Dec 22, 2014 at 09:30:29AM -0500, Jeff Darcy wrote:
> > By contrast, the failure mode for the map-caching approach - a simple
> > failure in readdir - is relatively benign.  Such failures are also
> > likely to be less common, even if we adopt the *unprecedented*
> > requirement that the cache be strictly space-limited.  If we relax that
> > requirement, the problem goes away entirely.
> 
> I'm not sure what you mean by "strictly space-limited", but please note
> that some sort of cache-eviction policy will be required to keep the
> cache from growing without bound.  (NFS unfortunately allows a client to
> present the server any cookie it has ever seen; they live forever.)

My point was that we already have plenty of caches and other things that
can grow without bound in GlusterFS.  In principle we should fix that some
day.  Meanwhile, it seems a little odd to single out this particular cache
(which is likely to be small) for special treatment.

Fortunately, we have several options here:

(1) A truly fixed-size cache, with a possibility of premature eviction
if the cache is sized too small for the number of concurrent readdirs.

(2) A time-limited cache, with a possibility of premature eviction if a
client "goes to sleep" for too long and then tries to continue.

(3) Cache entries associated with particular fds, so that they go away
when the fd does (except for NFS).

(4) Any combination of the above.
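
To make option (4) concrete, here is a minimal sketch of a mapping cache that combines all three policies. It is not GlusterFS code: the class name DirOffsetCache, its methods, and the default limits are hypothetical, and it assumes each cached entry maps an opaque cookie handed to the client onto a (subvolume, backend d_off) pair. A miss in lookup() corresponds to the "simple failure in readdir" case mentioned above.

```python
import time
from collections import OrderedDict


class DirOffsetCache:
    """Hypothetical cookie -> (subvol, d_off) map, bounded in size and age."""

    def __init__(self, max_entries=1024, max_age=600.0, clock=time.monotonic):
        self.max_entries = max_entries   # option (1): fixed size
        self.max_age = max_age           # option (2): time limit, in seconds
        self.clock = clock               # injectable so tests can fake time
        self.entries = OrderedDict()     # cookie -> (fd, subvol, d_off, stamp)
        self.next_cookie = 1

    def insert(self, fd, subvol, d_off):
        """Remember a backend position and return the cookie for the client.

        Pass fd=None for NFS, where no fd lifetime is available.
        """
        cookie = self.next_cookie
        self.next_cookie += 1
        self.entries[cookie] = (fd, subvol, d_off, self.clock())
        # Option (1): evict least recently used entries beyond the limit.
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)
        return cookie

    def lookup(self, cookie):
        """Return (subvol, d_off), or None if the entry was evicted."""
        entry = self.entries.get(cookie)
        if entry is None:
            return None
        fd, subvol, d_off, stamp = entry
        # Option (2): a client that slept too long loses its position.
        if self.clock() - stamp > self.max_age:
            del self.entries[cookie]
            return None
        # Refresh the entry so active readdir loops are evicted last.
        self.entries[cookie] = (fd, subvol, d_off, self.clock())
        self.entries.move_to_end(cookie)
        return subvol, d_off

    def release_fd(self, fd):
        """Option (3): drop every entry tied to an fd that is going away."""
        if fd is None:
            return  # NFS entries have no fd and must age out instead
        stale = [c for c, e in self.entries.items() if e[0] == fd]
        for cookie in stale:
            del self.entries[cookie]
```

A caller would treat a None result from lookup() as a failed readdir continuation and return an error, rather than guessing at a position.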

None of these *completely* solves the problem.  For that we need
something besides a client-side mapping cache.  However, while I have
seen users do a lot of very strange things, running thousands of
concurrent readdir loops on a single machine wasn't one of them.  We can
handle that with a very small cache, one per GlusterFS client or NFS
server.  For a few dozen KB per node at most, we'd be far ahead of where
we are now with respect to number or severity of user complaints about
readdir problems.

