[Gluster-devel] How does GD_SYNCOP work?

Emmanuel Dreyfus manu at netbsd.org
Sat Sep 13 12:46:34 UTC 2014


Emmanuel Dreyfus <manu at netbsd.org> wrote:

> Here is the problem: once readdir() has reached the end of the
> directory, on Linux, telldir() will report the last entry's offset,
> while on NetBSD, it will report an invalid offset (it is in fact the
> offset of the next entry beyond the last one, which does not exist).

But that difference did not explain why NetBSD was looping. I discovered
why.

Between each index_fill_readdir() invocation, we have a closedir()/opendir()
invocation. Then index_fill_readdir() calls seekdir() with an offset
obtained from telldir() on the previous, now closed, DIR *. Offsets
returned by telldir() are only valid for the lifetime of the DIR * they
were obtained from [1]. This rule makes sense: if the directory content
changed in between, we are likely to get garbage.
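
For reference, here is a minimal sketch of the only usage the standard does
guarantee: telldir() and seekdir() on the same DIR *. This is not the index
translator code; the function names are hypothetical and it assumes the
directory is not modified while it runs.

```c
#define _XOPEN_SOURCE 700   /* telldir()/seekdir() under strict -std=c99 */
#include <dirent.h>
#include <stddef.h>

/* Count the entries left in an open directory stream. */
static long count_remaining(DIR *dir)
{
    long n = 0;

    while (readdir(dir) != NULL)
        n++;
    return n;
}

/*
 * Skip up to `skip` entries, save the position, count what is left,
 * then seek back on the SAME stream and count again.
 * Returns 1 if both counts match, 0 if they differ, -1 if opendir() fails.
 */
int offset_valid_within_stream(const char *path, long skip)
{
    DIR *dir = opendir(path);
    long i, saved, first, second;

    if (dir == NULL)
        return -1;

    for (i = 0; i < skip && readdir(dir) != NULL; i++)
        ;

    saved = telldir(dir);       /* only meaningful for this DIR * */
    first = count_remaining(dir);

    seekdir(dir, saved);        /* fine: same stream as telldir() */
    second = count_remaining(dir);

    /* Using `saved` on a stream from a later opendir() is not covered. */
    closedir(dir);
    return first == second;
}
```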

Now if the directory content did not change and we have read everything,
here is what happens:

On Linux, seekdir() works with the offset obtained from the previous DIR *
(the standard does not require it to), and goes to the last entry. It
exits gracefully, returning EOF.

On NetBSD, seekdir() is given the offset from the previous DIR *, beyond the
last entry. It fails and has no effect. The subsequent readdir_r() then
operates from the beginning of the directory, and we never get EOF. Here is
our infinite loop.
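
The pattern can be sketched as follows. This is a simplified stand-in, not
the actual index_fill_readdir() code: the function name, the chunk size and
the round limit are all made up for illustration, and readdir() is used
instead of readdir_r() for brevity. Whether it reaches EOF or hits the guard
depends on how the platform treats an offset coming from a closed DIR *.

```c
#define _XOPEN_SOURCE 700   /* telldir()/seekdir() under strict -std=c99 */
#include <dirent.h>
#include <stddef.h>

/*
 * Read `path` in rounds of at most `chunk` entries, reopening the
 * directory for each round and seeking to the offset saved from the
 * previous (already closed) stream.
 * Returns the number of entries seen, or -1 if opendir() fails or
 * EOF is still not reached after `max_rounds` rounds.
 */
long walk_reopening(const char *path, long chunk, long max_rounds)
{
    long offset = 0, total = 0, round;
    int have_offset = 0;

    for (round = 0; round < max_rounds; round++) {
        DIR *dir = opendir(path);
        long got = 0;

        if (dir == NULL)
            return -1;

        if (have_offset)
            seekdir(dir, offset);   /* offset from another DIR *: unspecified */

        while (got < chunk && readdir(dir) != NULL)
            got++;

        offset = telldir(dir);      /* stops being valid at closedir() */
        have_offset = 1;
        closedir(dir);

        total += got;
        if (got < chunk)
            return total;           /* short read: readdir() returned NULL */
    }
    return -1;                      /* guard against the endless loop */
}
```

Without the max_rounds guard, a platform that ignores the stale offset would
restart from the first entry on every round and never see a short read.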

The correct fix is either:

1) to keep the directory open between index_fill_readdir()
invocations, but since that means preserving an open directory across
different syncops, I am not sure it is a good idea; or

2) to not reuse the offset from the last attempt. That means if the buffer
gets filled, we make it bigger and retry until the data fits. This is bad
performance-wise, but it seems to me to be the only safe way.
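
A rough sketch of what option 2 could look like, assuming a flat buffer of
NUL-terminated names (the real code fills a different structure; the helper
names and the initial size of 64 bytes are hypothetical). Each attempt uses
a fresh DIR * and no offset is carried over between attempts:

```c
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy every entry name of `path` into buf, one DIR * for the whole pass.
 * Returns the bytes used, -1 if buf is too small, -2 if opendir() fails.
 */
static long fill_names(const char *path, char *buf, size_t cap)
{
    DIR *dir = opendir(path);
    struct dirent *entry;
    size_t used = 0;

    if (dir == NULL)
        return -2;

    while ((entry = readdir(dir)) != NULL) {
        size_t len = strlen(entry->d_name) + 1;

        if (len > cap - used) {     /* does not fit: caller must retry */
            closedir(dir);
            return -1;
        }
        memcpy(buf + used, entry->d_name, len);
        used += len;
    }
    closedir(dir);
    return (long)used;
}

/*
 * Retry from scratch with a doubled buffer until everything fits.
 * Returns a malloc()ed buffer (caller frees) and sets *used_out,
 * or NULL on failure.
 */
char *read_dir_retrying(const char *path, size_t *used_out)
{
    size_t cap = 64;
    char *buf = NULL;

    for (;;) {
        char *bigger = realloc(buf, cap);
        long used;

        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;

        used = fill_names(path, buf, cap);
        if (used >= 0) {
            *used_out = (size_t)used;
            return buf;
        }
        if (used == -2) {
            free(buf);
            return NULL;
        }
        cap *= 2;                   /* too small: start over, no offset kept */
    }
}
```

The cost is visible here: a large directory is re-read from the start on
every retry, which is the performance concern mentioned above.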

Opinions?


[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/seekdir.html

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu at netbsd.org

