[Gluster-devel] Inconsistent behavior due to lack of lookup on entry followed by readdirp

Wed Aug 12 13:59:48 UTC 2015

Hi All,

We are seeing inconsistent behavior in fops such as rename and unlink
because an entry may never receive a lookup after a readdirp. More
specifically, when inodes/gfids are populated by a readdirp call and the
nodeid is shared with the kernel, md-cache caches the entry by its
basename. A subsequent named lookup is then served from md-cache and
unwinds immediately, so a fop can be triggered on an entry that has
never had a lookup.

DHT does a lot of work during lookup, such as creating link files and
populating inode_ctx, so at least one lookup must happen on every entry.
Because readdirp prevents that lookup, it is very hard for fops to
proceed correctly. We also suspect similar problems with afr/ec
self-healing.

If we remove readdirp from md-cache ([1], [2]), the first lookup on
every entry costs an additional hop. I'm mostly concerned about this
extra network call and the performance degradation it causes.
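
To make the sequence concrete, here is a toy Python model of the
problem. It is only a sketch: it is not GlusterFS code, and every class
and method name in it (ToyDht, ToyMdCache, run) is hypothetical. It
assumes a lower layer that sets up per-entry state only on lookup, and a
cache above it that may or may not be filled from readdirp results.

```python
# Toy model only. All names are hypothetical; none of this is GlusterFS code.

class ToyDht:
    """Stands in for a layer that sets up per-entry state during lookup."""

    def __init__(self):
        self.inode_ctx = set()      # entries that have had a real lookup
        self.lookups_received = 0

    def lookup(self, name):
        self.lookups_received += 1
        self.inode_ctx.add(name)    # models link-file / inode_ctx setup
        return {"name": name}

    def readdirp(self, names):
        # Returns entries with attributes but does no per-entry setup.
        return [{"name": n} for n in names]

    def rename(self, old, new):
        if old not in self.inode_ctx:
            raise RuntimeError(f"rename on {old!r} without a prior lookup")
        self.inode_ctx.discard(old)
        self.inode_ctx.add(new)


class ToyMdCache:
    """Stands in for a metadata cache keyed by basename."""

    def __init__(self, child, cache_readdirp=True):
        self.child = child
        self.cache_readdirp = cache_readdirp
        self.cache = {}

    def readdirp(self, names):
        entries = self.child.readdirp(names)
        if self.cache_readdirp:
            for entry in entries:
                self.cache[entry["name"]] = entry
        return entries

    def lookup(self, name):
        if name in self.cache:
            return self.cache[name]  # answered here; child never sees it
        entry = self.child.lookup(name)
        self.cache[name] = entry
        return entry

    def rename(self, old, new):
        self.cache.pop(old, None)
        self.child.rename(old, new)


def run(cache_readdirp):
    """Replay readdirp -> named lookup -> rename; report what happened."""
    dht = ToyDht()
    mdc = ToyMdCache(dht, cache_readdirp)
    mdc.readdirp(["a.txt"])
    mdc.lookup("a.txt")
    try:
        mdc.rename("a.txt", "b.txt")
        renamed = True
    except RuntimeError:
        renamed = False
    return renamed, dht.lookups_received
```

In this model, run(True) fails the rename with zero lookups reaching the
lower layer, while run(False) succeeds at the cost of one extra lookup
per entry, which is the trade-off described above.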

With that change, the only remaining advantage of readdirp is that it
saves one context switch between kernel and userspace. Is it worth
sacrificing this advantage for the sake of consistency?

What do you think about removing the readdirp functionality?

Please share your input, suggestions, and ideas.

[1] : http://review.gluster.org/#/c/11892/

[2] : http://review.gluster.org/#/c/11894/

Thanks in advance,
Rafi KC