[Gluster-devel] rename(2) race condition

Emmanuel Dreyfus manu at netbsd.org
Mon May 21 16:27:21 UTC 2012


Emmanuel Dreyfus <manu at netbsd.org> wrote:

>   3548      1 tar      CALL  rename(0xbb9010e0,0x8071584)
>   3548      1 tar      NAMI  "usr/src/gnu/CVS/Tag.03548f"
>   3548      1 tar      RET   rename -1 errno 13 Permission denied

I tracked this down to the FUSE LOOKUP operation, which does not set
fuse_entry's attr.uid correctly (it is left set to 0).

Here is the summary of my findings so far:
- as an unprivileged user, I create and delete files like crazy
- most of the time everything is fine
- sometimes a LOOKUP for a file I created (as an unprivileged user) will
return a fuse_entry with uid set to 0, which causes the kernel to raise
EACCES when I try to delete the file.
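
The pattern in the list above can also be checked from userland, without
reading a FUSE trace. What follows is only a minimal sketch, not the
reproducer I actually ran: the function name count_uid_mismatches, the
iteration count and the file content are made up for illustration, and the
file name foo1 is borrowed from the shell test case. It creates a file,
asks for its attributes by name, compares the reported owner with the
caller's effective uid, and deletes the file. Note that lstat() may be
answered from the kernel's cache instead of a fresh LOOKUP, so a clean run
does not prove the bug is absent.

```python
import os
import sys
import tempfile


def count_uid_mismatches(directory, iterations=1000):
    """Create, stat and delete a file repeatedly.

    Returns (mismatches, denied): how often the reported owner differed
    from our effective uid, and how often unlink() was refused.
    """
    expected_uid = os.geteuid()
    path = os.path.join(directory, "foo1")  # name taken from the test case
    mismatches = 0
    denied = 0
    for _ in range(iterations):
        # Create the file as the current (unprivileged) user.
        with open(path, "wb") as f:
            f.write(b"test\n")
        # Ask the filesystem for the attributes by name.
        if os.lstat(path).st_uid != expected_uid:
            mismatches += 1
        try:
            os.unlink(path)
        except PermissionError:
            # This is the EACCES seen in the ktrace output.
            denied += 1
    return mismatches, denied


if __name__ == "__main__":
    # Pass the mounted directory as the first argument; a temporary
    # directory is used when none is given.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.mkdtemp()
    print(count_uid_mismatches(target))
```

Run it as an unprivileged user with the mounted test directory as the
argument. On a filesystem that behaves, both counters should stay at 0.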

Here is an example of a FUSE trace, produced by the test case
while [ 1 ] ; do cp /etc/fstab test/foo1 ; rm test/foo1 ; done

> unique = 1435, nodeid = 3098542296, opcode = LOOKUP (1)
< unique = 1435, nodeid = 3098542296, opcode = LOOKUP (1), error = -2
> unique = 1436, nodeid = 3098542296, opcode = CREATE (35)
< unique = 1436, nodeid = 3098542296, opcode = CREATE (35), error = 0
> unique = 1437, nodeid = 3098542396, opcode = SETATTR (4)
< unique = 1437, nodeid = 3098542396, opcode = SETATTR (4), error = 0
> unique = 1438, nodeid = 3098542396, opcode = WRITE (16)
< unique = 1438, nodeid = 3098542396, opcode = WRITE (16), error = 0
> unique = 1439, nodeid = 3098542396, opcode = FSYNC (20)
< unique = 1439, nodeid = 3098542396, opcode = FSYNC (20), error = 0
> unique = 1440, nodeid = 3098542396, opcode = RELEASE (18)
< unique = 1440, nodeid = 3098542396, opcode = RELEASE (18), error = 0
> unique = 1441, nodeid = 3098542396, opcode = GETATTR (3)
< unique = 1441, nodeid = 3098542396, opcode = GETATTR (3), error = 0
> unique = 1442, nodeid = 3098542296, opcode = LOOKUP (1)
< unique = 1442, nodeid = 3098542296, opcode = LOOKUP (1), error = 0

   --> here I sometimes get fuse_entry's attr.uid incorrectly set to 0
   --> When this happens, the delete that follows fails with EACCES.

> unique = 1443, nodeid = 3098542296, opcode = UNLINK (10)
< unique = 1443, nodeid = 3098542296, opcode = UNLINK (10), error = 0
> unique = 1444, nodeid = 3098542396, opcode = FORGET (2)


Is it possible that metadata writes are now so asynchronous that a
subsequent lookup cannot retrieve the up-to-date value? If that is the
problem, how can I fix it? There is nothing telling the FUSE
implementation that a CREATE or SETATTR has just partially completed and
has metadata pending.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu at netbsd.org



