[Gluster-devel] write-behind bug with ftruncate

Emmanuel Dreyfus manu at netbsd.org
Sun Jul 17 08:59:09 UTC 2011


Pavan T C <tcp at gluster.com> wrote:

> If your version of NetBSD has dtrace ported and enabled, you can check 
> if the reordering of the calls is happening within fuse at runtime 
> without modifying fuse.

NetBSD FUSE is in userland, and I am actively developing it, therefore it is
not a problem for me to modify it.

However, the reordering does not occur in FUSE, and it seems I was wrong
about write-behind: removing it just made the bug disappear by chance.

As I now understand it, the problem is that fuse_setattr_cbk() will request
an ftruncate() after the SETATTR. Here is what I get in the logs:

fuse_write()    size = 4096, offset = 39981056
fuse_setattr()  fsi->valid = 0x78 => truncate_needed,  size = 39987632
fuse_write()    size = 20480, offset = 39985152
(...)
client3_1_writev()      size = 4096, offset = 39981056
fuse_setattr_cbk()      call fuse_do_truncate, offset = 39987632
client3_1_writev()      size = 2480, offset = 39985152
(...)
client3_1_ftruncate()   offset = 39987632

Why does it decide to set truncate_needed? fsi->valid = 0x78 means these
are set: FATTR_SIZE | FATTR_ATIME | FATTR_MTIME | FATTR_FH

Here is the offending code:

#define FATTR_MASK   (FATTR_SIZE                        \
                      | FATTR_UID | FATTR_GID           \
                      | FATTR_ATIME | FATTR_MTIME       \
                      | FATTR_MODE)
(...)
        if ((fsi->valid & (FATTR_MASK)) != FATTR_SIZE) { 
                if (fsi->valid & FATTR_SIZE) { 
                        state->size            = fsi->size;
                        state->truncate_needed = _gf_true;
                }

The sin is therefore to set FATTR_ATIME | FATTR_MTIME along with FATTR_SIZE,
while glusterfs assumes that an ftruncate() call sets only FATTR_SIZE. Am I
correct?
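
To make the bit arithmetic explicit, here is a minimal sketch of the test
quoted above. The FATTR_* values are the ones from the FUSE protocol header;
truncate_deferred() is a hypothetical helper written for this illustration,
not a glusterfs function, and it models only the two conditions shown in the
excerpt.

```c
#include <stdint.h>

/* Bits of the `valid` field in a FUSE setattr request. */
#define FATTR_MODE   (1u << 0)
#define FATTR_UID    (1u << 1)
#define FATTR_GID    (1u << 2)
#define FATTR_SIZE   (1u << 3)
#define FATTR_ATIME  (1u << 4)
#define FATTR_MTIME  (1u << 5)
#define FATTR_FH     (1u << 6)

/* Same mask as in the excerpt: note that FATTR_FH is not part of it. */
#define FATTR_MASK   (FATTR_SIZE                        \
                      | FATTR_UID | FATTR_GID           \
                      | FATTR_ATIME | FATTR_MTIME       \
                      | FATTR_MODE)

/*
 * Hypothetical helper: returns 1 when the quoted condition would set
 * truncate_needed, i.e. the request carries FATTR_SIZE together with
 * at least one other attribute covered by FATTR_MASK.
 */
static inline int truncate_deferred(uint32_t valid)
{
        if ((valid & FATTR_MASK) != FATTR_SIZE)
                return (valid & FATTR_SIZE) != 0;
        return 0;
}
```

With valid = 0x78, valid & FATTR_MASK is 0x38, which differs from FATTR_SIZE
(0x08), so truncate_deferred(0x78) returns 1. With FATTR_FH | FATTR_SIZE
only (0x48), the masked value is exactly FATTR_SIZE and it returns 0.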


> Let me know if this line of debugging helps. I need to understand the 
> details of the conversion of ftruncate() to FUSE SETATTR. A pointer to 
> the corresponding NetBSD code will help.

That happens in the kernel.
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/kern/vfs_syscalls.c?rev=1.431

in sys_ftruncate() 
        if (vp->v_type == VDIR)
                error = EISDIR;
        else if ((error = vn_writechk(vp)) == 0) {
                vattr_null(&vattr);
                vattr.va_size = SCARG(uap, length);
                error = VOP_SETATTR(vp, &vattr, fp->f_cred);
        }

VOP_SETATTR() is the vnode method. It will eventually turn into
FUSE_SETATTR. glusterfs will convert it back to an ftruncate in
fuse_setattr() and fuse_setattr_cbk() from fuse-bridge.c.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu at netbsd.org



