[Gluster-devel] NetBSD regression tests hanging after ./tests/basic/mgmt_v3-locks.t

Emmanuel Dreyfus manu at netbsd.org
Fri Jun 19 04:56:04 UTC 2015


Emmanuel Dreyfus <manu at netbsd.org> wrote:

> This means the dd process getting stuck in tstile because glusterfsd
> died is probably a NetBSD kernel bug. I have to investigate. 

I think I found the culprit, but fixing this will need some discussion
on the NetBSD lists:

dd waits on a vnode lock owned by the ioflush kernel thread, which is
responsible for periodic fsync.

ioflush is stuck on the following backtrace:
cv_wait
genfs_do_putpages
genfs_putpages
VOP_PUTPAGES
nfs_flush
nfs_fsync
VOP_FSYNC
nfs_sync
sync_fsync

The cv_wait() call in genfs_do_putpages():
        /* Wait for output to complete. */
        if (!wasclean && !async && vp->v_numoutput != 0) {
                while (vp->v_numoutput != 0)
                        cv_wait(&vp->v_cv, slock);
        }

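cv_wait() is an uninterruptible wait with no timeout, which is obviously
wrong there: if the NFS server never completes the pending output,
v_numoutput never reaches zero and ioflush sleeps forever while holding
the vnode lock. cv_timedwait_sig() would be better, but that means
pulling NFS mount options from a lower layer, which is not an obvious
fit architecturally.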

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu at netbsd.org

