[Bugs] [Bug 1331254] New: Disperse volume fails on high load and logs show some assertion failures
bugzilla at redhat.com
Thu Apr 28 06:30:26 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1331254
Bug ID: 1331254
Summary: Disperse volume fails on high load and logs show some
assertion failures
Product: GlusterFS
Version: mainline
Component: disperse
Keywords: Triaged
Severity: high
Assignee: bugs at gluster.org
Reporter: xhernandez at datalab.es
CC: aspandey at redhat.com, bugs at gluster.org,
pkarampu at redhat.com
Depends On: 1330132
Blocks: 1330997
+++ This bug was initially created as a clone of Bug #1330132 +++
Description of problem:
A distributed iozone test over multiple NFS mounts on different machines fails,
and some assertion failures appear in the logs:
[2016-04-21 19:29:58.096645] E [ec-inode-read.c:1157:ec_readv_rebuild]
(-->/usr/lib64/glusterfs/3.7.10/xlator/cluster/disperse.so(__ec_manager+0x5b)
[0x7f9e4e8f18bb]
-->/usr/lib64/glusterfs/3.7.10/xlator/cluster/disperse.so(ec_manager_readv+0x107)
[0x7f9e4e908197]
-->/usr/lib64/glusterfs/3.7.10/xlator/cluster/disperse.so(ec_readv_rebuild+0x236)
[0x7f9e4e907f26] ) 0-: Assertion failed: ec_get_inode_size(fop, fop->fd->inode,
&cbk->iatt[0].ia_size)
[2016-04-21 19:29:58.126547] E [ec-common.c:1641:ec_lock_unfreeze]
(-->/usr/lib64/glusterfs/3.7.10/xlator/cluster/disperse.so(ec_manager_inodelk+0x155)
[0x7f9e4e8fc305]
-->/usr/lib64/glusterfs/3.7.10/xlator/cluster/disperse.so(ec_unlocked+0x35)
[0x7f9e4e8f3c25]
-->/usr/lib64/glusterfs/3.7.10/xlator/cluster/disperse.so(ec_lock_unfreeze+0x100)
[0x7f9e4e8f3ab0] ) 0-: Assertion failed: list_empty(&lock->waiting) &&
list_empty(&lock->owners)
[2016-04-21 19:30:05.998568] E [ec-inode-read.c:1612:ec_manager_stat]
(-->/usr/lib64/glusterfs/3.7.10/xlator/cluster/disperse.so(ec_resume+0x88)
[0x7f9e4e8f1a68]
-->/usr/lib64/glusterfs/3.7.10/xlator/cluster/disperse.so(__ec_manager+0x5b)
[0x7f9e4e8f18bb]
-->/usr/lib64/glusterfs/3.7.10/xlator/cluster/disperse.so(ec_manager_stat+0x315)
[0x7f9e4e905ed5] ) 0-: Assertion failed: ec_get_inode_size(fop,
fop->locks[0].lock->loc.inode, &cbk->iatt[0].ia_size)
[2016-04-21 19:30:05.999146] E [MSGID: 114031]
[client-rpc-fops.c:1624:client3_3_inodelk_cbk] 0-test-client-8: remote
operation failed [Invalid argument]
[2016-04-21 19:30:05.999132] E [MSGID: 114031]
[client-rpc-fops.c:1624:client3_3_inodelk_cbk] 0-test-client-10: remote
operation failed [Invalid argument]
[2016-04-21 19:30:05.999237] E [MSGID: 114031]
[client-rpc-fops.c:1624:client3_3_inodelk_cbk] 0-test-client-11: remote
operation failed [Invalid argument]
[2016-04-21 19:30:05.999259] E [MSGID: 114031]
[client-rpc-fops.c:1624:client3_3_inodelk_cbk] 0-test-client-7: remote
operation failed [Invalid argument]
[2016-04-21 19:30:05.999326] E [MSGID: 114031]
[client-rpc-fops.c:1624:client3_3_inodelk_cbk] 0-test-client-9: remote
operation failed [Invalid argument]
[2016-04-21 19:30:06.047496] E [MSGID: 114031]
[client-rpc-fops.c:1624:client3_3_inodelk_cbk] 0-test-client-6: remote
operation failed [Invalid argument]
[2016-04-21 19:30:06.047559] W [MSGID: 122015] [ec-common.c:1675:ec_unlocked]
0-test-disperse-1: entry/inode unlocking failed (FSTAT) [Invalid argument]
Version-Release number of selected component (if applicable): mainline
How reproducible:
It happens randomly after some time running the distributed iozone test.
Steps to Reproduce:
1.
2.
3.
Actual results:
Volume access fails and iozone quits with an error.
Expected results:
iozone should complete the test successfully.
Additional info:
Probably related to a race that occurs when the lock release timeout is
cancelled while its callback is already executing. In that case the new fop is
not placed in the right waiting list.
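
To make the suspected race easier to picture, here is a minimal, single-threaded
sketch of it. It is not the GlusterFS code and not the actual fix: the Lock
class, its frozen list, the flags, and every function name below are
hypothetical, chosen only to mirror the lists named in the ec_lock_unfreeze
assertion. It assumes that a failed timer cancellation means the release
callback is already running, and that a fop arriving at that moment has to be
parked instead of being added to the owners.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Lock:
    """Hypothetical model of a lock with a delayed release timer."""
    owners: List[str] = field(default_factory=list)   # fops currently using the lock
    waiting: List[str] = field(default_factory=list)  # fops queued behind the owners
    frozen: List[str] = field(default_factory=list)   # fops parked until the release finishes
    timer_armed: bool = False   # a delayed release is scheduled
    releasing: bool = False     # the timer callback has already started


def cancel_timer(lock: Lock) -> bool:
    """Return True only if the timer was stopped before its callback ran."""
    if lock.timer_armed and not lock.releasing:
        lock.timer_armed = False
        return True
    return False


def timer_fired(lock: Lock) -> None:
    """The delayed-release callback starts: the unlock is now in flight."""
    lock.releasing = True


def new_fop(lock: Lock, fop: str, check_cancel: bool) -> None:
    """Attach a new fop to the lock, optionally honouring the cancel result."""
    cancelled = cancel_timer(lock)
    if check_cancel and not cancelled and lock.releasing:
        lock.frozen.append(fop)   # release in flight: park the fop
    else:
        lock.owners.append(fop)   # lock assumed still held: reuse it directly


def unlocked(lock: Lock) -> bool:
    """Finish the release; return whether the lists were in the expected state."""
    consistent = not lock.owners and not lock.waiting
    lock.timer_armed = lock.releasing = False
    # Parked fops would now re-acquire the lock; here they simply become waiters.
    lock.waiting.extend(lock.frozen)
    lock.frozen.clear()
    return consistent


def run(check_cancel: bool) -> bool:
    """Replay the interleaving: the timer fires, then a new fop arrives."""
    lock = Lock(timer_armed=True)          # last owner finished, release is delayed
    timer_fired(lock)                      # callback begins executing...
    new_fop(lock, "readv", check_cancel)   # ...while a new fop arrives
    return unlocked(lock)
```

In this model, run(False) ignores the cancel result and leaves the new fop in
owners when the release completes, which corresponds to the
"list_empty(&lock->waiting) && list_empty(&lock->owners)" assertion failing.
run(True) parks the fop first, so both lists are empty at that point.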
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1330132
[Bug 1330132] Disperse volume fails on high load and logs show some
assertion failures
https://bugzilla.redhat.com/show_bug.cgi?id=1330997
[Bug 1330997] [NFS-Ganesha]: IO hang seen on ganesha mount with file ops