[Bugs] [Bug 1331254] New: Disperse volume fails on high load and logs show some assertion failures

bugzilla at redhat.com
Thu Apr 28 06:30:26 UTC 2016


            Bug ID: 1331254
           Summary: Disperse volume fails on high load and logs show some
                    assertion failures
           Product: GlusterFS
           Version: mainline
         Component: disperse
          Keywords: Triaged
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: xhernandez at datalab.es
                CC: aspandey at redhat.com, bugs at gluster.org,
                    pkarampu at redhat.com
        Depends On: 1330132
            Blocks: 1330997

+++ This bug was initially created as a clone of Bug #1330132 +++

Description of problem:

A distributed iozone test running over multiple NFS mounts on different machines
fails, and the following assertion failures appear in the logs:

[2016-04-21 19:29:58.096645] E [ec-inode-read.c:1157:ec_readv_rebuild]
[0x7f9e4e907f26] ) 0-: Assertion failed: ec_get_inode_size(fop, fop->fd->inode,
[2016-04-21 19:29:58.126547] E [ec-common.c:1641:ec_lock_unfreeze]
[0x7f9e4e8f3ab0] ) 0-: Assertion failed: list_empty(&lock->waiting) &&
[2016-04-21 19:30:05.998568] E [ec-inode-read.c:1612:ec_manager_stat]
[0x7f9e4e905ed5] ) 0-: Assertion failed: ec_get_inode_size(fop,
fop->locks[0].lock->loc.inode, &cbk->iatt[0].ia_size)
[2016-04-21 19:30:05.999146] E [MSGID: 114031]
[client-rpc-fops.c:1624:client3_3_inodelk_cbk] 0-test-client-8: remote
operation failed [Invalid argument]
[2016-04-21 19:30:05.999132] E [MSGID: 114031]
[client-rpc-fops.c:1624:client3_3_inodelk_cbk] 0-test-client-10: remote
operation failed [Invalid argument]
[2016-04-21 19:30:05.999237] E [MSGID: 114031]
[client-rpc-fops.c:1624:client3_3_inodelk_cbk] 0-test-client-11: remote
operation failed [Invalid argument]
[2016-04-21 19:30:05.999259] E [MSGID: 114031]
[client-rpc-fops.c:1624:client3_3_inodelk_cbk] 0-test-client-7: remote
operation failed [Invalid argument]
[2016-04-21 19:30:05.999326] E [MSGID: 114031]
[client-rpc-fops.c:1624:client3_3_inodelk_cbk] 0-test-client-9: remote
operation failed [Invalid argument]
[2016-04-21 19:30:06.047496] E [MSGID: 114031]
[client-rpc-fops.c:1624:client3_3_inodelk_cbk] 0-test-client-6: remote
operation failed [Invalid argument]
[2016-04-21 19:30:06.047559] W [MSGID: 122015] [ec-common.c:1675:ec_unlocked]
0-test-disperse-1: entry/inode unlocking failed (FSTAT) [Invalid argument]

Version-Release number of selected component (if applicable): mainline

How reproducible:

It happens randomly after some time running the distributed iozone test.

Steps to Reproduce:

Actual results:

Volume access fails and iozone quits with an error.

Expected results:

iozone should complete the test successfully.

Additional info:

This is probably caused by a race that occurs when the lock release timeout is
cancelled while its callback is already executing. In that case the new fop is
not placed in the correct waiting list.
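
The suspected race can be illustrated with a small, single-threaded model. This
is only a sketch under assumptions, not the actual ec-common.c code: the Lock
class, the timer states, the "frozen" list and every function name below are
hypothetical. Only the idea of a "waiting" list that must be empty when the
release completes is taken from the ec_lock_unfreeze assertion in the log
above.

```python
from dataclasses import dataclass, field


@dataclass
class Lock:
    # Hypothetical model; it does not mirror the real ec lock structure.
    timer: str = "idle"  # "idle", "armed" or "running" (callback executing)
    waiting: list = field(default_factory=list)  # fops that expect the lock to be held
    frozen: list = field(default_factory=list)   # fops deferred until after the release


def cancel_timer(lock):
    """Try to cancel the delayed release; False means the callback already runs."""
    if lock.timer == "armed":
        lock.timer = "idle"
    return lock.timer != "running"


def enqueue_unchecked(lock, fop):
    # Suspected bug pattern: the result of the cancellation is ignored.
    cancel_timer(lock)
    lock.waiting.append(fop)


def enqueue_checked(lock, fop):
    # Assumed remedy: defer the fop if the release could not be stopped.
    if cancel_timer(lock):
        lock.waiting.append(fop)
    else:
        lock.frozen.append(fop)


def finish_release(lock):
    """Complete the release and report whether the waiting list was empty."""
    invariant_held = not lock.waiting
    lock.timer = "idle"
    return invariant_held


def simulate(enqueue):
    lock = Lock(timer="armed")
    lock.timer = "running"   # the timeout callback starts executing first
    enqueue(lock, "FSTAT")   # a new fop arrives while the callback runs
    return finish_release(lock), lock
```

In this model, simulate(enqueue_unchecked) reports that the invariant was
violated, which corresponds to the kind of assertion failure seen in the log,
while simulate(enqueue_checked) leaves the fop in the deferred list so it can
be restarted once the lock has been released. Whether the real fix takes this
form is not stated in this report.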

Referenced Bugs:

[Bug 1330132] Disperse volume fails on high load and logs show some
assertion failures
[Bug 1330997] [NFS-Ganesha]: IO hang seen on ganesha mount with file ops