[Bugs] [Bug 1341452] New: gluster coredump when multiple CLI command called at unstable network

bugzilla at redhat.com bugzilla at redhat.com
Wed Jun 1 05:52:47 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1341452

            Bug ID: 1341452
           Summary: gluster coredump when multiple CLI command called at
                    unstable network
           Product: GlusterFS
           Version: 3.6.9
         Component: glusterd
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: george.lian at nokia.com
                CC: bugs at gluster.org



Description of problem:

glusterd fails with a coredump.
Version-Release number of selected component (if applicable):


How reproducible:

Intermittent. Run "gluster volume heal vol_name info" while the network is not stable.
Steps to Reproduce:
1. Repeatedly run the CLI command "gluster volume heal vol_name info".
2. Make the network between the replicated VMs unstable.
3. glusterd sometimes fails with a coredump.

Actual results:
glusterd exits with a failure and produces a coredump.

Expected results:


The backtrace of the coredump is listed below:

#0  0x00007f5c9dd38177 in __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:54
54      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f5c9b16d700 (LWP 5032))]
(gdb) bt
#0  0x00007f5c9dd38177 in __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007f5c9dd395fa in __GI_abort () at abort.c:89
#2  0x00007f5c9dd3115d in __assert_fail_base (fmt=0x7f5c9de68768 "%s%s%s:%u:
%s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7f5c9a7397c8
"iov",
    file=file@entry=0x7f5c9a73970c "glusterd-syncop.c", line=line@entry=417,
function=function@entry=0x7f5c9a73a080 "gd_syncop_mgmt_v3_unlock_cbk_fn")
    at assert.c:92
#3  0x00007f5c9dd31212 in __GI___assert_fail (assertion=0x7f5c9a7397c8 "iov",
file=0x7f5c9a73970c "glusterd-syncop.c", line=417,
    function=0x7f5c9a73a080 "gd_syncop_mgmt_v3_unlock_cbk_fn") at assert.c:101
#4  0x00007f5c9a6e3a22 in gd_syncop_mgmt_v3_unlock_cbk_fn () from
/usr/lib64/glusterfs/3.6.9/xlator/mgmt/glusterd.so
#5  0x00007f5c9a68c3a8 in glusterd_big_locked_cbk () from
/usr/lib64/glusterfs/3.6.9/xlator/mgmt/glusterd.so
#6  0x00007f5c9a6e3b8d in gd_syncop_mgmt_v3_unlock_cbk () from
/usr/lib64/glusterfs/3.6.9/xlator/mgmt/glusterd.so
#7  0x00007f5c9eb498ed in rpc_clnt_submit () from /usr/lib64/libgfrpc.so.0
#8  0x00007f5c9a6e331a in gd_syncop_submit_request () from
/usr/lib64/glusterfs/3.6.9/xlator/mgmt/glusterd.so
#9  0x00007f5c9a6e3cfc in gd_syncop_mgmt_v3_unlock () from
/usr/lib64/glusterfs/3.6.9/xlator/mgmt/glusterd.so
#10 0x00007f5c9a6e618f in gd_unlock_op_phase () from
/usr/lib64/glusterfs/3.6.9/xlator/mgmt/glusterd.so
#11 0x00007f5c9a6e6dd6 in gd_sync_task_begin () from
/usr/lib64/glusterfs/3.6.9/xlator/mgmt/glusterd.so
#12 0x00007f5c9a6e6f47 in glusterd_op_begin_synctask () from
/usr/lib64/glusterfs/3.6.9/xlator/mgmt/glusterd.so
#13 0x00007f5c9a646ede in __glusterd_handle_set_volume () from
/usr/lib64/glusterfs/3.6.9/xlator/mgmt/glusterd.so
#14 0x00007f5c9a6424ec in glusterd_big_locked_handler () from
/usr/lib64/glusterfs/3.6.9/xlator/mgmt/glusterd.so
#15 0x00007f5c9a646fc9 in glusterd_handle_set_volume () from
/usr/lib64/glusterfs/3.6.9/xlator/mgmt/glusterd.so
#16 0x00007f5c9edac062 in synctask_wrap () from /usr/lib64/libglusterfs.so.0
#17 0x00007f5c9dd48ee0 in ?? () from /lib64/libc.so.6
#18 0x0000000000000000 in ?? ()



Additional info:
After investigating the error log and the backtrace of the coredump, the root
cause seems clear:

1) The log contains this warning:
[2016-05-31 04:30:01.400773] W [rpc-clnt.c:1562:rpc_clnt_submit] 0-management:
failed to submit rpc-request (XID: 0x9 Program: glusterd mgmt v3, ProgVers: 3, 
      Proc: 6) to rpc-transport (management)

2) When rpc_clnt_submit fails for any reason, it executes the following line
(file rpc-clnt.c):
01597                         cbkfn (rpcreq, NULL, 0, frame);

3) Here cbkfn leads to gd_syncop_mgmt_v3_unlock_cbk_fn (frames #4 to #6 of the
backtrace), which contains this line in source file "glusterd-syncop.c":
0417         GF_ASSERT(iov);

4) iov is the second parameter, and cbkfn is called with NULL for it, so the
assertion fails and glusterd dumps core.
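
The failure path in steps 2) to 4) can be reproduced in isolation with a small
stand-alone sketch. This is not GlusterFS code: struct request, struct
iovec_stub, submit(), unlock_cbk_asserting() and unlock_cbk_checked() are
hypothetical names made up for illustration, and the status check shown is only
one possible way to handle the NULL iov, not necessarily how glusterd should be
fixed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the RPC request and reply buffer types. */
struct request { int rpc_status; };
struct iovec_stub { const void *base; size_t len; };

typedef int (*callback_fn)(struct request *req, struct iovec_stub *iov,
                           int count, void *frame);

/* Mimics the reported behaviour: if submission fails, the callback is
 * still invoked, but with a NULL iov and a count of 0. */
int submit(int transport_ok, callback_fn cbkfn, void *frame)
{
    struct request req = { transport_ok ? 0 : -1 };
    struct iovec_stub iov = { "reply", 5 };

    if (!transport_ok) {
        cbkfn(&req, NULL, 0, frame);
        return -1;
    }
    cbkfn(&req, &iov, 1, frame);
    return 0;
}

/* Same pattern as the asserting callback: aborts when iov is NULL. */
int unlock_cbk_asserting(struct request *req, struct iovec_stub *iov,
                         int count, void *frame)
{
    (void)req; (void)count;
    assert(iov);               /* fires on the failed-submit path */
    *(int *)frame = 0;
    return 0;
}

/* Assumed alternative: report the failure instead of asserting. */
int unlock_cbk_checked(struct request *req, struct iovec_stub *iov,
                       int count, void *frame)
{
    (void)count;
    if (req->rpc_status == -1 || iov == NULL) {
        *(int *)frame = -1;    /* propagate the error to the caller */
        return 0;
    }
    *(int *)frame = 0;
    return 0;
}
```

Calling submit(0, unlock_cbk_asserting, &op_ret) aborts the process in the same
way as the backtrace above, while submit(0, unlock_cbk_checked, &op_ret) returns
-1 and leaves op_ret set to -1.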
