[Bugs] [Bug 1665656] testcase glusterd/add-brick-and-validate-replicated-volume-options.t crashes while brick_mux is enabled

bugzilla at redhat.com
Sat Jan 12 05:46:10 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1665656



--- Comment #1 from Mohit Agrawal <moagrawa at redhat.com> ---
Hi,

The test case produces the crash below immediately after calling kill_brick.

#0  0x0000560df20e821b in STACK_DESTROY (stack=0x3) at
../../libglusterfs/src/glusterfs/stack.h:182
182         LOCK(&stack->pool->lock);
Missing separate debuginfos, use: debuginfo-install
bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.168-8.el7.x86_64
elfutils-libs-0.168-8.el7.x86_64 glibc-2.17-196.el7_4.2.x86_64
keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-8.el7.x86_64
libacl-2.2.51-12.el7.x86_64 libaio-0.3.109-13.el7.x86_64
libattr-2.4.46-12.el7.x86_64 libcap-2.22-9.el7.x86_64
libcom_err-1.42.9-10.el7.x86_64 libgcc-4.8.5-16.el7_4.2.x86_64
libselinux-2.5-11.el7.x86_64 libuuid-2.23.2-43.el7_4.2.x86_64
openssl-libs-1.0.2k-8.el7.x86_64 pcre-8.32-17.el7.x86_64
systemd-libs-219-42.el7_4.10.x86_64 xz-libs-5.2.2-1.el7.x86_64
zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0  0x0000560df20e821b in STACK_DESTROY (stack=0x3) at
../../libglusterfs/src/glusterfs/stack.h:182
#1  mgmt_pmap_signin_cbk (req=<optimized out>, iov=<optimized out>,
count=<optimized out>, myframe=0x7fda6802ddb8)
    at glusterfsd-mgmt.c:2824
#2  0x00007fda86559161 in rpc_clnt_handle_reply
(clnt=clnt@entry=0x560df2d53b30, pollin=pollin@entry=0x560df2ea98f0)
    at rpc-clnt.c:755
#3  0x00007fda865594c7 in rpc_clnt_notify (trans=0x560df2d53e50,
mydata=0x560df2d53b60, event=<optimized out>, 
    data=0x560df2ea98f0) at rpc-clnt.c:922
#4  0x00007fda86555b33 in rpc_transport_notify (this=this@entry=0x560df2d53e50,
event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, 
    data=data@entry=0x560df2ea98f0) at rpc-transport.c:541
#5  0x00007fda7ab7f95d in socket_event_poll_in (notify_handled=true,
this=0x560df2d53e50) at socket.c:2516
#6  socket_event_handler (fd=<optimized out>, idx=<optimized out>,
gen=<optimized out>, data=0x560df2d53e50, 
    poll_in=<optimized out>, poll_out=<optimized out>, poll_err=0,
event_thread_died=0 '\000') at socket.c:2918
#7  0x00007fda86814e15 in event_dispatch_epoll_handler (event=0x7fda34ff8e70,
event_pool=0x560df2d03560) at event-epoll.c:642
#8  event_dispatch_epoll_worker (data=0x7fda40054740) at event-epoll.c:756
#9  0x00007fda855eee25 in start_thread () from /usr/lib64/libpthread.so.0
#10 0x00007fda84ebb34d in clone () from /usr/lib64/libc.so.6
(gdb) f 1
#1  mgmt_pmap_signin_cbk (req=<optimized out>, iov=<optimized out>,
count=<optimized out>, myframe=0x7fda6802ddb8)
    at glusterfsd-mgmt.c:2824
2824        STACK_DESTROY(frame->root);
(gdb) p frame
$1 = (call_frame_t *) 0x7fda6802ddb8
(gdb) p *frame
$2 = {root = 0x4, parent = 0x400000001, frames = {next = 0xffffffffffffffff,
prev = 0x7fda6802de18}, local = 0x7fda68059478, 
  this = 0x0, ret = 0x0, ref_count = 0, lock = {spinlock = 0, mutex = {__data =
{__lock = 0, __count = 0, __owner = 0, 
        __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev
= 0x0, __next = 0x7fda68059478}}, 
      __size = '\000' <repeats 32 times>, "x\224\005h\332\177\000", __align =
0}}, cookie = 0x0, complete = 232, op = 32730, 
  begin = {tv_sec = 0, tv_nsec = 140576024633944}, end = {tv_sec =
140576024654704, tv_nsec = 1125216510}, 
  wind_from = 0x1 <Address 0x1 out of bounds>, wind_to = 0x0, unwind_from =
0x0, unwind_to = 0x0}
(gdb) p frame->root
$3 = (call_stack_t *) 0x4
(gdb) 

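After checking the code, I found that the current glusterfs_mgmt_pmap_signin
code does not send signin requests correctly: it uses the same frame to send
multiple requests. As frame 1 above shows, mgmt_pmap_signin_cbk calls
STACK_DESTROY(frame->root) for each reply, so once the first reply has
destroyed the frame, a later reply appears to run on freed memory. That would
explain the garbage values in *frame and frame->root = 0x4.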

>>>>>>>
.......
.......

    if (ctx->active) {
        top = ctx->active->first;
        for (trav_p = &top->children; *trav_p; trav_p = &(*trav_p)->next) {
            req.brick = (*trav_p)->xlator->name;
            ret = mgmt_submit_request(&req, frame, ctx, &clnt_pmap_prog,
                                      GF_PMAP_SIGNIN, mgmt_pmap_signin_cbk,
                                      (xdrproc_t)xdr_pmap_signin_req);
            if (ret < 0) {
                gf_log(THIS->name, GF_LOG_WARNING,
                       "failed to send sign in request; brick = %s",
                       req.brick);
            }
            count++;
        }
    } else {
        ret = mgmt_submit_request(&req, frame, ctx, &clnt_pmap_prog,
                                  GF_PMAP_SIGNIN, mgmt_pmap_signin_cbk,
                                  (xdrproc_t)xdr_pmap_signin_req);
    }

>>>>>>>>>>>>>
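
One way to avoid the shared frame is to give every signin request a frame of
its own, so that each callback destroys only what it owns. The sketch below
illustrates that ownership rule in isolation. It is not GlusterFS code: the
names frame_t, submit_request, signin_cbk, deliver_replies and signin_all are
hypothetical stand-ins, and replies are simulated with a small in-memory
queue instead of RPC.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a call frame; not the GlusterFS call_frame_t. */
typedef struct frame {
    const char *brick; /* brick this request was sent for */
} frame_t;

#define MAX_PENDING 16

static frame_t *pending[MAX_PENDING]; /* requests waiting for a reply */
static size_t n_pending = 0;
static int live_frames = 0;           /* frames allocated and not yet freed */

/* Queue a request; stands in for handing the frame to the RPC layer. */
static int submit_request(frame_t *frame)
{
    if (n_pending == MAX_PENDING)
        return -1;
    pending[n_pending++] = frame;
    return 0;
}

/* The callback owns its frame and releases it exactly once. */
static void signin_cbk(frame_t *frame)
{
    free(frame);
    live_frames--;
}

/* Simulate the replies arriving: one callback per queued request. */
static void deliver_replies(void)
{
    for (size_t i = 0; i < n_pending; i++)
        signin_cbk(pending[i]);
    n_pending = 0;
}

/* Send one signin per brick, allocating a fresh frame for each request. */
static int signin_all(const char *bricks[], size_t n)
{
    int sent = 0;

    for (size_t i = 0; i < n; i++) {
        frame_t *frame = calloc(1, sizeof(*frame));
        if (!frame)
            continue;
        live_frames++;
        frame->brick = bricks[i];
        if (submit_request(frame) < 0) {
            /* Not queued, so no callback will run: free it here. */
            free(frame);
            live_frames--;
            continue;
        }
        sent++;
    }
    return sent;
}
```

With three bricks, signin_all allocates three frames, and deliver_replies
frees each of them exactly once. Had a single frame been queued three times,
the second callback would have freed memory that was already released, which
is the pattern the backtrace above points to.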

Thanks,
Mohit Agrawal
