[Gluster-devel] glusterfs coredump--mempool

Tue May 21 07:12:28 UTC 2019

Hi glusterfs expert,
I hit another glusterfs process coredump in my environment, shortly after the glusterfs process started. The frame's local has become NULL, but the frame does not seem to have been destroyed yet, since the magic number (GF_MEM_HEADER_MAGIC) in its mempool header is still intact.
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterfs --acl --volfile-server=mn-0.local --volfile-server=mn-1.loc'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f867fcd2971 in client3_3_inodelk_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=0x7f8654008830)
    at client-rpc-fops.c:1510
1510          CLIENT_STACK_UNWIND (inodelk, frame, rsp.op_ret,
[Current thread is 1 (Thread 0x7f867d6d4700 (LWP 3046))]
Missing separate debuginfos, use: dnf debuginfo-install glusterfs-fuse-3.12.15-1.wos2.wf29.x86_64
(gdb) bt
#0  0x00007f867fcd2971 in client3_3_inodelk_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=0x7f8654008830)
    at client-rpc-fops.c:1510
#1  0x00007f8685ea5584 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7f8678070030, pollin=pollin@entry=0x7f86702833e0) at rpc-clnt.c:782
#2  0x00007f8685ea587b in rpc_clnt_notify (trans=<optimized out>, mydata=0x7f8678070060, event=<optimized out>, data=0x7f86702833e0) at rpc-clnt.c:975
#3  0x00007f8685ea1b83 in rpc_transport_notify (this=this@entry=0x7f8678070270, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED,
    data=data@entry=0x7f86702833e0) at rpc-transport.c:538
#4  0x00007f8680b99867 in socket_event_poll_in (notify_handled=_gf_true, this=0x7f8678070270) at socket.c:2260
#5  socket_event_handler (fd=<optimized out>, idx=3, gen=1, data=0x7f8678070270, poll_in=<optimized out>, poll_out=<optimized out>,
    poll_err=<optimized out>) at socket.c:2645
#6  0x00007f8686132911 in event_dispatch_epoll_handler (event=0x7f867d6d3e6c, event_pool=0x55e1b2792b00) at event-epoll.c:583
#7  event_dispatch_epoll_worker (data=0x7f867805ece0) at event-epoll.c:659
#8  0x00007f8684ea65da in start_thread () from /lib64/libpthread.so.0
#9  0x00007f868474eeaf in clone () from /lib64/libc.so.6
(gdb) print *(call_frame_t*)myframe
$3 = {root = 0x7f86540271a0, parent = 0x0, frames = {next = 0x7f8654027898, prev = 0x7f8654027898}, local = 0x0, this = 0x7f8678013080, ret = 0x0,
  ref_count = 0, lock = {spinlock = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0,
        __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}}, cookie = 0x0, complete = _gf_false, xid = 0,
  op = GF_FOP_NULL, begin = {tv_sec = 0, tv_usec = 0}, end = {tv_sec = 0, tv_usec = 0}, wind_from = 0x0, wind_to = 0x0, unwind_from = 0x0, unwind_to = 0x0}
(gdb) x/4xw  0x7f8654008810
0x7f8654008810:   0xcafebabe 0x00000000 0x00000000 0x00000000
(gdb) p *(pooled_obj_hdr_t *)0x7f8654008810
$2 = {magic = 3405691582, next = 0x0, pool_list = 0x7f8654000b80, power_of_two = 8}

I added a "uint32_t xid" field to the _call_frame data structure and set it from rpcreq->xid in the __save_frame function. Normally this xid should be 0 only immediately after create_frame allocates the frame from the memory pool. But in this case the xid is 0, so it seems the frame was handed out for use again before it was freed. Do you have any idea how this could happen?
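
For readers following the gdb session, the check done there by hand can be sketched in C. This is only a simplified model, not the glusterfs source: sketch_hdr_t copies the field names gdb printed for pooled_obj_hdr_t, while sketch_frame_t, sketch_get, hdr_of, magic_ok and looks_freshly_allocated are hypothetical names made up for illustration. It assumes the header sits directly in front of the object, which matches the addresses above on x86_64 (0x7f8654008830 - 0x7f8654008810 = 0x20).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Magic value seen in the dump: 0xcafebabe == 3405691582. */
#define SKETCH_HDR_MAGIC 0xCAFEBABEUL

/* Simplified stand-in for pooled_obj_hdr_t (field names from the gdb print). */
typedef struct sketch_hdr {
    unsigned long      magic;
    struct sketch_hdr *next;
    void              *pool_list;
    unsigned int       power_of_two;
} sketch_hdr_t;

/* Hypothetical frame holding only the two fields discussed in this mail. */
typedef struct sketch_frame {
    void    *local;
    uint32_t xid;
} sketch_frame_t;

/* The header is assumed to sit immediately before the pooled object. */
static sketch_hdr_t *hdr_of(void *obj)
{
    assert(obj != NULL);
    return (sketch_hdr_t *)obj - 1;
}

/* Allocate a zeroed header + frame pair and return the frame part. */
static void *sketch_get(void)
{
    sketch_hdr_t *hdr = calloc(1, sizeof(*hdr) + sizeof(sketch_frame_t));
    if (hdr == NULL)
        return NULL;
    hdr->magic = SKETCH_HDR_MAGIC;
    hdr->power_of_two = 8;
    return hdr + 1;
}

/* Same test as "x/4xw <frame - 0x20>" in gdb: is the magic still intact? */
static bool magic_ok(void *obj)
{
    return hdr_of(obj)->magic == SKETCH_HDR_MAGIC;
}

/* A frame with intact magic, xid == 0 and local == NULL looks like one
 * that was just handed out by the pool and not yet saved for an RPC. */
static bool looks_freshly_allocated(void *obj)
{
    sketch_frame_t *frame = obj;
    return magic_ok(obj) && frame->xid == 0 && frame->local == NULL;
}
```

In this model the crashing frame would pass looks_freshly_allocated even though it is being used in a reply callback, which is the inconsistency described above.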

cynthia