[Gluster-devel] mem_put without a xlator_t

Mon Apr 27 08:23:52 UTC 2015

Hi

I observe a crash when the rebalance process cannot start in a tiered
setup. It happens because the cleanup process calls mem_put() when the
xlator_t is no longer valid. Here is the backtrace:

#0  0xbb69e23d in pthread_spin_lock () from /usr/lib/libpthread.so.1
#1  0xbb78a066 in __gf_free (free_ptr=0xb9a510f0) at mem-pool.c:303
#2  0xbb78a83c in mem_put (ptr=0xb9a51100) at mem-pool.c:570
#3  0xbb74fe97 in log_buf_destroy (buf=0xb9a51100) at logging.c:357
#4  0xbb752eb5 in gf_log_flush_list (copy=0xbf7fe054, ctx=0xbb109000) at logging.c:1703
#5  0xbb7530c6 in gf_log_flush_extra_msgs (ctx=0xbb109000, new=0) at logging.c:1769
#6  0xbb74fc73 in gf_log_set_log_buf_size (buf_size=0) at logging.c:270
#7  0xbb75010c in gf_log_disable_suppression_before_exit (ctx=0xbb109000) at logging.c:437
#8  0x0804e526 in cleanup_and_exit (signum=0) at glusterfsd.c:1210
#9  0x08050797 in glusterfs_process_volfp (ctx=0xbb109000, fp=0xbb4a8f70) at glusterfsd.c:2176
#10 0x08054bb6 in mgmt_getspec_cbk (req=0xbb11d440, iov=0xbb11d460, count=1, myframe=0xbb105e88) at glusterfsd-mgmt.c:1552
#11 0xbb7251dc in rpc_clnt_handle_reply () from /autobuild/install/lib/libgfrpc.so.0
#12 0xbb7254d4 in rpc_clnt_notify () from /autobuild/install/lib/libgfrpc.so.0
#13 0xbb721ba3 in rpc_transport_notify () from /autobuild/install/lib/libgfrpc.so.0
#14 0xbb28dafc in socket_event_poll_in () from /autobuild/install/lib/glusterfs/3.8dev/rpc-transport/socket.so
#15 0xbb28dfb1 in socket_event_handler () from /autobuild/install/lib/glusterfs/3.8dev/rpc-transport/socket.so
#16 0xbb7b7538 in event_dispatch_poll_handler (event_pool=0xbb143030, ufds=0xb9a080b0, i=2) at event-poll.c:391
#17 0xbb7b785c in event_dispatch_poll (event_pool=0xbb143030) at event-poll.c:487
#18 0xbb789032 in event_dispatch (event_pool=0xbb143030) at event.c:127
#19 0x08050c20 in main (argc=33, argv=0xbf7fe7a0) at glusterfsd.c:2313

(gdb) frame 1
#1  0xbb78a066 in __gf_free (free_ptr=0xb9a510f0) at mem-pool.c:303
303             LOCK (&xl->mem_acct.rec[header->type].lock);
(gdb) print *xl
$4 = {name = 0xadc0de00 <error: Cannot access memory at address 0xadc0de00>, 
  type = 0x7e3cc0de <error: Cannot access memory at address 0x7e3cc0de>,
(and so on)
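
To make the failure mode concrete, here is a minimal sketch of the pattern
that frame 1 suggests. It is not the GlusterFS code. It assumes that the xl
pointer used by __gf_free() is read from a header stored in front of the
allocation, so that freeing a buffer after its owner has been destroyed
dereferences stale memory. All names (struct owner, union header,
acct_alloc, acct_disown, acct_free, demo_teardown) are hypothetical. The
sketch shows one possible ordering: detach the buffer from its owner while
the owner is still valid, so the later free never touches the owner's lock.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an xlator_t that does memory accounting. */
struct owner {
    pthread_mutex_t lock;
    size_t live_allocs;
};

/* Header placed in front of every allocation. The union keeps the
 * payload that follows it suitably aligned. */
union header {
    struct owner *owner;   /* back-pointer, NULL once detached */
    max_align_t align;
};

/* Allocate size bytes and account them to owner o. */
static void *acct_alloc(struct owner *o, size_t size)
{
    union header *h = malloc(sizeof(*h) + size);
    if (h == NULL)
        return NULL;
    h->owner = o;
    pthread_mutex_lock(&o->lock);
    o->live_allocs++;
    pthread_mutex_unlock(&o->lock);
    return h + 1;
}

/* Drop the accounting while the owner is still valid, then clear the
 * back-pointer so that a later free cannot reach the owner. */
static void acct_disown(void *ptr)
{
    union header *h = (union header *)ptr - 1;
    struct owner *o = h->owner;
    if (o == NULL)
        return;
    pthread_mutex_lock(&o->lock);
    o->live_allocs--;
    pthread_mutex_unlock(&o->lock);
    h->owner = NULL;
}

/* Free an allocation. If it is still attached, the owner's lock is
 * taken here, which is the step that fails when the owner is gone. */
static void acct_free(void *ptr)
{
    union header *h = (union header *)ptr - 1;
    acct_disown(ptr);
    free(h);
}

/* Tear down in a safe order. Returns the number of allocations still
 * accounted to the owner just before it is destroyed. */
static size_t demo_teardown(void)
{
    struct owner *o = malloc(sizeof(*o));
    if (o == NULL)
        return (size_t)-1;
    pthread_mutex_init(&o->lock, NULL);
    o->live_allocs = 0;

    char *log_buf = acct_alloc(o, 64);
    if (log_buf == NULL) {
        pthread_mutex_destroy(&o->lock);
        free(o);
        return (size_t)-1;
    }

    acct_disown(log_buf);              /* detach before the owner dies */
    size_t remaining = o->live_allocs;

    pthread_mutex_destroy(&o->lock);
    free(o);

    acct_free(log_buf);                /* safe: header no longer points at o */
    return remaining;
}
```

Calling demo_teardown() exercises the safe ordering. Without the
acct_disown() call before the owner is freed, acct_free() would lock a
mutex inside freed memory, which is the same kind of access that fails at
mem-pool.c:303 above. Whether detaching, or flushing the log buffers before
the xlator is torn down, is the right fix for GlusterFS is left open here.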

This problem is a consequence of another bug that prevents glusterfsd from
starting. I will focus on that root bug and leave this crash on exit for
someone who is familiar enough with the code to see the obvious fix.

-- 
Emmanuel Dreyfus
manu at netbsd.org