[Bugs] [Bug 1671556] glusterfs FUSE client crashing every few days with 'Failed to dispatch handler'

bugzilla at redhat.com bugzilla at redhat.com
Thu Feb 7 17:58:26 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1671556



--- Comment #10 from David E. Smith <desmith at wustl.edu> ---
Ran a couple of the glusterfs logs through the print-backtrace script. They all
start with what you'd normally expect (clone, start_thread) and all end with
(_gf_msg_backtrace_nomem) but they're all doing different things in the middle.
It looks sorta like a memory leak or other memory corruption. Since it started
happening on both of my servers after upgrading to 5.2 (and continued with
5.3), I really doubt it's a hardware issue -- the FUSE clients are both VMs, on
hosts a few miles apart, so the odds of host RAM going wonky in both places at
exactly that same time are ridiculous.

Bit of a stretch, but do you think there would be value in my rebuilding the
RPMs locally, to try to rule out anything on CentOS' end?


/lib64/libglusterfs.so.0(+0x26610)[0x7fa1809d8610] _gf_msg_backtrace_nomem ??:0
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7fa1809e2b84] gf_print_trace
??:0
/lib64/libc.so.6(+0x36280)[0x7fa17f03c280] __restore_rt ??:0
/lib64/libglusterfs.so.0(+0x3586d)[0x7fa1809e786d] __inode_ctx_free ??:0
/lib64/libglusterfs.so.0(+0x370a2)[0x7fa1809e90a2] inode_table_prune ??:0
/lib64/libglusterfs.so.0(inode_forget_with_unref+0x46)[0x7fa1809e9f96]
inode_forget_with_unref ??:0
/usr/lib64/glusterfs/5.3/xlator/mount/fuse.so(+0x85bd)[0x7fa177dae5bd]
fuse_forget ??:0
/usr/lib64/glusterfs/5.3/xlator/mount/fuse.so(+0x1fd7a)[0x7fa177dc5d7a]
fuse_thread_proc ??:0
/lib64/libpthread.so.0(+0x7dd5)[0x7fa17f83bdd5] start_thread ??:0
/lib64/libc.so.6(clone+0x6d)[0x7fa17f103ead] __clone ??:0


/lib64/libglusterfs.so.0(+0x26610)[0x7f36aff72610] _gf_msg_backtrace_nomem ??:0
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f36aff7cb84] gf_print_trace
??:0
/lib64/libc.so.6(+0x36280)[0x7f36ae5d6280] __restore_rt ??:0
/lib64/libglusterfs.so.0(+0x36779)[0x7f36aff82779] __inode_unref ??:0
/lib64/libglusterfs.so.0(inode_unref+0x23)[0x7f36aff83203] inode_unref ??:0
/lib64/libglusterfs.so.0(gf_dirent_entry_free+0x2b)[0x7f36aff9ec4b]
gf_dirent_entry_free ??:0
/lib64/libglusterfs.so.0(gf_dirent_free+0x2b)[0x7f36aff9ecab] gf_dirent_free
??:0
/usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so(+0x7480)[0x7f36a215b480]
afr_readdir_cbk ??:0
/usr/lib64/glusterfs/5.3/xlator/protocol/client.so(+0x60bca)[0x7f36a244dbca]
client4_0_readdirp_cbk ??:0
/lib64/libgfrpc.so.0(+0xec70)[0x7f36afd3ec70] rpc_clnt_handle_reply ??:0
/lib64/libgfrpc.so.0(+0xf043)[0x7f36afd3f043] rpc_clnt_notify ??:0
/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7f36afd3af23]
rpc_transport_notify ??:0
/usr/lib64/glusterfs/5.3/rpc-transport/socket.so(+0xa37b)[0x7f36a492737b]
socket_event_handler ??:0
/lib64/libglusterfs.so.0(+0x8aa49)[0x7f36affd6a49] event_dispatch_epoll_worker
??:0
/lib64/libpthread.so.0(+0x7dd5)[0x7f36aedd5dd5] start_thread ??:0
/lib64/libc.so.6(clone+0x6d)[0x7f36ae69dead] __clone ??:0


/lib64/libglusterfs.so.0(+0x26610)[0x7f7e13de0610] _gf_msg_backtrace_nomem ??:0
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f7e13deab84] gf_print_trace
??:0
/lib64/libc.so.6(+0x36280)[0x7f7e12444280] __restore_rt ??:0
/lib64/libpthread.so.0(pthread_mutex_lock+0x0)[0x7f7e12c45c30]
pthread_mutex_lock ??:0
/lib64/libglusterfs.so.0(__gf_free+0x12c)[0x7f7e13e0bc3c] __gf_free ??:0
/lib64/libglusterfs.so.0(+0x368ed)[0x7f7e13df08ed] __dentry_unset ??:0
/lib64/libglusterfs.so.0(+0x36b2b)[0x7f7e13df0b2b] __inode_retire ??:0
/lib64/libglusterfs.so.0(+0x36885)[0x7f7e13df0885] __inode_unref ??:0
/lib64/libglusterfs.so.0(inode_forget_with_unref+0x36)[0x7f7e13df1f86]
inode_forget_with_unref ??:0
/usr/lib64/glusterfs/5.3/xlator/mount/fuse.so(+0x857a)[0x7f7e0b1b657a]
fuse_batch_forget ??:0
/usr/lib64/glusterfs/5.3/xlator/mount/fuse.so(+0x1fd7a)[0x7f7e0b1cdd7a]
fuse_thread_proc ??:0
/lib64/libpthread.so.0(+0x7dd5)[0x7f7e12c43dd5] start_thread ??:0
/lib64/libc.so.6(clone+0x6d)[0x7f7e1250bead] __clone ??:0

-- 
You are receiving this mail because:
You are on the CC list for the bug.


More information about the Bugs mailing list