[Bugs] [Bug 1671556] glusterfs FUSE client crashing every few days with 'Failed to dispatch handler'

bugzilla at redhat.com bugzilla at redhat.com
Wed Feb 6 07:23:49 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1671556



--- Comment #8 from Nithya Balachandran <nbalacha at redhat.com> ---
Initial analysis of one of the cores:

[root@rhgs313-7 gluster-5.3]# gdb -c core.6014 /usr/sbin/glusterfs
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterfs --direct-io-mode=disable --fuse-mountopts=noatime,context="'.

Program terminated with signal 11, Segmentation fault.
#0  __inode_ctx_free (inode=inode@entry=0x7fa0d0349af8) at inode.c:410
410                 if (!xl->call_cleanup && xl->cbks->forget)

(gdb) bt
#0  __inode_ctx_free (inode=inode@entry=0x7fa0d0349af8) at inode.c:410
#1  0x00007fa1809e90a2 in __inode_destroy (inode=0x7fa0d0349af8) at inode.c:432
#2  inode_table_prune (table=table@entry=0x7fa15800c3c0) at inode.c:1696
#3  0x00007fa1809e9f96 in inode_forget_with_unref (inode=0x7fa0d0349af8, nlookup=128) at inode.c:1273
#4  0x00007fa177dae4e1 in do_forget (this=<optimized out>, unique=<optimized out>, nodeid=<optimized out>, nlookup=<optimized out>) at fuse-bridge.c:726
#5  0x00007fa177dae5bd in fuse_forget (this=<optimized out>, finh=0x7fa0a41da500, msg=<optimized out>, iobuf=<optimized out>) at fuse-bridge.c:741
#6  0x00007fa177dc5d7a in fuse_thread_proc (data=0x557a0e8ffe20) at fuse-bridge.c:5125
#7  0x00007fa17f83bdd5 in start_thread () from /lib64/libpthread.so.0
#8  0x00007fa17f103ead in msync () from /lib64/libc.so.6
#9  0x0000000000000000 in ?? ()
(gdb) f 0
#0  __inode_ctx_free (inode=inode@entry=0x7fa0d0349af8) at inode.c:410
410                 if (!xl->call_cleanup && xl->cbks->forget)
(gdb) l
405         for (index = 0; index < inode->table->xl->graph->xl_count; index++) {
406             if (inode->_ctx[index].value1 || inode->_ctx[index].value2) {
407                 xl = (xlator_t *)(long)inode->_ctx[index].xl_key;
408                 old_THIS = THIS;
409                 THIS = xl;
410                 if (!xl->call_cleanup && xl->cbks->forget)
411                     xl->cbks->forget(xl, inode);
412                 THIS = old_THIS;
413             }
414         }
(gdb) p *xl
Cannot access memory at address 0x0

(gdb) p index
$1 = 6

(gdb) p inode->table->xl->graph->xl_count
$3 = 13
(gdb) p inode->_ctx[index].value1
$4 = 0
(gdb) p inode->_ctx[index].value2
$5 = 140327960119304
(gdb) p/x inode->_ctx[index].value2
$6 = 0x7fa0a6370808


Based on the graph, the xlator with index = 6 is
(gdb) p ((xlator_t*) inode->table->xl->graph->top)->next->next->next->next->next->next->next->name
$31 = 0x7fa16c0122e0 "web-content-read-ahead"
(gdb) p ((xlator_t*) inode->table->xl->graph->top)->next->next->next->next->next->next->next->xl_id
$32 = 6

But read-ahead does not update the inode_ctx at all, so this slot should never
have been populated on its behalf. There appears to be some sort of memory
corruption happening here, but that needs further analysis.
