[Bugs] [Bug 1671556] glusterfs FUSE client crashing every few days with 'Failed to dispatch handler'

Fri Feb 8 03:03:20 UTC 2019

https://bugzilla.redhat.com/show_bug.cgi?id=1671556

--- Comment #11 from Nithya Balachandran <nbalacha at redhat.com> ---
(In reply to David E. Smith from comment #10)
> Ran a couple of the glusterfs logs through the print-backtrace script. They
> all start with what you'd normally expect (clone, start_thread) and all end
> with (_gf_msg_backtrace_nomem) but they're all doing different things in the
> middle. It looks sorta like a memory leak or other memory corruption. Since
> it started happening on both of my servers after upgrading to 5.2 (and
> continued with 5.3), I really doubt it's a hardware issue -- the FUSE
> clients are both VMs, on hosts a few miles apart, so the odds of host RAM
> going wonky in both places at exactly that same time are ridiculous.
> 
> Bit of a stretch, but do you think there would be value in my rebuilding the
> RPMs locally, to try to rule out anything on CentOS' end?

I don't think so. My guess is there is an error somewhere in the client code
when handling inodes. It was never hit earlier because we never freed the
inodes before 5.3. With the new inode invalidation feature, we appear to be
accessing inodes that were already freed.

Did you see the same crashes in 5.2? If yes, something else might be going
wrong.

I had a look at the coredumps you sent. Strangely, most don't have any symbols.
Of the ones that do, it looks like memory corruption and access to
already-freed inodes. A few people are looking at it, but this is going to
take a while to figure out. In the meantime, let me know if you still see
crashes with the lru-limit option.
