[Bugs] [Bug 1422769] brick process crashes when glusterd is restarted

Thu Mar 9 13:51:48 UTC 2017

https://bugzilla.redhat.com/show_bug.cgi?id=1422769

--- Comment #8 from Jeff Darcy <jdarcy at redhat.com> ---
I still have no way to reproduce this, nor do I have access to the RPMs
associated with the core in the referenced sosreports, so my ability to debug
this is quite hampered.  However, I do see in the logs that all bricks
terminated with the same "Exhausted all volfile servers" message that was seen
(on clients) in bug 1422781.  This means that the brick daemons terminated with
glusterd, and had to be restarted when glusterd was.  The crash seems to be a
result of getting an attach request for a second brick before the first was
ready (that is, before ctx->active had been set).  This is highly reminiscent
of bug 1430138, which perhaps shouldn't be surprising since that was found
while testing the fix for 1422781.

The fix for 1422781 also affects servers, and should prevent the
terminate/restart that leads to this bug.  On the other hand, it also wouldn't
hurt to add a null check in glusterfs_handle_attach and/or
glusterfs_graph_attach, to reduce the "blast area" in other cases where an
attach request might be received before we're ready.
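
As a rough illustration of the kind of null check meant here, the following is
a minimal standalone sketch, not the actual glusterfs code.  The names
sketch_ctx, sketch_graph and sketch_handle_attach are hypothetical stand-ins;
the only detail taken from the discussion above is that ctx->active is unset
until the first brick is ready.  The choice of error codes is also an
assumption.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real glusterfs context and graph types. */
struct sketch_graph {
    int brick_count;              /* bricks attached to the active graph */
};

struct sketch_ctx {
    struct sketch_graph *active;  /* NULL until the first brick is ready */
};

/*
 * Handle an attach request for an additional brick.
 * Returns 0 on success, -EINVAL for a missing context, and -EAGAIN when the
 * first brick has not finished setting up (ctx->active is still NULL), so
 * the request is rejected instead of dereferencing a null pointer.
 */
static int sketch_handle_attach(struct sketch_ctx *ctx)
{
    if (ctx == NULL)
        return -EINVAL;

    if (ctx->active == NULL)
        return -EAGAIN;           /* too early: no active graph yet */

    ctx->active->brick_count++;   /* safe: active graph exists */
    return 0;
}
```

With a check like this, an early attach request fails cleanly and the caller
can retry or report the error, rather than taking down every brick in the
process.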
