[Gluster-devel] GF_PARENT_DOWN on SIGKILL

Fri Jul 22 13:13:15 UTC 2016

http://review.gluster.org/14980 has all the context for why I sent out
this mail. Basically, the tests were failing because umount races with the
version-update xattrop. While I fixed the test to handle that race, Xavi
wondered why the GF_PARENT_DOWN event never arrived. I found that
cleanup_and_exit() does not send this event; it only calls fini(). Does
anyone know why this is so?

On Fri, Jul 22, 2016 at 6:37 PM, Pranith Kumar Karampuri <
pkarampu at redhat.com> wrote:

> It only calls fini(); apart from that, it does not do much.
>
> On Fri, Jul 22, 2016 at 6:36 PM, Pranith Kumar Karampuri <
> pkarampu at redhat.com> wrote:
>
>> Gah! Sorry, sorry, I meant to write SIGTERM in that mail, not SIGKILL.
>> So Xavi and I were wondering why cleanup_and_exit() is not sending the
>> GF_PARENT_DOWN event.
>>
>> On Fri, Jul 22, 2016 at 6:24 PM, Jeff Darcy <jdarcy at redhat.com> wrote:
>>
>>> > Does anyone know why GF_PARENT_DOWN is not triggered on SIGKILL? It
>>> > will give a chance for xlators to do any cleanup they need to do. For
>>> > example ec can complete the delayed xattrops.
>>>
>>> Nothing is triggered on SIGKILL.  SIGKILL is explicitly defined to
>>> terminate a process *immediately*.  Among other things, this means it
>>> can not be ignored or caught, to preclude handlers doing something that
>>> might delay termination.
>>>
>>>
>>> http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04
>>>
>>> Since at least 4.2BSD and SVr2 (the first version of UNIX that I
>>> worked on) there have even been distinct kernel code paths to ensure
>>> special handling of SIGKILL.  There's nothing we can do about SIGKILL
>>> except be prepared to deal with it the same way we'd deal with the
>>> entire machine crashing.
>>>
>>> If you mean why is there nothing we can do on a *server* in response to
>>> SIGKILL on a *client*, that's a slightly more interesting question.  It's
>>> possible that the unique nature of SIGKILL puts connections into a
>>> different state than either system failure (on the more abrupt side) or
>>> clean shutdown (less abrupt).  If so, we probably need to take a look at
>>> the socket/RPC code or perhaps even protocol/server to see why these
>>> connections are not being cleaned up and shut down in a timely fashion.
>>>
>>
>>
>>
>> --
>> Pranith
>>
>
>
>
> --
> Pranith
>

-- 
Pranith