[Gluster-devel] gNFS service management from glusterd

Fri Feb 23 21:04:49 UTC 2018

On Wed, Feb 21, 2018 at 08:25:21PM +0530, Atin Mukherjee wrote:
> On Wed, Feb 21, 2018 at 4:24 PM, Xavi Hernandez <jahernan at redhat.com> wrote:
> 
> > Hi all,
> >
> > Currently glusterd sends a SIGKILL to stop gNFS, while all other services
> > are stopped with a SIGTERM signal first (this can be seen in the
> > glusterd_svc_stop() function of the mgmt/glusterd xlator).
> >
> 
> > The question is why it cannot be stopped with SIGTERM like all other
> > services. Using SIGKILL blindly while write I/O is in progress can cause
> > multiple inconsistencies at the same time. For a replicated volume this is
> > not a problem, because one of the replicas will be taken as the "good" one
> > and operation continues, but for a disperse volume, if the number of
> > inconsistencies is greater than the redundancy value, a serious problem
> > could appear.
> >
> > The probability of this is very small (I tried to reproduce the problem
> > on my laptop without success), but it exists.
> >
> > Is there any known issue that prevents gNFS from being stopped with
> > SIGTERM? Or can it be changed safely?
> >
> 
> I firmly believe that we need to send SIGTERM, as that's the right way to
> gracefully shut down a running process, but I'd ask the NFS folks to
> confirm whether there's any background on why it was done with SIGKILL.

No background about this is known to me. I had a quick look through the
git logs, but could not find an explanation.

I agree that SIGTERM would be more appropriate.
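
For illustration only, the "SIGTERM first, SIGKILL as a fallback" pattern
discussed above could be sketched roughly as below. This is not the glusterd
code (glusterd_svc_stop() is C and is not reproduced here); the function name
stop_gracefully() and the grace period are assumptions made up for the
example.

```python
import subprocess


def stop_gracefully(proc, grace_seconds=5.0):
    """Ask a child process to stop with SIGTERM, escalating to SIGKILL.

    Hypothetical helper for illustration; not part of glusterd.
    Returns the name of the last signal that had to be sent.
    """
    # Popen.terminate() sends SIGTERM on POSIX, so the process gets a
    # chance to flush pending writes and exit cleanly.
    proc.terminate()
    try:
        proc.wait(timeout=grace_seconds)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        # The process did not exit within the grace period, so fall back
        # to SIGKILL, which cannot be caught or ignored.
        proc.kill()
        proc.wait()
        return "SIGKILL"
```

With this ordering, SIGKILL is only used when the process does not react to
SIGTERM in time, which keeps the window for interrupted writes as small as
the service's own shutdown handling allows.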

Niels