Fri Sep 11 00:25:10 UTC 2020

Hi List,

In my 2-server gluster setup, one server is consistently restarting the 
glusterd process.  On the first second of every other minute, I get a 
shutdown in my glusterd log:

W [glusterfsd.c:1596:cleanup_and_exit] 
(-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x7fa3) [0x7f7410fa5fa3] 
-->/usr/local/sbin/glusterd(glusterfs_sigwaiter+0xed) [0x55e18d840b8d] 
-->/usr/local/sbin/glusterd(cleanup_and_exit+0x54) [0x55e18d8409e4] ) 
0-: received signum (15), shutting down

Gluster then automatically starts back up, everything remounts and by 
the 8th second log entries stop, and life carries on for another 1 
minute and 53 seconds, and then the shutdown message shows up again.  I 
can provide the full startup log if it will help.
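
To confirm the timing, here is a rough sketch of how the interval 
between shutdown messages could be measured from the log.  It assumes 
each log entry begins with a bracketed timestamp such as 
[2020-09-11 00:24:01.123456] (the excerpt above has that prefix trimmed 
off), and the function name shutdown_intervals is just something I made 
up for illustration:

```python
import re
from datetime import datetime

# Assumed prefix of each glusterd log entry: [YYYY-MM-DD HH:MM:SS.ffffff]
STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def shutdown_intervals(lines, marker="received signum (15)"):
    """Return the seconds between consecutive shutdown messages."""
    times = []
    last_stamp = None
    for line in lines:
        match = STAMP.match(line)
        if match:
            # Remember the most recent timestamp, in case the message
            # itself is wrapped onto a following line.
            last_stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if marker in line and last_stamp is not None:
            times.append(last_stamp)
    # Difference between each shutdown and the one before it.
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]
```

Feeding it the lines of the glusterd log (opened from wherever the log 
lives on the affected server) should return a list of values close to 
120 if the shutdowns really are on a strict two-minute schedule, which 
would point toward something timer-driven rather than random.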

I am experiencing a few hiccups in the server that could possibly be 
traced to this, but I am not sure of that, and generally speaking the 
server doesn't seem to be suffering from the restarts.

I have tried tailing all the other logs in conjunction with the glusterd 
log thinking something might show up just before that first second to 
give me a clue what is issuing the shutdown signal, but nothing shows up 
as the culprit.

I have tried setting the log level to DEBUG in the glusterd.service 
systemd file.  That didn't seem to work.  As per the Red Hat manual, I 
also tried setting it on the command line, and the startup message did 
say it started with the debug argument, but it didn't add anything to 
indicate what is causing it to shut down.

gstatus indicates everything is up and healthy, though it will take a 
couple seconds to run if I run it at the first second after any 
even-numbered minute.

I found an old thread on Google about the logind.conf file having 
KillUserProcesses=1.  It is commented out on both servers, and pretty 
much every other mention I found either has no solution or has the same 
error but doesn't match the symptoms otherwise.
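
For what it's worth, this is roughly how I checked that the setting is 
really inactive rather than just eyeballing the file.  It is only a 
sketch: the helper name active_settings is made up, and it looks at a 
single file's text, so it would not notice an override placed in a 
drop-in directory such as /etc/systemd/logind.conf.d/.

```python
def active_settings(conf_text):
    """Return {key: value} for the uncommented key=value lines."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and section headers like [Login].
        if not line or line.startswith(("#", ";", "[")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings
```

If "KillUserProcesses" is not a key in the result for the contents of 
logind.conf on either server, then the setting is not being applied 
from that file.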

Can anyone suggest how I might go about finding out why the one server 
is doing this?
