[Bugs] [Bug 1633318] health check fails on restart from crash

bugzilla at redhat.com bugzilla at redhat.com
Mon Jun 17 04:18:54 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1633318



--- Comment #1 from Mohit Agrawal <moagrawa at redhat.com> ---
Hi,

As per the health check code, I don't think the existence of the health check
file (.glusterfs/health_check) could be the reason for the brick failure, but I
will try to reproduce it.

In the health check thread we always open the health_check file with
(O_CREAT|O_WRONLY|O_TRUNC, 0644), so even if the file is already present,
open() truncates its contents and the health check always writes the latest
timestamp to the health_check file.
The logs show the error being raised when the timestamp is compared with the
contents of the health_check file, which means the timestamp written to the
file somehow does not match the timestamp read back from it.
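
To illustrate the write-then-compare step described above, here is a minimal
sketch in Python. It is not the GlusterFS source (which is C); the function
names, the temporary directory and the stale file contents are assumptions
made for illustration. Only the open flags and mode are taken from the
comment above.

```python
import os
import tempfile
import time


def health_check_once(path: str) -> bool:
    """Write a fresh timestamp to `path`, read it back, and compare."""
    stamp = str(int(time.time())).encode()

    # Flags and mode as quoted above: create the file if it is missing,
    # truncate it if it already exists.
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
    try:
        os.write(fd, stamp)
    finally:
        os.close(fd)

    # Read the file back and compare with what was just written.
    with open(path, "rb") as f:
        return f.read() == stamp


def check_with_stale_file() -> bool:
    """Run one check against a file left over from an earlier run."""
    with tempfile.TemporaryDirectory() as brick_dir:  # stand-in for .glusterfs
        path = os.path.join(brick_dir, "health_check")
        with open(path, "wb") as f:
            f.write(b"stale contents left behind by a crashed process")
        return health_check_once(path)
```

Because O_TRUNC discards the old contents, check_with_stale_file() is expected
to succeed in this sketch with a single writer, which is why a leftover file
alone should not make the comparison fail.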

Are you sure the brick was actually stopped after the kill signal was sent? If
more than one instance is running, this type of scenario can arise.
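
A rough sketch of how a second instance could cause such a mismatch, written
in Python. This is a deterministic simulation, not GlusterFS code: the helper
names and the timestamp values are hypothetical, and the two "instances" are
just two sequential writes to the same file.

```python
import os
import tempfile


def write_stamp(path: str, stamp: bytes) -> None:
    """Truncate the file and write `stamp`, as each instance would."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
    try:
        os.write(fd, stamp)
    finally:
        os.close(fd)


def read_stamp(path: str) -> bytes:
    with open(path, "rb") as f:
        return f.read()


def simulate_two_instances() -> bool:
    """Return True if instance A reads back the timestamp it wrote."""
    with tempfile.TemporaryDirectory() as brick_dir:
        path = os.path.join(brick_dir, "health_check")
        stamp_a = b"1000000001"  # hypothetical value written by instance A
        stamp_b = b"1000000002"  # hypothetical value written by instance B

        write_stamp(path, stamp_a)  # instance A writes its timestamp
        write_stamp(path, stamp_b)  # instance B overwrites it before A reads

        # Instance A now reads B's timestamp and its comparison fails.
        return read_stamp(path) == stamp_a
```

In this simulation simulate_two_instances() returns False: A's comparison
fails only because another writer touched the same file in between.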

1) Please check the ps output to confirm that the brick was stopped completely.
2) If the brick was stopped completely, kindly share the volume configuration
and I will try to reproduce the issue.


Regards,
Mohit Agrawal


