[Bugs] [Bug 1381970] GlusterFS Daemon stops working after a longer runtime and higher file workload due to design flaws ?

bugzilla at redhat.com bugzilla at redhat.com
Tue Nov 29 22:00:42 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1381970



--- Comment #3 from Giuseppe Ragusa <giuseppe.ragusa at hotmail.com> ---
Slight correction to comment #2 :

As the timestamps show, the crashes are not synchronous; in fact, I found the
following workaround:

A cron job on the arbiter node greps the output of "ctdb status" for an
UNHEALTHY message and, if one is found, stops and then restarts an unused
Gluster volume which has NFS enabled. (There is a CTDB cluster which assigns a
couple of IPs on the NFS network: all nodes are members, but the arbiter node
does not participate in IP sharing. CTDB detects NFS crashes by means of the
monitoring script tracked in BZ #1371178.)
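
For illustration only, the cron-driven check described above could look roughly
like the sketch below. This is not the actual job from my setup: the volume
name "nfs_dummy" and the function names are hypothetical, and it assumes that
"ctdb" and "gluster" are on the PATH and that the node state appears as the
literal word UNHEALTHY in the "ctdb status" output.

```python
import subprocess

# Hypothetical name of the unused, NFS-enabled Gluster volume.
VOLUME = "nfs_dummy"


def ctdb_is_unhealthy(status_output):
    """Return True if any line of the 'ctdb status' output mentions UNHEALTHY."""
    return "UNHEALTHY" in status_output


def restart_commands(volume):
    """Build the stop/start commands; --mode=script avoids the interactive prompt."""
    return [
        ["gluster", "--mode=script", "volume", "stop", volume],
        ["gluster", "--mode=script", "volume", "start", volume],
    ]


def check_and_restart(volume=VOLUME, run=subprocess.run):
    """Run 'ctdb status' and restart the volume if an UNHEALTHY node is reported.

    Returns True when a restart was attempted, False otherwise.
    """
    status = run(["ctdb", "status"], capture_output=True, text=True, check=False)
    if not ctdb_is_unhealthy(status.stdout):
        return False
    for cmd in restart_commands(volume):
        # Failures are ignored here; a real job would probably log them.
        run(cmd, check=False)
    return True
```

A cron entry on the arbiter node would then run a small wrapper that calls
check_and_restart() every few minutes. The "run" parameter exists only so that
the logic can be exercised without a live cluster.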

Note that generating a little load, such as recreating yum metadata on a
CentOS 7 OS mirror hosted on an NFS share, is an almost sure way of getting a
crash, sometimes even corrupting the repodata/.olddata directory (simply
recreating/removing those dirs is enough to restore sanity, apparently).
