[Gluster-users] Gluster 3.7.17 distributed-replicated volume experiences almost regular Gluster internal NFS subprocess crash (CentOS 7.2)
Giuseppe Ragusa
giuseppe.ragusa at hotmail.com
Tue Nov 29 22:36:29 UTC 2016
Hi all,
I'm writing to kindly ask for help with the issue in the subject line above, which is documented in:
https://bugzilla.redhat.com/show_bug.cgi?id=1381970
Brief recap:
A 3-node cluster serving distributed-replicated volumes (replica with arbiter, with the arbiter bricks for all volumes confined to the same dedicated node) experiences regular NFS crashes on at least one non-arbiter node at a time; both non-arbiter nodes crash if given enough time without applying the workaround described below. There are no Gluster native clients, only NFS ones, all on a dedicated network.
Simply restarting an NFS-enabled volume restarts the NFS services on all (non-arbiter) nodes for all volumes, and all seems well until the next crash (crashes happen many times a day under our normal workload).
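For clarity, the workaround amounts to a stop/start cycle of one volume. A minimal sketch follows, assuming the standard gluster CLI is on the PATH and is run as root on one of the nodes. The volume name "myvol" and the helper names are placeholders for illustration, not taken from our setup, and note that stopping a volume interrupts its clients.

```python
import subprocess


def restart_commands(volume):
    """Return the gluster CLI invocations for a stop/start cycle of one volume."""
    # --mode=script suppresses the interactive confirmation on "volume stop".
    base = ["gluster", "--mode=script", "volume"]
    return [
        base + ["stop", volume],
        base + ["start", volume],
    ]


def restart_volume(volume, dry_run=True):
    """Print the commands (dry run) or execute them, aborting on the first failure."""
    for argv in restart_commands(volume):
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # "myvol" is a hypothetical volume name; dry_run=True only prints the commands.
    restart_volume("myvol")
```

The dry-run default is deliberate, so that the commands can be reviewed before anything is executed on a production cluster.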
An almost sure way of making NFS crash immediately is recreating the yum metadata directory on a CentOS 7 OS mirror repo hosted on an NFS-enabled volume.
Since this is a production cluster and we had to disable various cron jobs that were regularly crashing Gluster's internal NFS server (no NFS-Ganesha in use here), I am almost ready to accept even an upgrade to 3.8.x as a solution. I dare say so because I've seen various fixes in Gerrit that were not being backported to 3.7; for one of them I even filed a Bugzilla report, cloning the 3.8 bug and kindly asking for a backport, given that the patch applied cleanly. This raises the question: is the backporting of patches to 3.7 being phased out unless explicitly requested?
The only caveat could be that the cluster is a hyperconverged setup with oVirt 3.6.7 (but the oVirt part, with its dedicated Gluster volumes, is working flawlessly and is absolutely not being used to manage Gluster, only to monitor it), so I would need to check for 3.8 compatibility before upgrading.
Many thanks in advance to anyone who can offer any advice on this issue.
Best regards,
Giuseppe