<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Hi guys</p>
<p>Is anyone here using the GlusterFS OCF resource agents with
Pacemaker on CentOS 7?</p>
<p><tt>yum install centos-release-gluster</tt><tt><br>
</tt><tt>yum install glusterfs-server glusterfs-resource-agents</tt></p>
<p>The reason I ask is that there seem to be a few problems with
them on 3.10, but these problems are so severe that I'm struggling
to believe I'm not just doing something wrong.</p>
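<p>For reference, these are the commands I use to pin down exactly
which versions are in play:</p>
<p><tt>rpm -q glusterfs-server glusterfs-resource-agents pcs pacemaker</tt><br>
<tt>gluster --version | head -n1</tt></p>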
<p>I created my brick (on a logical volume previously used for DRBD,
hence its name):</p>
<p><tt>mkfs.xfs /dev/cl/lv_drbd -f</tt><tt><br>
</tt><tt>mkdir -p /gluster/test_brick</tt><tt><br>
</tt><tt>mount -t xfs /dev/cl/lv_drbd /gluster</tt><br>
</p>
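<p>(A quick sanity check that the brick filesystem looks right -
just my own verification step, nothing Gluster requires:)</p>
<p><tt>df -hT /gluster</tt><br>
<tt>xfs_info /gluster</tt></p>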
<p>And then my volume (enabling clients to mount it via NFS):</p>
<p><tt>systemctl start glusterd</tt><tt><br>
</tt><tt>gluster volume create test_logs replica 2 transport tcp pcmk01-drbd:/gluster/test_brick pcmk02-drbd:/gluster/test_brick</tt><tt><br>
</tt><tt>gluster volume start test_logs</tt><tt><br>
</tt><tt>gluster volume set test_logs nfs.disable off</tt><br>
</p>
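<p>At this point the volume reports healthy - roughly what I check
it against:</p>
<p><tt>gluster volume info test_logs</tt><br>
<tt>gluster volume status test_logs</tt></p>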
<p>And here's where the fun starts.</p>
<p>Firstly, we need to work around bug 1233344* (which was closed
when 3.7 went end-of-life but still seems valid in 3.10):</p>
<p><tt>sed -i
's#voldir="/etc/glusterd/vols/${OCF_RESKEY_volname}"#voldir="/var/lib/glusterd/vols/${OCF_RESKEY_volname}"#'
/usr/lib/ocf/resource.d/glusterfs/volume</tt><br>
</p>
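<p>(To confirm the workaround took effect - this should now show
the /var/lib/glusterd path:)</p>
<p><tt>grep -n 'voldir=' /usr/lib/ocf/resource.d/glusterfs/volume</tt></p>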
<p>With that done, I [attempt to] stop GlusterFS so it can be
brought under Pacemaker control:<br>
</p>
<p><tt>systemctl stop glusterfsd</tt><tt><br>
</tt><tt>systemctl stop glusterd</tt><tt><br>
</tt><tt>umount /gluster</tt></p>
<p>(I usually have to kill the remaining glusterfs processes by hand
at this point before the unmount works - why doesn't systemctl stop
take care of them?)</p>
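<p>(For the record, this is roughly what I end up doing by hand -
the pkill patterns are just my own guess at the right scope:)</p>
<p><tt>pgrep -af gluster</tt><br>
<tt>pkill glusterfs</tt><br>
<tt>pkill glusterfsd</tt><br>
<tt>umount /gluster</tt></p>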
<p>With the node in standby (just one is online in this example, but
another is configured), I then set up the resources:</p>
<p><tt>pcs node standby</tt><tt><br>
</tt><tt>pcs resource create gluster_data ocf:heartbeat:Filesystem
device="/dev/cl/lv_drbd" directory="/gluster" fstype="xfs"</tt><tt><br>
</tt><tt>pcs resource create glusterd ocf:glusterfs:glusterd</tt><tt><br>
</tt><tt>pcs resource create gluster_vol ocf:glusterfs:volume
volname="test_logs"</tt><tt><br>
</tt><tt>pcs resource create test_logs ocf:heartbeat:Filesystem \</tt><tt><br>
</tt><tt>    device="localhost:/test_logs" directory="/var/log/test" fstype="nfs" \</tt><tt><br>
</tt><tt>    options="vers=3,tcp,nolock,context=system_u:object_r:httpd_sys_content_t:s0" \</tt><tt><br>
</tt><tt>    op monitor OCF_CHECK_LEVEL="20"</tt><tt><br>
</tt><tt>pcs resource clone glusterd</tt><tt><br>
</tt><tt>pcs resource clone gluster_data</tt><tt><br>
</tt><tt>pcs resource clone gluster_vol ordered=true</tt><tt><br>
</tt><tt>pcs constraint order start gluster_data-clone then start
glusterd-clone</tt><tt><br>
</tt><tt>pcs constraint order start glusterd-clone then start
gluster_vol-clone</tt><tt><br>
</tt><tt>pcs constraint order start gluster_vol-clone then start
test_logs</tt><tt><br>
</tt><tt>pcs constraint colocation add test_logs with FloatingIp
INFINITY</tt><br>
</p>
<p>(note the SELinux wrangling - this is because I have a CGI web
application which will later need to read files from the <tt>/var/log/test</tt>
mount)</p>
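<p>For completeness, this is how I verify the constraint wiring and,
once the mount succeeds, the SELinux context:</p>
<p><tt>pcs constraint show</tt><br>
<tt>ls -Zd /var/log/test</tt></p>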
<p>At this point, even with the node in standby, it's <i>already</i>
failing:</p>
<p><tt>[root@pcmk01 ~]# pcs status</tt><tt><br>
</tt><tt>Cluster name: test_cluster</tt><tt><br>
</tt><tt>Stack: corosync</tt><tt><br>
</tt><tt>Current DC: pcmk01-cr (version 1.1.15-11.el7_3.5-e174ec8)
- partition WITHOUT quorum</tt><tt><br>
</tt><tt>Last updated: Thu Dec 7 13:20:41 2017 Last
change: Thu Dec 7 13:09:33 2017 by root via crm_attribute on
pcmk01-cr</tt><tt><br>
</tt><tt><br>
</tt><tt>2 nodes and 13 resources configured</tt><tt><br>
</tt><tt><br>
</tt><tt>Online: [ pcmk01-cr ]</tt><tt><br>
</tt><tt>OFFLINE: [ pcmk02-cr ]</tt><tt><br>
</tt><tt><br>
</tt><tt>Full list of resources:</tt><tt><br>
</tt><tt><br>
</tt><tt> FloatingIp (ocf::heartbeat:IPaddr2): Started
pcmk01-cr</tt><tt><br>
</tt><tt> test_logs (ocf::heartbeat:Filesystem): Stopped</tt><tt><br>
</tt><tt> Clone Set: glusterd-clone [glusterd]</tt><tt><br>
</tt><tt> Stopped: [ pcmk01-cr pcmk02-cr ]</tt><tt><br>
</tt><tt> Clone Set: gluster_data-clone [gluster_data]</tt><tt><br>
</tt><tt> Stopped: [ pcmk01-cr pcmk02-cr ]</tt><tt><br>
</tt><tt> Clone Set: gluster_vol-clone [gluster_vol]</tt><tt><br>
</tt><tt> gluster_vol (ocf::glusterfs:volume):
FAILED pcmk01-cr (blocked)</tt><tt><br>
</tt><tt> Stopped: [ pcmk02-cr ]</tt><tt><br>
</tt><tt><br>
</tt><tt>Failed Actions:</tt><tt><br>
</tt><tt>* gluster_data_start_0 on pcmk01-cr 'not configured' (6):
call=72, status=complete, exitreason='DANGER! xfs on
/dev/cl/lv_drbd is NOT cluster-aware!',</tt><tt><br>
</tt><tt> last-rc-change='Thu Dec 7 13:09:28 2017',
queued=0ms, exec=250ms</tt><tt><br>
</tt><tt>* gluster_vol_stop_0 on pcmk01-cr 'unknown error' (1):
call=60, status=Timed Out, exitreason='none',</tt><tt><br>
</tt><tt> last-rc-change='Thu Dec 7 12:55:11 2017',
queued=0ms, exec=20004ms</tt><tt><br>
</tt><tt><br>
</tt><tt><br>
</tt><tt>Daemon Status:</tt><tt><br>
</tt><tt> corosync: active/enabled</tt><tt><br>
</tt><tt> pacemaker: active/enabled</tt><tt><br>
</tt><tt> pcsd: active/enabled</tt><br>
<br>
</p>
<p>1. Why can't the data filesystem be mounted? The xfs brick mounts
fine by hand, so why does the agent reject it as "NOT
cluster-aware"?<br>
2. Why is a volume "stop" operation being attempted at all, and why
does it time out?<br>
3. Why is any of this happening while the node is in standby? I
can't have the resources failing before I've even made the node
live! I could understand a gluster_vol *start* operation failing
while glusterd is (correctly) stopped, but why is there a *stop*
operation, and why does its failure leave the resource "blocked"?<br>
</p>
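<p>I haven't got to the bottom of it yet; if it helps anyone
reproduce this, the failing operations can also be run by hand,
outside cluster control, with pcs's debug commands (resource names
as defined above):</p>
<p><tt>pcs resource debug-start gluster_vol --full</tt><br>
<tt>pcs resource debug-stop gluster_vol --full</tt></p>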
<p>Given the above steps, is there something fundamental I'm missing
about how these resource agents should be used? How do *you*
configure GlusterFS on Pacemaker?</p>
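<p>(For anyone who wants to poke at the agent itself outside
Pacemaker, it can be invoked directly - assuming the usual OCF
calling convention, where exit code 7 means "not running":)</p>
<p><tt>OCF_ROOT=/usr/lib/ocf OCF_RESKEY_volname=test_logs \</tt><br>
<tt>  /usr/lib/ocf/resource.d/glusterfs/volume monitor; echo $?</tt></p>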
<p>Any advice appreciated.<br>
</p>
<p>Best regards<br>
</p>
<p><br>
</p>
<p>* <a class="moz-txt-link-freetext" href="https://bugzilla.redhat.com/show_bug.cgi?id=1233344">https://bugzilla.redhat.com/show_bug.cgi?id=1233344</a></p>
</body>
</html>