<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Hi Jiffin<br>
<br>
Pacemaker clusters allow us to effectively distribute services
across multiple computers.<br>
In my case, I am creating an active-passive cluster for my
software, and my software relies on Apache, MySQL and GlusterFS.
Thus, I want GlusterFS to be controlled by Pacemaker so that:<br>
<br>
1. A node can be deemed "bad" if GlusterFS is not running (using
constraints to prohibit failover to a bad node)<br>
2. The GlusterFS volume can be automatically mounted on whichever
node is active<br>
3. Services all go into standby together (a rough sketch of the
intended wiring is below)<br>
<br>
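<p>For concreteness, this is roughly the constraint layout I have in
mind - only a sketch, and the Apache/MySQL resource names below are
placeholders rather than my real configuration:</p>
<p><tt># keep the floating IP (and everything tied to it) off any<br>
# node where the GlusterFS volume clone is not running<br>
pcs constraint colocation add FloatingIp with gluster_vol-clone INFINITY<br>
# mount the volume and run the stack wherever the IP lives<br>
pcs constraint colocation add test_logs with FloatingIp INFINITY<br>
pcs constraint order start gluster_vol-clone then start test_logs<br>
# 'apache_rsc' and 'mysql_rsc' are placeholder names<br>
pcs constraint colocation add apache_rsc with FloatingIp INFINITY<br>
pcs constraint colocation add mysql_rsc with FloatingIp INFINITY<br>
pcs constraint order start test_logs then start apache_rsc</tt></p>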
Is this not the recommended approach? What else should I do?<br>
<br>
Thanks<br>
<br>
<br>
On 08/12/2017 10:17, Jiffin Tony Thottan wrote:<br>
</div>
<blockquote type="cite"
cite="mid:0c51ea85-3467-3bff-3939-c8d2c1e648cc@redhat.com">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<p>Hi,</p>
<p>Can you please explain what the Pacemaker cluster is used for
here?</p>
<p>Regards,</p>
<p>Jiffin<br>
</p>
<br>
<div class="moz-cite-prefix">On Thursday 07 December 2017 06:59
PM, Tomalak Geret'kal wrote:<br>
</div>
<blockquote type="cite"
cite="mid:21ae37a9-3a70-d557-7dfb-2a6b012265e2@kera.name">
<meta http-equiv="content-type" content="text/html;
charset=utf-8">
<p>Hi guys</p>
<p>I'm wondering if anyone here is using the GlusterFS OCF
resource agents with Pacemaker on CentOS 7?</p>
<p><tt>yum install centos-release-gluster</tt><tt><br>
</tt><tt>yum install glusterfs-server
glusterfs-resource-agents</tt></p>
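<p>(The agents in question are the ones that land under
/usr/lib/ocf/resource.d/glusterfs/; I check that they are visible to
Pacemaker with something like the following.)</p>
<p><tt>ls /usr/lib/ocf/resource.d/glusterfs/<br>
pcs resource list ocf:glusterfs</tt></p>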
<p>The reason I ask is that there seem to be a few problems with
them on 3.10 - problems severe enough that I'm struggling to
believe I'm not just doing something wrong.</p>
<p>I created my brick (on a volume previously used for DRBD,
thus its name):</p>
<p><tt>mkfs.xfs /dev/cl/lv_drbd -f</tt><tt><br>
</tt><tt>mkdir -p /gluster/test_brick</tt><tt><br>
</tt><tt>mount -t xfs /dev/cl/lv_drbd /gluster</tt><br>
</p>
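<p>(For completeness: before going any further I confirm the brick
filesystem is mounted the way I expect, with something like the
following.)</p>
<p><tt>df -hT /gluster<br>
xfs_info /gluster</tt></p>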
<p>And then my volume (enabling clients to mount it via NFS):</p>
<p><tt>systemctl start glusterd</tt><tt><br>
</tt><tt>gluster volume create test_logs replica 2 transport tcp
pcmk01-drbd:/gluster/test_brick pcmk02-drbd:/gluster/test_brick</tt><tt><br>
</tt><tt>gluster volume start test_logs</tt><tt><br>
</tt><tt>gluster volume set test_logs nfs.disable off</tt><br>
</p>
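<p>Before handing anything over to Pacemaker, I sanity-check the
volume and its NFS export along these lines (assuming nfs-utils is
installed for showmount):</p>
<p><tt>gluster volume info test_logs<br>
gluster volume status test_logs<br>
showmount -e localhost</tt></p>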
<p>And here's where the fun starts.</p>
<p>Firstly, we need to work around bug 1233344* (which was
closed when 3.7 went end-of-life but still seems valid in
3.10):</p>
<p><tt>sed -i
's#voldir="/etc/glusterd/vols/${OCF_RESKEY_volname}"#voldir="/var/lib/glusterd/vols/${OCF_RESKEY_volname}"#'
/usr/lib/ocf/resource.d/glusterfs/volume</tt><br>
</p>
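<p>(A quick grep afterwards confirms the substitution took:)</p>
<p><tt>grep voldir= /usr/lib/ocf/resource.d/glusterfs/volume</tt></p>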
<p>With that done, I [attempt to] stop GlusterFS so it can be
brought under Pacemaker control:<br>
</p>
<p><tt>systemctl stop glusterfsd</tt><tt><br>
</tt><tt>systemctl stop glusterd</tt><tt><br>
</tt><tt>umount /gluster</tt></p>
<p>(I usually have to kill the remaining glusterfs processes by hand
at this point before the unmount works - why doesn't the systemctl
stop take care of that?)</p>
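<p>For the record, the manual cleanup I end up doing looks roughly
like this (just what works for me, not necessarily the right way):</p>
<p><tt># check for leftover brick/client processes<br>
pgrep -af gluster<br>
# kill anything still holding the brick mount<br>
pkill glusterfs<br>
pkill glusterfsd<br>
umount /gluster</tt></p>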
<p>With the node in standby (just one is online in this example,
but another is configured), I then set up the resources:</p>
<p><tt>pcs node standby</tt><tt><br>
</tt><tt>pcs resource create gluster_data
ocf:heartbeat:Filesystem device="/dev/cl/lv_drbd"
directory="/gluster" fstype="xfs"</tt><tt><br>
</tt><tt>pcs resource create glusterd ocf:glusterfs:glusterd</tt><tt><br>
</tt><tt>pcs resource create gluster_vol ocf:glusterfs:volume
volname="test_logs"</tt><tt><br>
</tt><tt>pcs resource create test_logs
ocf:heartbeat:Filesystem \</tt><tt><br>
</tt><tt> device="localhost:/test_logs"
directory="/var/log/test" fstype="nfs" \</tt><tt><br>
</tt><tt>
options="vers=3,tcp,nolock,context=system_u:object_r:httpd_sys_content_t:s0"
\</tt><tt><br>
</tt><tt> op monitor OCF_CHECK_LEVEL="20"</tt><tt><br>
</tt><tt>pcs resource clone glusterd</tt><tt><br>
</tt><tt>pcs resource clone gluster_data</tt><tt><br>
</tt><tt>pcs resource clone gluster_vol ordered=true</tt><tt><br>
</tt><tt>pcs constraint order start gluster_data-clone then
start glusterd-clone</tt><tt><br>
</tt><tt>pcs constraint order start glusterd-clone then start
gluster_vol-clone</tt><tt><br>
</tt><tt>pcs constraint order start gluster_vol-clone then
start test_logs</tt><tt><br>
</tt><tt>pcs constraint colocation add test_logs with
FloatingIp INFINITY</tt><br>
</p>
<p>(note the SELinux wrangling - this is because I have a CGI
web application which will later need to read files from the <tt>/var/log/test</tt>
mount)</p>
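<p>Once that's in place, I double-check what Pacemaker thinks it has
and (when the mount is up) whether the SELinux context actually
stuck - roughly:</p>
<p><tt>pcs constraint show --full<br>
pcs resource show gluster_vol<br>
ls -Zd /var/log/test</tt></p>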
<p>At this point, even with the node in standby, it's <i>already</i>
failing:</p>
<p><tt>[root@pcmk01 ~]# pcs status</tt><tt><br>
</tt><tt>Cluster name: test_cluster</tt><tt><br>
</tt><tt>Stack: corosync</tt><tt><br>
</tt><tt>Current DC: pcmk01-cr (version
1.1.15-11.el7_3.5-e174ec8) - partition WITHOUT quorum</tt><tt><br>
</tt><tt>Last updated: Thu Dec 7 13:20:41 2017 Last
change: Thu Dec 7 13:09:33 2017 by root via crm_attribute
on pcmk01-cr</tt><tt><br>
</tt><tt><br>
</tt><tt>2 nodes and 13 resources configured</tt><tt><br>
</tt><tt><br>
</tt><tt>Online: [ pcmk01-cr ]</tt><tt><br>
</tt><tt>OFFLINE: [ pcmk02-cr ]</tt><tt><br>
</tt><tt><br>
</tt><tt>Full list of resources:</tt><tt><br>
</tt><tt><br>
</tt><tt> FloatingIp (ocf::heartbeat:IPaddr2):
Started pcmk01-cr</tt><tt><br>
</tt><tt> test_logs (ocf::heartbeat:Filesystem):
Stopped</tt><tt><br>
</tt><tt> Clone Set: glusterd-clone [glusterd]</tt><tt><br>
</tt><tt> Stopped: [ pcmk01-cr pcmk02-cr ]</tt><tt><br>
</tt><tt> Clone Set: gluster_data-clone [gluster_data]</tt><tt><br>
</tt><tt> Stopped: [ pcmk01-cr pcmk02-cr ]</tt><tt><br>
</tt><tt> Clone Set: gluster_vol-clone [gluster_vol]</tt><tt><br>
</tt><tt> gluster_vol
(ocf::glusterfs:volume): FAILED pcmk01-cr (blocked)</tt><tt><br>
</tt><tt> Stopped: [ pcmk02-cr ]</tt><tt><br>
</tt><tt><br>
</tt><tt>Failed Actions:</tt><tt><br>
</tt><tt>* gluster_data_start_0 on pcmk01-cr 'not configured'
(6): call=72, status=complete, exitreason='DANGER! xfs on
/dev/cl/lv_drbd is NOT cluster-aware!',</tt><tt><br>
</tt><tt> last-rc-change='Thu Dec 7 13:09:28 2017',
queued=0ms, exec=250ms</tt><tt><br>
</tt><tt>* gluster_vol_stop_0 on pcmk01-cr 'unknown error'
(1): call=60, status=Timed Out, exitreason='none',</tt><tt><br>
</tt><tt> last-rc-change='Thu Dec 7 12:55:11 2017',
queued=0ms, exec=20004ms</tt><tt><br>
</tt><tt><br>
</tt><tt><br>
</tt><tt>Daemon Status:</tt><tt><br>
</tt><tt> corosync: active/enabled</tt><tt><br>
</tt><tt> pacemaker: active/enabled</tt><tt><br>
</tt><tt> pcsd: active/enabled</tt><br>
<br>
</p>
<p>1. The data mount can't be created? Why?<br>
2. Why is there a volume "stop" command being attempted, and
why does it fail?<br>
3. Why is any of this happening in standby? I can't have the
resources failing before I've even made the node live! I could
understand why a gluster_vol start operation would fail when
glusterd is (correctly) stopped, but why is there a *stop*
operation? And why does that make the resource "blocked"?<br>
</p>
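<p>If it helps anyone dig in, I can re-run the failing operations by
hand and post the output - for example (just the commands I'd reach
for, nothing clever):</p>
<p><tt>pcs resource debug-start gluster_data --full<br>
pcs resource debug-start gluster_vol --full<br>
journalctl -u pacemaker | grep -i gluster</tt></p>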
<p>Given the above steps, is there something fundamental I'm
missing about how these resource agents should be used? How do
*you* configure GlusterFS on Pacemaker?</p>
<p>Any advice appreciated.<br>
</p>
<p>Best regards<br>
</p>
<p><br>
</p>
<p>* <a class="moz-txt-link-freetext"
href="https://bugzilla.redhat.com/show_bug.cgi?id=1233344"
moz-do-not-send="true">https://bugzilla.redhat.com/show_bug.cgi?id=1233344</a></p>
<p><br>
</p>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
Gluster-users mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Gluster-users@gluster.org" moz-do-not-send="true">Gluster-users@gluster.org</a>
<a class="moz-txt-link-freetext" href="http://lists.gluster.org/mailman/listinfo/gluster-users" moz-do-not-send="true">http://lists.gluster.org/mailman/listinfo/gluster-users</a></pre>
</blockquote>
<br>
</blockquote>
<p><br>
</p>
</body>
</html>