[Gluster-users] GlusterFS, Pacemaker, OCF resource agents on CentOS 7

Tomalak Geret'kal tom at kera.name
Fri Dec 8 10:55:37 UTC 2017


Hi Jiffin

Pacemaker clusters allow us to effectively distribute
services across multiple computers.
In my case, I am creating an active-passive cluster for my
software, and my software relies on Apache, MySQL and
GlusterFS. Thus, I want GlusterFS to be controlled by
Pacemaker so that:

1. A node can be deemed "bad" if GlusterFS is not running
(using constraints to prohibit failover to a bad node; see
the rough sketch below)
2. The GlusterFS volume can be automatically mounted on
whichever node is currently active
3. Services all go into standby together
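
As a rough illustration (with hypothetical resource names -
my real commands are further down in the quoted message),
the kind of wiring I have in mind is:

# Hypothetical sketch: keep the application on a node where the
# GlusterFS mount resource is healthy, and only start it once
# that mount is up.
pcs constraint colocation add my_app_group with gluster_mount-clone INFINITY
pcs constraint order start gluster_mount-clone then start my_app_group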

Is this not the recommended approach? What else should I do?

Thanks


On 08/12/2017 10:17, Jiffin Tony Thottan wrote:
>
> Hi,
>
> Can you please explain what purpose the Pacemaker cluster
> serves here?
>
> Regards,
>
> Jiffin
>
>
> On Thursday 07 December 2017 06:59 PM, Tomalak Geret'kal
> wrote:
>>
>> Hi guys
>>
>> I'm wondering if anyone here is using the GlusterFS OCF
>> resource agents with Pacemaker on CentOS 7?
>>
>> yum install centos-release-gluster
>> yum install glusterfs-server glusterfs-resource-agents
>>
>> The reason I ask is that there seem to be a few problems
>> with them on 3.10 - problems severe enough that I'm
>> struggling to believe I'm not just doing something wrong.
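>>
>> For reference, the exact packages and versions in play can be
>> checked with something like:
>>
>> rpm -q glusterfs-server glusterfs-resource-agents
>> gluster --version
>>
>> which should report the 3.10 packages I'm referring to.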
>>
>> I created my brick (on a logical volume previously used for
>> DRBD, hence its name):
>>
>> mkfs.xfs /dev/cl/lv_drbd -f
>> mkdir -p /gluster/test_brick
>> mount -t xfs /dev/cl/lv_drbd /gluster
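>>
>> Just as a sanity check (not part of the setup itself), the
>> brick filesystem can be verified at this point with something
>> like:
>>
>> df -h /gluster
>> xfs_info /gluster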
>>
>> And then my volume (enabling clients to mount it via NFS):
>>
>> systemctl start glusterd
>> gluster volume create test_logs replica 2 transport tcp \
>>     pcmk01-drbd:/gluster/test_brick pcmk02-drbd:/gluster/test_brick
>> gluster volume start test_logs
>> gluster volume set test_logs nfs.disable off
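>>
>> Before handing anything to Pacemaker, something like the
>> following should confirm the volume is started with both
>> bricks present:
>>
>> gluster volume info test_logs
>> gluster volume status test_logs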
>>
>> And here's where the fun starts.
>>
>> Firstly, we need to work around bug 1233344* (which was
>> closed when 3.7 went end-of-life but still seems valid in
>> 3.10):
>>
>> sed -i \
>>     's#voldir="/etc/glusterd/vols/${OCF_RESKEY_volname}"#voldir="/var/lib/glusterd/vols/${OCF_RESKEY_volname}"#' \
>>     /usr/lib/ocf/resource.d/glusterfs/volume
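>>
>> A quick grep of the agent should confirm that the path now
>> points at /var/lib/glusterd:
>>
>> grep voldir= /usr/lib/ocf/resource.d/glusterfs/volume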
>>
>> With that done, I [attempt to] stop GlusterFS so it can
>> be brought under Pacemaker control:
>>
>> systemctl stop glusterfsd
>> systemctl stop glusterd
>> umount /gluster
>>
>> (I usually have to kill leftover glusterfs processes manually
>> at this point before the unmount succeeds - why doesn't
>> systemctl stop take care of them?)
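>>
>> Roughly, the manual cleanup I end up doing looks like this
>> (clearly not something I want to need in production):
>>
>> pgrep -af gluster     # see which gluster processes are still around
>> pkill glusterfs       # kills leftover glusterfs/glusterfsd processes
>> umount /gluster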
>>
>> With the node in standby (just one is online in this
>> example, but another is configured), I then set up the
>> resources:
>>
>> pcs node standby
>> pcs resource create gluster_data ocf:heartbeat:Filesystem \
>>     device="/dev/cl/lv_drbd" directory="/gluster" fstype="xfs"
>> pcs resource create glusterd ocf:glusterfs:glusterd
>> pcs resource create gluster_vol ocf:glusterfs:volume volname="test_logs"
>> pcs resource create test_logs ocf:heartbeat:Filesystem \
>>     device="localhost:/test_logs" directory="/var/log/test" fstype="nfs" \
>>     options="vers=3,tcp,nolock,context=system_u:object_r:httpd_sys_content_t:s0" \
>>     op monitor OCF_CHECK_LEVEL="20"
>> pcs resource clone glusterd
>> pcs resource clone gluster_data
>> pcs resource clone gluster_vol ordered=true
>> pcs constraint order start gluster_data-clone then start glusterd-clone
>> pcs constraint order start glusterd-clone then start gluster_vol-clone
>> pcs constraint order start gluster_vol-clone then start test_logs
>> pcs constraint colocation add test_logs with FloatingIp INFINITY
>>
>> (note the SELinux wrangling - this is because I have a
>> CGI web application which will later need to read files
>> from the /var/log/test mount)
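>>
>> For what it's worth, once the test_logs mount does come up
>> somewhere, I'd expect to be able to sanity-check both the
>> constraints and the SELinux context with something like:
>>
>> pcs constraint            # review the ordering/colocation rules
>> ls -dZ /var/log/test      # should show httpd_sys_content_t on the mount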
>>
>> At this point, even with the node in standby, it's
>> /already/ failing:
>>
>> [root at pcmk01 ~]# pcs status
>> Cluster name: test_cluster
>> Stack: corosync
>> Current DC: pcmk01-cr (version 1.1.15-11.el7_3.5-e174ec8) - partition WITHOUT quorum
>> Last updated: Thu Dec  7 13:20:41 2017          Last change: Thu Dec  7 13:09:33 2017 by root via crm_attribute on pcmk01-cr
>>
>> 2 nodes and 13 resources configured
>>
>> Online: [ pcmk01-cr ]
>> OFFLINE: [ pcmk02-cr ]
>>
>> Full list of resources:
>>
>>  FloatingIp     (ocf::heartbeat:IPaddr2):       Started pcmk01-cr
>>  test_logs      (ocf::heartbeat:Filesystem):    Stopped
>>  Clone Set: glusterd-clone [glusterd]
>>      Stopped: [ pcmk01-cr pcmk02-cr ]
>>  Clone Set: gluster_data-clone [gluster_data]
>>      Stopped: [ pcmk01-cr pcmk02-cr ]
>>  Clone Set: gluster_vol-clone [gluster_vol]
>>      gluster_vol        (ocf::glusterfs:volume):        FAILED pcmk01-cr (blocked)
>>      Stopped: [ pcmk02-cr ]
>>
>> Failed Actions:
>> * gluster_data_start_0 on pcmk01-cr 'not configured' (6): call=72, status=complete,
>>     exitreason='DANGER! xfs on /dev/cl/lv_drbd is NOT cluster-aware!',
>>     last-rc-change='Thu Dec  7 13:09:28 2017', queued=0ms, exec=250ms
>> * gluster_vol_stop_0 on pcmk01-cr 'unknown error' (1): call=60, status=Timed Out,
>>     exitreason='none', last-rc-change='Thu Dec  7 12:55:11 2017', queued=0ms, exec=20004ms
>>
>>
>> Daemon Status:
>>   corosync: active/enabled
>>   pacemaker: active/enabled
>>   pcsd: active/enabled
>>
>> 1. The data mount can't be created? Why?
>> 2. Why is there a volume "stop" command being attempted,
>> and why does it fail?
>> 3. Why is any of this happening in standby? I can't have
>> the resources failing before I've even made the node
>> live! I could understand why a gluster_vol start
>> operation would fail when glusterd is (correctly)
>> stopped, but why is there a *stop* operation? And why
>> does that make the resource "blocked"?
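>>
>> If it would help, I can gather more detail - for example by
>> running the start action by hand and capturing the agent's
>> own output with something like:
>>
>> pcs resource debug-start gluster_data --full
>>
>> and posting the result here.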
>>
>> Given the above steps, is there something fundamental I'm
>> missing about how these resource agents should be used?
>> How do *you* configure GlusterFS on Pacemaker?
>>
>> Any advice appreciated.
>>
>> Best regards
>>
>>
>> * https://bugzilla.redhat.com/show_bug.cgi?id=1233344
>>
>>
>>
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>
