[Gluster-users] GlusterFS, Pacemaker, OCF resource agents on CentOS 7

Fri Dec 8 10:17:01 UTC 2017

Hi,

Can you please explain what the Pacemaker cluster is being used for here?

Regards,

Jiffin

On Thursday 07 December 2017 06:59 PM, Tomalak Geret'kal wrote:
>
> Hi guys
>
> I'm wondering if anyone here is using the GlusterFS OCF resource 
> agents with Pacemaker on CentOS 7?
>
> yum install centos-release-gluster
> yum install glusterfs-server glusterfs-resource-agents
>
> The reason I ask is that there seem to be a few problems with them on 
> 3.10, but these problems are so severe that I'm struggling to believe 
> I'm not just doing something wrong.
>
> I created my brick (on a volume previously used for DRBD, thus its name):
>
> mkfs.xfs -f /dev/cl/lv_drbd
> mkdir -p /gluster
> mount -t xfs /dev/cl/lv_drbd /gluster
> mkdir -p /gluster/test_brick
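
The brick directory needs to live on the XFS filesystem itself, not in the directory hidden underneath the mount. A minimal sketch of a guard for that, assuming `mountpoint` from util-linux is installed; `make_brick` is a hypothetical helper name, and the paths in the example call are the ones quoted above:

```shell
# Sketch: only create the brick directory once its parent is a real mount.
# make_brick is a hypothetical helper, not part of GlusterFS or Pacemaker.
make_brick() {
    parent=$1
    brick=$2
    # mountpoint -q exits non-zero when the directory is not a mount point.
    if ! mountpoint -q "$parent"; then
        echo "$parent is not a mount point; refusing to create $brick" >&2
        return 1
    fi
    mkdir -p "$parent/$brick"
}

# Example call (commented out so pasting the sketch has no side effects):
# make_brick /gluster test_brick
```

If the guard fails, nothing is created, so a brick cannot end up on the root filesystem by accident.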
>
> And then my volume (enabling clients to mount it via NFS):
>
> systemctl start glusterd
> gluster volume create test_logs replica 2 transport tcp \
>     pcmk01-drbd:/gluster/test_brick pcmk02-drbd:/gluster/test_brick
> gluster volume start test_logs
> gluster volume set test_logs nfs.disable off
>
> And here's where the fun starts.
>
> Firstly, we need to work around bug 1233344* (which was closed when 
> 3.7 went end-of-life but still seems valid in 3.10):
>
> sed -i \
>     's#voldir="/etc/glusterd/vols/${OCF_RESKEY_volname}"#voldir="/var/lib/glusterd/vols/${OCF_RESKEY_volname}"#' \
>     /usr/lib/ocf/resource.d/glusterfs/volume
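
Before editing the installed agent in place, the substitution can be tried on a scratch file. This is only a sketch: the sample line written to the scratch file is an assumption about what the agent contains, taken from the pattern in the sed command above, and `sed` is run without `-i` so nothing is modified.

```shell
# Sketch: dry-run the voldir substitution on a scratch file.
# The sample line is an assumption derived from the sed pattern itself.
tmp=$(mktemp)
printf '%s\n' 'voldir="/etc/glusterd/vols/${OCF_RESKEY_volname}"' > "$tmp"

# Same expression as in the workaround, but printing to stdout instead of -i.
patched=$(sed \
    's#voldir="/etc/glusterd/vols/${OCF_RESKEY_volname}"#voldir="/var/lib/glusterd/vols/${OCF_RESKEY_volname}"#' \
    "$tmp")

echo "$patched"
rm -f "$tmp"
```

The single quotes keep the shell from expanding `${OCF_RESKEY_volname}`, so the variable reference is matched and written back literally.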
>
> With that done, I [attempt to] stop GlusterFS so it can be brought 
> under Pacemaker control:
>
> systemctl stop glusterfsd
> systemctl stop glusterd
> umount /gluster
>
> (I usually have to manually kill glusterfs processes at this point 
> before the unmount works - why does the systemctl stop not do it?)
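
One way to see what is still alive before attempting the unmount is to look for the daemons by name. A minimal sketch, assuming `pgrep` from procps is available; `still_running` is a hypothetical helper name, and the three process names are the usual GlusterFS binaries rather than anything confirmed in this thread:

```shell
# Sketch: report which of the named daemons still have a live process.
# still_running is a hypothetical helper, not shipped by any package.
still_running() {
    found=1
    for name in "$@"; do
        # pgrep -x matches the process name exactly.
        if pgrep -x "$name" >/dev/null 2>&1; then
            echo "$name"
            found=0
        fi
    done
    # Exit status 0 if at least one name matched, 1 otherwise.
    return "$found"
}

if still_running glusterd glusterfsd glusterfs; then
    echo "Gluster processes still running; umount /gluster may report busy" >&2
fi
```

The helper only reports; whether and how to stop the leftover processes is a separate decision.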
>
> With the node in standby (just one is online in this example, but 
> another is configured), I then set up the resources:
>
> pcs node standby
> pcs resource create gluster_data ocf:heartbeat:Filesystem \
>     device="/dev/cl/lv_drbd" directory="/gluster" fstype="xfs"
> pcs resource create glusterd ocf:glusterfs:glusterd
> pcs resource create gluster_vol ocf:glusterfs:volume volname="test_logs"
> pcs resource create test_logs ocf:heartbeat:Filesystem \
>     device="localhost:/test_logs" directory="/var/log/test" fstype="nfs" \
>     options="vers=3,tcp,nolock,context=system_u:object_r:httpd_sys_content_t:s0" \
>     op monitor OCF_CHECK_LEVEL="20"
> pcs resource clone glusterd
> pcs resource clone gluster_data
> pcs resource clone gluster_vol ordered=true
> pcs constraint order start gluster_data-clone then start glusterd-clone
> pcs constraint order start glusterd-clone then start gluster_vol-clone
> pcs constraint order start gluster_vol-clone then start test_logs
> pcs constraint colocation add test_logs with FloatingIp INFINITY
>
> (note the SELinux wrangling - this is because I have a CGI web 
> application which will later need to read files from the /var/log/test 
> mount)
>
> At this point, even with the node in standby, it's /already/ failing:
>
> [root@pcmk01 ~]# pcs status
> Cluster name: test_cluster
> Stack: corosync
> Current DC: pcmk01-cr (version 1.1.15-11.el7_3.5-e174ec8) - partition WITHOUT quorum
> Last updated: Thu Dec  7 13:20:41 2017          Last change: Thu Dec  7 13:09:33 2017 by root via crm_attribute on pcmk01-cr
>
> 2 nodes and 13 resources configured
>
> Online: [ pcmk01-cr ]
> OFFLINE: [ pcmk02-cr ]
>
> Full list of resources:
>
>  FloatingIp     (ocf::heartbeat:IPaddr2):       Started pcmk01-cr
>  test_logs      (ocf::heartbeat:Filesystem):    Stopped
>  Clone Set: glusterd-clone [glusterd]
>      Stopped: [ pcmk01-cr pcmk02-cr ]
>  Clone Set: gluster_data-clone [gluster_data]
>      Stopped: [ pcmk01-cr pcmk02-cr ]
>  Clone Set: gluster_vol-clone [gluster_vol]
>      gluster_vol        (ocf::glusterfs:volume): FAILED pcmk01-cr (blocked)
>      Stopped: [ pcmk02-cr ]
>
> Failed Actions:
> * gluster_data_start_0 on pcmk01-cr 'not configured' (6): call=72, status=complete, exitreason='DANGER! xfs on /dev/cl/lv_drbd is NOT cluster-aware!',
>     last-rc-change='Thu Dec  7 13:09:28 2017', queued=0ms, exec=250ms
> * gluster_vol_stop_0 on pcmk01-cr 'unknown error' (1): call=60, status=Timed Out, exitreason='none',
>     last-rc-change='Thu Dec  7 12:55:11 2017', queued=0ms, exec=20004ms
>
>
> Daemon Status:
>   corosync: active/enabled
>   pacemaker: active/enabled
>   pcsd: active/enabled
>
> 1. Why can't the data mount be created?
> 2. Why is there a volume "stop" command being attempted, and why does 
> it fail?
> 3. Why is any of this happening in standby? I can't have the resources 
> failing before I've even made the node live! I could understand why a 
> gluster_vol start operation would fail when glusterd is (correctly) 
> stopped, but why is there a *stop* operation? And why does that make 
> the resource "blocked"?
>
> Given the above steps, is there something fundamental I'm missing 
> about how these resource agents should be used? How do *you* configure 
> GlusterFS on Pacemaker?
>
> Any advice appreciated.
>
> Best regards
>
>
> * https://bugzilla.redhat.com/show_bug.cgi?id=1233344