[Gluster-users] Fwd: nfs-ganesha HA with arbiter volume
Soumya Koduri
skoduri at redhat.com
Tue Sep 22 17:20:28 UTC 2015
On 09/22/2015 02:35 PM, Tiemen Ruiten wrote:
> I had missed setting up passwordless SSH auth for the root user. However,
> it did not make a difference:
>
> After verifying the prerequisites, I issued gluster nfs-ganesha enable on node
> cobalt:
>
> Sep 22 10:19:56 cobalt systemd: Starting Preprocess NFS configuration...
> Sep 22 10:19:56 cobalt systemd: Starting RPC Port Mapper.
> Sep 22 10:19:56 cobalt systemd: Reached target RPC Port Mapper.
> Sep 22 10:19:56 cobalt systemd: Starting Host and Network Name Lookups.
> Sep 22 10:19:56 cobalt systemd: Reached target Host and Network Name
> Lookups.
> Sep 22 10:19:56 cobalt systemd: Starting RPC bind service...
> Sep 22 10:19:56 cobalt systemd: Started Preprocess NFS configuration.
> Sep 22 10:19:56 cobalt systemd: Started RPC bind service.
> Sep 22 10:19:56 cobalt systemd: Starting NFS status monitor for NFSv2/3
> locking....
> Sep 22 10:19:56 cobalt rpc.statd[2666]: Version 1.3.0 starting
> Sep 22 10:19:56 cobalt rpc.statd[2666]: Flags: TI-RPC
> Sep 22 10:19:56 cobalt systemd: Started NFS status monitor for NFSv2/3
> locking..
> Sep 22 10:19:56 cobalt systemd: Starting NFS-Ganesha file server...
> Sep 22 10:19:56 cobalt systemd: Started NFS-Ganesha file server.
> Sep 22 10:19:56 cobalt kernel: warning: `ganesha.nfsd' uses 32-bit
> capabilities (legacy support in use)
> Sep 22 10:19:56 cobalt rpc.statd[2666]: Received SM_UNMON_ALL request
> from cobalt.int.rdmedia.com while not
> monitoring any hosts
> Sep 22 10:19:56 cobalt logger: setting up rd-ganesha-ha
> Sep 22 10:19:56 cobalt logger: setting up cluster rd-ganesha-ha with the
> following cobalt iron
> Sep 22 10:19:57 cobalt systemd: Stopped Pacemaker High Availability
> Cluster Manager.
> Sep 22 10:19:57 cobalt systemd: Stopped Corosync Cluster Engine.
> Sep 22 10:19:57 cobalt systemd: Reloading.
> Sep 22 10:19:57 cobalt systemd:
> [/usr/lib/systemd/system/dm-event.socket:10] Unknown lvalue
> 'RemoveOnStop' in section 'Socket'
> Sep 22 10:19:57 cobalt systemd:
> [/usr/lib/systemd/system/lvm2-lvmetad.socket:9] Unknown lvalue
> 'RemoveOnStop' in section 'Socket'
> Sep 22 10:19:57 cobalt systemd: Reloading.
> Sep 22 10:19:57 cobalt systemd:
> [/usr/lib/systemd/system/dm-event.socket:10] Unknown lvalue
> 'RemoveOnStop' in section 'Socket'
> Sep 22 10:19:57 cobalt systemd:
> [/usr/lib/systemd/system/lvm2-lvmetad.socket:9] Unknown lvalue
> 'RemoveOnStop' in section 'Socket'
> Sep 22 10:19:57 cobalt systemd: Starting Corosync Cluster Engine...
> Sep 22 10:19:57 cobalt corosync[2815]: [MAIN ] Corosync Cluster Engine
> ('2.3.4'): started and ready to provide service.
> Sep 22 10:19:57 cobalt corosync[2815]: [MAIN ] Corosync built-in
> features: dbus systemd xmlconf snmp pie relro bindnow
> Sep 22 10:19:57 cobalt corosync[2816]: [TOTEM ] Initializing transport
> (UDP/IP Unicast).
> Sep 22 10:19:57 cobalt corosync[2816]: [TOTEM ] Initializing
> transmit/receive security (NSS) crypto: none hash: none
> Sep 22 10:19:58 cobalt corosync[2816]: [TOTEM ] The network interface
> [10.100.30.37] is now up.
> Sep 22 10:19:58 cobalt corosync[2816]: [SERV ] Service engine loaded:
> corosync configuration map access [0]
> Sep 22 10:19:58 cobalt corosync[2816]: [QB ] server name: cmap
> Sep 22 10:19:58 cobalt corosync[2816]: [SERV ] Service engine loaded:
> corosync configuration service [1]
> Sep 22 10:19:58 cobalt corosync[2816]: [QB ] server name: cfg
> Sep 22 10:19:58 cobalt corosync[2816]: [SERV ] Service engine loaded:
> corosync cluster closed process group service v1.01 [2]
> Sep 22 10:19:58 cobalt corosync[2816]: [QB ] server name: cpg
> Sep 22 10:19:58 cobalt corosync[2816]: [SERV ] Service engine loaded:
> corosync profile loading service [4]
> Sep 22 10:19:58 cobalt corosync[2816]: [QUORUM] Using quorum provider
> corosync_votequorum
> Sep 22 10:19:58 cobalt corosync[2816]: [VOTEQ ] Waiting for all cluster
> members. Current votes: 1 expected_votes: 2
> Sep 22 10:19:58 cobalt corosync[2816]: [SERV ] Service engine loaded:
> corosync vote quorum service v1.0 [5]
> Sep 22 10:19:58 cobalt corosync[2816]: [QB ] server name: votequorum
> Sep 22 10:19:58 cobalt corosync[2816]: [SERV ] Service engine loaded:
> corosync cluster quorum service v0.1 [3]
> Sep 22 10:19:58 cobalt corosync[2816]: [QB ] server name: quorum
> Sep 22 10:19:58 cobalt corosync[2816]: [TOTEM ] adding new UDPU member
> {10.100.30.37}
> Sep 22 10:19:58 cobalt corosync[2816]: [TOTEM ] adding new UDPU member
> {10.100.30.38}
> Sep 22 10:19:58 cobalt corosync[2816]: [TOTEM ] A new membership
> (10.100.30.37:140) was formed. Members joined: 1
> Sep 22 10:19:58 cobalt corosync[2816]: [TOTEM ] A new membership
> (10.100.30.37:148) was formed. Members joined: 1
> Sep 22 10:19:58 cobalt corosync[2816]: [VOTEQ ] Waiting for all cluster
> members. Current votes: 1 expected_votes: 2
> Sep 22 10:19:58 cobalt corosync[2816]: [VOTEQ ] Waiting for all cluster
> members. Current votes: 1 expected_votes: 2
> Sep 22 10:19:58 cobalt corosync[2816]: [QUORUM] Members[0]:
> Sep 22 10:19:58 cobalt corosync[2816]: [MAIN ] Completed service
> synchronization, ready to provide service.
> *Sep 22 10:21:27 cobalt systemd: corosync.service operation timed out.
> Terminating.*
> *Sep 22 10:21:27 cobalt corosync: Starting Corosync Cluster Engine
> (corosync):*
> *Sep 22 10:21:27 cobalt systemd: Failed to start Corosync Cluster Engine.*
> *Sep 22 10:21:27 cobalt systemd: Unit corosync.service entered failed
> state.*
> Sep 22 10:21:32 cobalt logger: warning: pcs property set
> no-quorum-policy=ignore failed
> Sep 22 10:21:32 cobalt logger: warning: pcs property set
> stonith-enabled=false failed
> Sep 22 10:21:32 cobalt logger: warning: pcs resource create nfs_start
> ganesha_nfsd ha_vol_mnt=/var/run/gluster/shared_storage --clone failed
> Sep 22 10:21:33 cobalt logger: warning: pcs resource delete
> nfs_start-clone failed
> Sep 22 10:21:33 cobalt logger: warning: pcs resource create nfs-mon
> ganesha_mon --clone failed
> Sep 22 10:21:33 cobalt logger: warning: pcs resource create nfs-grace
> ganesha_grace --clone failed
> Sep 22 10:21:34 cobalt logger: warning pcs resource create
> cobalt-cluster_ip-1 ocf:heartbeat:IPaddr ip=10.100.30.101
> cidr_netmask=32 op monitor interval=15s failed
> Sep 22 10:21:34 cobalt logger: warning: pcs resource create
> cobalt-trigger_ip-1 ocf:heartbeat:Dummy failed
> Sep 22 10:21:34 cobalt logger: warning: pcs constraint colocation add
> cobalt-cluster_ip-1 with cobalt-trigger_ip-1 failed
> Sep 22 10:21:34 cobalt logger: warning: pcs constraint order
> cobalt-trigger_ip-1 then nfs-grace-clone failed
> Sep 22 10:21:34 cobalt logger: warning: pcs constraint order
> nfs-grace-clone then cobalt-cluster_ip-1 failed
> Sep 22 10:21:34 cobalt logger: warning pcs resource create
> iron-cluster_ip-1 ocf:heartbeat:IPaddr ip=10.100.30.102 cidr_netmask=32
> op monitor interval=15s failed
> Sep 22 10:21:34 cobalt logger: warning: pcs resource create
> iron-trigger_ip-1 ocf:heartbeat:Dummy failed
> Sep 22 10:21:34 cobalt logger: warning: pcs constraint colocation add
> iron-cluster_ip-1 with iron-trigger_ip-1 failed
> Sep 22 10:21:34 cobalt logger: warning: pcs constraint order
> iron-trigger_ip-1 then nfs-grace-clone failed
> Sep 22 10:21:35 cobalt logger: warning: pcs constraint order
> nfs-grace-clone then iron-cluster_ip-1 failed
> Sep 22 10:21:35 cobalt logger: warning: pcs constraint location
> cobalt-cluster_ip-1 rule score=-INFINITY ganesha-active ne 1 failed
> Sep 22 10:21:35 cobalt logger: warning: pcs constraint location
> cobalt-cluster_ip-1 prefers iron=1000 failed
> Sep 22 10:21:35 cobalt logger: warning: pcs constraint location
> cobalt-cluster_ip-1 prefers cobalt=2000 failed
> Sep 22 10:21:35 cobalt logger: warning: pcs constraint location
> iron-cluster_ip-1 rule score=-INFINITY ganesha-active ne 1 failed
> Sep 22 10:21:35 cobalt logger: warning: pcs constraint location
> iron-cluster_ip-1 prefers cobalt=1000 failed
> Sep 22 10:21:35 cobalt logger: warning: pcs constraint location
> iron-cluster_ip-1 prefers iron=2000 failed
> Sep 22 10:21:35 cobalt logger: warning pcs cluster cib-push
> /tmp/tmp.yqLT4m75WG failed
>
> Notice the failed corosync service in bold. I can't find any logs
> pointing to a reason. Starting it manually is not a problem:
>
> Sep 22 10:35:06 cobalt corosync: Starting Corosync Cluster Engine
> (corosync): [ OK ]
>
> Then I noticed pacemaker was not running on both nodes. Started it
> manually and saw the following in /var/log/messages on the other node:
>
> Sep 22 10:36:43 iron cibadmin[4654]: notice: Invoked: /usr/sbin/cibadmin
> --replace -o configuration -V --xml-pipe
> Sep 22 10:36:43 iron crmd[4617]: notice: State transition S_IDLE ->
> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL
> origin=abort_transition_graph ]
> Sep 22 10:36:44 iron pengine[4616]: notice: On loss of CCM Quorum: Ignore
> Sep 22 10:36:44 iron pengine[4616]: error: Resource start-up disabled
> since no STONITH resources have been defined
> Sep 22 10:36:44 iron pengine[4616]: error: Either configure some or
> disable STONITH with the stonith-enabled option
> Sep 22 10:36:44 iron pengine[4616]: error: NOTE: Clusters with shared
> data need STONITH to ensure data integrity
> Sep 22 10:36:44 iron pengine[4616]: notice: Delaying fencing operations
> until there are resources to manage
> Sep 22 10:36:44 iron pengine[4616]: warning: Node iron is unclean!
> Sep 22 10:36:44 iron pengine[4616]: notice: Cannot fence unclean nodes
> until quorum is attained (or no-quorum-policy is set to ignore)
> Sep 22 10:36:44 iron pengine[4616]: warning: Calculated Transition 2:
> /var/lib/pacemaker/pengine/pe-warn-20.bz2
> Sep 22 10:36:44 iron pengine[4616]: notice: Configuration ERRORs found
> during PE processing. Please run "crm_verify -L" to identify issues.
> Sep 22 10:36:44 iron crmd[4617]: notice: Transition 2 (Complete=0,
> Pending=0, Fired=0, Skipped=0, Incomplete=0,
> Source=/var/lib/pacemaker/pengine/pe-warn-20.bz2): Complete
> Sep 22 10:36:44 iron crmd[4617]: notice: State transition
> S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL
> origin=notify_crmd ]
>
> I'm starting to think there is some leftover config somewhere from all
> these attempts. Is there a way to completely reset all config related to
> NFS-Ganesha and start over?
>
>
If you disable nfs-ganesha, that should do the cleanup as well:
# gluster nfs-ganesha disable
If you are still in doubt and want to be safe, after disabling nfs-ganesha,
run the cleanup script directly:
# /usr/libexec/ganesha/ganesha-ha.sh --cleanup /etc/ganesha
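For completeness, a rough sketch of a full reset would be (the gluster command
runs once, the pcs commands on each HA node; paths assume the default package
layout shown above):

# gluster nfs-ganesha disable
# /usr/libexec/ganesha/ganesha-ha.sh --cleanup /etc/ganesha
# pcs cluster stop             (stops any pacemaker/corosync left over from failed runs)
# pcs cluster destroy          (removes the generated corosync.conf and CIB on the node)
# systemctl status corosync pacemaker    (both should be inactive before retrying)

After that, 'gluster nfs-ganesha enable' should start from a clean slate.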
Thanks,
Soumya
>
> On 22 September 2015 at 09:04, Soumya Koduri <skoduri at redhat.com> wrote:
>
> Hi Tiemen,
>
> I have added the steps to configure HA NFS in the doc below. Please
> verify that all the prerequisites are in place and the steps were performed correctly.
>
> https://github.com/soumyakoduri/glusterdocs/blob/ha_guide/Administrator%20Guide/Configuring%20HA%20NFS%20Server.md
>
> Thanks,
> Soumya
>
> On 09/21/2015 09:21 PM, Tiemen Ruiten wrote:
>
> Whoops, replied off-list.
>
> Additionally I noticed that the generated corosync config is not
> valid,
> as there is no interface section:
>
> /etc/corosync/corosync.conf
>
> totem {
> version: 2
> secauth: off
> cluster_name: rd-ganesha-ha
> transport: udpu
> }
>
> nodelist {
>   node {
>         ring0_addr: cobalt
>         nodeid: 1
>         }
>   node {
>         ring0_addr: iron
>         nodeid: 2
>         }
> }
>
> quorum {
> provider: corosync_votequorum
> two_node: 1
> }
>
> logging {
> to_syslog: yes
> }
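A note on the missing interface section: with transport udpu plus a nodelist,
corosync works out which local address to bind to by matching the ring0_addr
entries, so the file above is not invalid as such. On multi-homed hosts you can
still pin ring 0 explicitly -- a sketch only, where 10.100.30.0 is simply the
network address of the subnet seen in these logs:

totem {
    version: 2
    cluster_name: rd-ganesha-ha
    transport: udpu
    interface {
        ringnumber: 0
        bindnetaddr: 10.100.30.0
    }
}

Keep in mind that this file is regenerated by the pcs cluster setup that
ganesha-ha.sh runs, so manual edits may be overwritten on the next enable.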
>
>
>
>
> ---------- Forwarded message ----------
> From: *Tiemen Ruiten* <t.ruiten at rdmedia.com>
> Date: 21 September 2015 at 17:16
> Subject: Re: [Gluster-users] nfs-ganesha HA with arbiter volume
> To: Jiffin Tony Thottan <jthottan at redhat.com>
>
>
> Could you point me to the latest documentation? I've been
> struggling to
> find something up-to-date. I believe I have all the prerequisites:
>
> - shared storage volume exists and is mounted
> - all nodes in hosts files
> - Gluster-NFS disabled
> - corosync, pacemaker and nfs-ganesha rpm's installed
>
> Anything I missed?
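A quick way to double-check the last two items plus the cluster services, as a
sketch -- <volname> below stands in for your data volume, which isn't named here:

# mount | grep shared_storage                (shared storage mounted on every node)
# gluster volume get <volname> nfs.disable   (should report 'on', i.e. Gluster-NFS off)
# systemctl status pcsd                      (pcsd should be running on both HA nodes)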
>
> Everything has been installed by RPM so is in the default locations:
> /usr/libexec/ganesha/ganesha-ha.sh
> /etc/ganesha/ganesha.conf (empty)
> /etc/ganesha/ganesha-ha.conf
>
> After I started the pcsd service manually, nfs-ganesha could be
> enabled
> successfully, but there was no virtual IP present on the
> interfaces and
> looking at the system log, I noticed corosync failed to start:
>
> - on the host where I issued the gluster nfs-ganesha enable command:
>
> Sep 21 17:07:18 iron systemd: Starting NFS-Ganesha file server...
> Sep 21 17:07:19 iron systemd: Started NFS-Ganesha file server.
> Sep 21 17:07:19 iron rpc.statd[2409]: Received SM_UNMON_ALL
> request from
> iron.int.rdmedia.com while not monitoring
> any hosts
> Sep 21 17:07:20 iron systemd: Starting Corosync Cluster Engine...
> Sep 21 17:07:20 iron corosync[3426]: [MAIN ] Corosync Cluster
> Engine
> ('2.3.4'): started and ready to provide service.
> Sep 21 17:07:20 iron corosync[3426]: [MAIN ] Corosync built-in
> features: dbus systemd xmlconf snmp pie relro bindnow
> Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] Initializing transport
> (UDP/IP Unicast).
> Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] Initializing
> transmit/receive security (NSS) crypto: none hash: none
> Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] The network interface
> [10.100.30.38] is now up.
> Sep 21 17:07:20 iron corosync[3427]: [SERV ] Service engine
> loaded:
> corosync configuration map access [0]
> Sep 21 17:07:20 iron corosync[3427]: [QB ] server name: cmap
> Sep 21 17:07:20 iron corosync[3427]: [SERV ] Service engine
> loaded:
> corosync configuration service [1]
> Sep 21 17:07:20 iron corosync[3427]: [QB ] server name: cfg
> Sep 21 17:07:20 iron corosync[3427]: [SERV ] Service engine
> loaded:
> corosync cluster closed process group service v1.01 [2]
> Sep 21 17:07:20 iron corosync[3427]: [QB ] server name: cpg
> Sep 21 17:07:20 iron corosync[3427]: [SERV ] Service engine
> loaded:
> corosync profile loading service [4]
> Sep 21 17:07:20 iron corosync[3427]: [QUORUM] Using quorum provider
> corosync_votequorum
> Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all
> cluster
> members. Current votes: 1 expected_votes: 2
> Sep 21 17:07:20 iron corosync[3427]: [SERV ] Service engine
> loaded:
> corosync vote quorum service v1.0 [5]
> Sep 21 17:07:20 iron corosync[3427]: [QB ] server name:
> votequorum
> Sep 21 17:07:20 iron corosync[3427]: [SERV ] Service engine
> loaded:
> corosync cluster quorum service v0.1 [3]
> Sep 21 17:07:20 iron corosync[3427]: [QB ] server name: quorum
> Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] adding new UDPU member
> {10.100.30.38}
> Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] adding new UDPU member
> {10.100.30.37}
> Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] A new membership
> (10.100.30.38:104) was formed. Members joined: 1
> Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all
> cluster
> members. Current votes: 1 expected_votes: 2
> Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all
> cluster
> members. Current votes: 1 expected_votes: 2
> Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all
> cluster
> members. Current votes: 1 expected_votes: 2
> Sep 21 17:07:20 iron corosync[3427]: [QUORUM] Members[1]: 1
> Sep 21 17:07:20 iron corosync[3427]: [MAIN ] Completed service
> synchronization, ready to provide service.
> Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] A new membership
> (10.100.30.37:108) was formed. Members joined: 1
>
> Sep 21 17:08:21 iron corosync: Starting Corosync Cluster Engine
> (corosync): [FAILED]
> Sep 21 17:08:21 iron systemd: corosync.service: control process
> exited,
> code=exited status=1
> Sep 21 17:08:21 iron systemd: Failed to start Corosync Cluster
> Engine.
> Sep 21 17:08:21 iron systemd: Unit corosync.service entered
> failed state.
>
>
> - on the other host:
>
> Sep 21 17:07:19 cobalt systemd: Starting Preprocess NFS
> configuration...
> Sep 21 17:07:19 cobalt systemd: Starting RPC Port Mapper.
> Sep 21 17:07:19 cobalt systemd: Reached target RPC Port Mapper.
> Sep 21 17:07:19 cobalt systemd: Starting Host and Network Name
> Lookups.
> Sep 21 17:07:19 cobalt systemd: Reached target Host and Network Name
> Lookups.
> Sep 21 17:07:19 cobalt systemd: Starting RPC bind service...
> Sep 21 17:07:19 cobalt systemd: Started Preprocess NFS
> configuration.
> Sep 21 17:07:19 cobalt systemd: Started RPC bind service.
> Sep 21 17:07:19 cobalt systemd: Starting NFS status monitor for
> NFSv2/3
> locking....
> Sep 21 17:07:19 cobalt rpc.statd[2662]: Version 1.3.0 starting
> Sep 21 17:07:19 cobalt rpc.statd[2662]: Flags: TI-RPC
> Sep 21 17:07:19 cobalt systemd: Started NFS status monitor for
> NFSv2/3
> locking..
> Sep 21 17:07:19 cobalt systemd: Starting NFS-Ganesha file server...
> Sep 21 17:07:19 cobalt systemd: Started NFS-Ganesha file server.
> Sep 21 17:07:19 cobalt kernel: warning: `ganesha.nfsd' uses 32-bit
> capabilities (legacy support in use)
> Sep 21 17:07:19 cobalt logger: setting up rd-ganesha-ha
> Sep 21 17:07:19 cobalt rpc.statd[2662]: Received SM_UNMON_ALL
> request
> from cobalt.int.rdmedia.com while not
> monitoring any hosts
> Sep 21 17:07:19 cobalt logger: setting up cluster rd-ganesha-ha
> with the
> following cobalt iron
> Sep 21 17:07:20 cobalt systemd: Stopped Pacemaker High Availability
> Cluster Manager.
> Sep 21 17:07:20 cobalt systemd: Stopped Corosync Cluster Engine.
> Sep 21 17:07:20 cobalt systemd: Reloading.
> Sep 21 17:07:20 cobalt systemd:
> [/usr/lib/systemd/system/dm-event.socket:10] Unknown lvalue
> 'RemoveOnStop' in section 'Socket'
> Sep 21 17:07:20 cobalt systemd:
> [/usr/lib/systemd/system/lvm2-lvmetad.socket:9] Unknown lvalue
> 'RemoveOnStop' in section 'Socket'
> Sep 21 17:07:20 cobalt systemd: Reloading.
> Sep 21 17:07:20 cobalt systemd:
> [/usr/lib/systemd/system/dm-event.socket:10] Unknown lvalue
> 'RemoveOnStop' in section 'Socket'
> Sep 21 17:07:20 cobalt systemd:
> [/usr/lib/systemd/system/lvm2-lvmetad.socket:9] Unknown lvalue
> 'RemoveOnStop' in section 'Socket'
> Sep 21 17:07:20 cobalt systemd: Starting Corosync Cluster Engine...
> Sep 21 17:07:20 cobalt corosync[2816]: [MAIN ] Corosync
> Cluster Engine
> ('2.3.4'): started and ready to provide service.
> Sep 21 17:07:20 cobalt corosync[2816]: [MAIN ] Corosync built-in
> features: dbus systemd xmlconf snmp pie relro bindnow
> Sep 21 17:07:20 cobalt corosync[2817]: [TOTEM ] Initializing
> transport
> (UDP/IP Unicast).
> Sep 21 17:07:20 cobalt corosync[2817]: [TOTEM ] Initializing
> transmit/receive security (NSS) crypto: none hash: none
> Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] The network
> interface
> [10.100.30.37] is now up.
> Sep 21 17:07:21 cobalt corosync[2817]: [SERV ] Service engine
> loaded:
> corosync configuration map access [0]
> Sep 21 17:07:21 cobalt corosync[2817]: [QB ] server name: cmap
> Sep 21 17:07:21 cobalt corosync[2817]: [SERV ] Service engine
> loaded:
> corosync configuration service [1]
> Sep 21 17:07:21 cobalt corosync[2817]: [QB ] server name: cfg
> Sep 21 17:07:21 cobalt corosync[2817]: [SERV ] Service engine
> loaded:
> corosync cluster closed process group service v1.01 [2]
> Sep 21 17:07:21 cobalt corosync[2817]: [QB ] server name: cpg
> Sep 21 17:07:21 cobalt corosync[2817]: [SERV ] Service engine
> loaded:
> corosync profile loading service [4]
> Sep 21 17:07:21 cobalt corosync[2817]: [QUORUM] Using quorum
> provider
> corosync_votequorum
> Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all
> cluster
> members. Current votes: 1 expected_votes: 2
> Sep 21 17:07:21 cobalt corosync[2817]: [SERV ] Service engine
> loaded:
> corosync vote quorum service v1.0 [5]
> Sep 21 17:07:21 cobalt corosync[2817]: [QB ] server name:
> votequorum
> Sep 21 17:07:21 cobalt corosync[2817]: [SERV ] Service engine
> loaded:
> corosync cluster quorum service v0.1 [3]
> Sep 21 17:07:21 cobalt corosync[2817]: [QB ] server name:
> quorum
> Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] adding new UDPU
> member
> {10.100.30.37}
> Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] adding new UDPU
> member
> {10.100.30.38}
> Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] A new membership
> (10.100.30.37:100) was formed. Members joined: 1
> Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all
> cluster
> members. Current votes: 1 expected_votes: 2
> Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all
> cluster
> members. Current votes: 1 expected_votes: 2
> Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all
> cluster
> members. Current votes: 1 expected_votes: 2
> Sep 21 17:07:21 cobalt corosync[2817]: [QUORUM] Members[1]: 1
> Sep 21 17:07:21 cobalt corosync[2817]: [MAIN ] Completed service
> synchronization, ready to provide service.
> Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] A new membership
> (10.100.30.37:108) was formed. Members joined: 1
> Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all
> cluster
> members. Current votes: 1 expected_votes: 2
> Sep 21 17:07:21 cobalt corosync[2817]: [QUORUM] Members[1]: 1
> Sep 21 17:07:21 cobalt corosync[2817]: [MAIN ] Completed service
>
> synchronization, ready to provide service.
> Sep 21 17:08:50 cobalt systemd: corosync.service operation timed
> out.
> Terminating.
> Sep 21 17:08:50 cobalt corosync: Starting Corosync Cluster Engine
> (corosync):
> Sep 21 17:08:50 cobalt systemd: Failed to start Corosync Cluster
> Engine.
> Sep 21 17:08:50 cobalt systemd: Unit corosync.service entered
> failed state.
> Sep 21 17:08:55 cobalt logger: warning: pcs property set
> no-quorum-policy=ignore failed
> Sep 21 17:08:55 cobalt logger: warning: pcs property set
> stonith-enabled=false failed
> Sep 21 17:08:55 cobalt logger: warning: pcs resource create
> nfs_start
> ganesha_nfsd ha_vol_mnt=/var/run/gluster/shared_storage --clone
> failed
> Sep 21 17:08:56 cobalt logger: warning: pcs resource delete
> nfs_start-clone failed
> Sep 21 17:08:56 cobalt logger: warning: pcs resource create nfs-mon
> ganesha_mon --clone failed
> Sep 21 17:08:56 cobalt logger: warning: pcs resource create
> nfs-grace
> ganesha_grace --clone failed
> Sep 21 17:08:57 cobalt logger: warning pcs resource create
> cobalt-cluster_ip-1 ocf:heartbeat:IPaddr ip= cidr_netmask=32 op
> monitor
> interval=15s failed
> Sep 21 17:08:57 cobalt logger: warning: pcs resource create
> cobalt-trigger_ip-1 ocf:heartbeat:Dummy failed
> Sep 21 17:08:57 cobalt logger: warning: pcs constraint
> colocation add
> cobalt-cluster_ip-1 with cobalt-trigger_ip-1 failed
> Sep 21 17:08:57 cobalt logger: warning: pcs constraint order
> cobalt-trigger_ip-1 then nfs-grace-clone failed
> Sep 21 17:08:57 cobalt logger: warning: pcs constraint order
> nfs-grace-clone then cobalt-cluster_ip-1 failed
> Sep 21 17:08:57 cobalt logger: warning pcs resource create
> iron-cluster_ip-1 ocf:heartbeat:IPaddr ip= cidr_netmask=32 op
> monitor
> interval=15s failed
> Sep 21 17:08:57 cobalt logger: warning: pcs resource create
> iron-trigger_ip-1 ocf:heartbeat:Dummy failed
> Sep 21 17:08:57 cobalt logger: warning: pcs constraint
> colocation add
> iron-cluster_ip-1 with iron-trigger_ip-1 failed
> Sep 21 17:08:57 cobalt logger: warning: pcs constraint order
> iron-trigger_ip-1 then nfs-grace-clone failed
> Sep 21 17:08:58 cobalt logger: warning: pcs constraint order
> nfs-grace-clone then iron-cluster_ip-1 failed
> Sep 21 17:08:58 cobalt logger: warning: pcs constraint location
> cobalt-cluster_ip-1 rule score=-INFINITY ganesha-active ne 1 failed
> Sep 21 17:08:58 cobalt logger: warning: pcs constraint location
> cobalt-cluster_ip-1 prefers iron=1000 failed
> Sep 21 17:08:58 cobalt logger: warning: pcs constraint location
> cobalt-cluster_ip-1 prefers cobalt=2000 failed
> Sep 21 17:08:58 cobalt logger: warning: pcs constraint location
> iron-cluster_ip-1 rule score=-INFINITY ganesha-active ne 1 failed
> Sep 21 17:08:58 cobalt logger: warning: pcs constraint location
> iron-cluster_ip-1 prefers cobalt=1000 failed
> Sep 21 17:08:58 cobalt logger: warning: pcs constraint location
> iron-cluster_ip-1 prefers iron=2000 failed
> Sep 21 17:08:58 cobalt logger: warning pcs cluster cib-push
> /tmp/tmp.nXTfyA1GMR failed
> Sep 21 17:08:58 cobalt logger: warning: scp ganesha-ha.conf to
> cobalt failed
>
> BTW, I'm using CentOS 7. There are multiple network interfaces
> on the
> servers; could that be a problem?
>
>
>
>
> On 21 September 2015 at 11:48, Jiffin Tony Thottan
> <jthottan at redhat.com> wrote:
>
>
>
> On 21/09/15 13:56, Tiemen Ruiten wrote:
>
> Hello Soumya, Kaleb, list,
>
> This Friday I created the gluster_shared_storage volume
> manually;
> I just tried it with the command you supplied, but both
> have the
> same result:
>
> from etc-glusterfs-glusterd.vol.log on the node where I
> issued the
> command:
>
> [2015-09-21 07:59:47.756845] I [MSGID: 106474]
> [glusterd-ganesha.c:403:check_host_list] 0-management:
> ganesha
> host found Hostname is cobalt
> [2015-09-21 07:59:48.071755] I [MSGID: 106474]
> [glusterd-ganesha.c:349:is_ganesha_host] 0-management:
> ganesha
> host found Hostname is cobalt
> [2015-09-21 07:59:48.653879] E [MSGID: 106470]
> [glusterd-ganesha.c:264:glusterd_op_set_ganesha]
> 0-management:
> Initial NFS-Ganesha set up failed
>
>
> As far as I understand from the logs, it called
> setup_cluster() [which calls the `ganesha-ha.sh` script], but the script
> failed.
> Can you please provide the following details:
> - Location of the ganesha-ha.sh file?
> - Location of the ganesha-ha.conf and ganesha.conf files?
>
>
> And can you also cross-check whether all the prerequisites for the HA
> setup are satisfied?
>
> --
> With Regards,
> Jiffin
>
>
> [2015-09-21 07:59:48.653912] E [MSGID: 106123]
> [glusterd-syncop.c:1404:gd_commit_op_phase]
> 0-management: Commit
> of operation 'Volume (null)' failed on localhost :
> Failed to set
> up HA config for NFS-Ganesha. Please check the log file
> for details
> [2015-09-21 07:59:45.402458] I [MSGID: 106006]
> [glusterd-svc-mgmt.c:323:glusterd_svc_common_rpc_notify]
> 0-management: nfs has disconnected from glusterd.
> [2015-09-21 07:59:48.071578] I [MSGID: 106474]
> [glusterd-ganesha.c:403:check_host_list] 0-management:
> ganesha
> host found Hostname is cobalt
>
> from etc-glusterfs-glusterd.vol.log on the other node:
>
> [2015-09-21 08:12:50.111877] E [MSGID: 106062]
> [glusterd-op-sm.c:3698:glusterd_op_ac_unlock]
> 0-management: Unable
> to acquire volname
> [2015-09-21 08:14:50.548087] E [MSGID: 106062]
> [glusterd-op-sm.c:3635:glusterd_op_ac_lock]
> 0-management: Unable
> to acquire volname
> [2015-09-21 08:14:50.654746] I [MSGID: 106132]
> [glusterd-proc-mgmt.c:83:glusterd_proc_stop]
> 0-management: nfs
> already stopped
> [2015-09-21 08:14:50.655095] I [MSGID: 106474]
> [glusterd-ganesha.c:403:check_host_list] 0-management:
> ganesha
> host found Hostname is cobalt
> [2015-09-21 08:14:51.287156] E [MSGID: 106062]
> [glusterd-op-sm.c:3698:glusterd_op_ac_unlock]
> 0-management: Unable
> to acquire volname
>
>
> from etc-glusterfs-glusterd.vol.log on the arbiter node:
>
> [2015-09-21 08:18:50.934713] E [MSGID: 101075]
> [common-utils.c:3127:gf_is_local_addr] 0-management:
> error in
> getaddrinfo: Name or service not known
> [2015-09-21 08:18:51.504694] E [MSGID: 106062]
> [glusterd-op-sm.c:3698:glusterd_op_ac_unlock]
> 0-management: Unable
> to acquire volname
>
> I have put the hostnames of all servers in my
> /etc/hosts file,
> including the arbiter node.
>
>
> On 18 September 2015 at 16:52, Soumya Koduri
> <skoduri at redhat.com> wrote:
>
> Hi Tiemen,
>
> One of the prerequisites before setting up
> nfs-ganesha HA is
> to create and mount the shared_storage volume. Use
> the CLI below for that:
>
> "gluster volume set all
> cluster.enable-shared-storage enable"
>
> It will create the volume and mount it on all the nodes
> (including the arbiter node). Note that this volume is
> mounted on all the nodes of the gluster storage
> pool (though
> in this case they may not all be part of the nfs-ganesha
> cluster).
>
> So instead of manually creating those directory
> paths, please
> use the above CLI and try re-configuring the setup.
>
> Thanks,
> Soumya
>
> On 09/18/2015 07:29 PM, Tiemen Ruiten wrote:
>
> Hello Kaleb,
>
> I don't:
>
> # Name of the HA cluster created.
> # must be unique within the subnet
> HA_NAME="rd-ganesha-ha"
> #
> # The gluster server from which to mount the
> shared data
> volume.
> HA_VOL_SERVER="iron"
> #
> # N.B. you may use short names or long names;
> you may not
> use IP addrs.
> # Once you select one, stay with it as it will
> be mildly
> unpleasant to
> # clean up if you switch later on. Ensure that
> all names -
> short and/or
> # long - are in DNS or /etc/hosts on all
> machines in the
> cluster.
> #
> # The subset of nodes of the Gluster Trusted
> Pool that
> form the ganesha
> # HA cluster. Hostname is specified.
> HA_CLUSTER_NODES="cobalt,iron"
> #HA_CLUSTER_NODES="server1.lab.redhat.com,server2.lab.redhat.com,..."
> #
> # Virtual IPs for each of the nodes specified
> above.
> VIP_server1="10.100.30.101"
> VIP_server2="10.100.30.102"
> #VIP_server1_lab_redhat_com="10.0.2.1"
> #VIP_server2_lab_redhat_com="10.0.2.2"
>
> Hosts cobalt & iron are the data nodes; the arbiter
> ip/hostname (neon)
> isn't mentioned anywhere in this config file.
>
>
> On 18 September 2015 at 15:56, Kaleb S. KEITHLEY
> <kkeithle at redhat.com> wrote:
>
>   On 09/18/2015 09:46 AM, Tiemen Ruiten wrote:
>   > Hello,
>   >
>   > I have a Gluster cluster with a single
> replica 3,
> arbiter 1 volume (so
>   > two nodes with actual data, one arbiter
> node). I
> would like to setup
>   > NFS-Ganesha HA for this volume but I'm
> having some
> difficulties.
>   >
>   > - I needed to create a directory
> /var/run/gluster/shared_storage
>   > manually on all nodes, or the command
> 'gluster
> nfs-ganesha enable would
>   > fail with the following error:
>   > [2015-09-18 13:13:34.690416] E [MSGID:
> 106032]
>   > [glusterd-ganesha.c:708:pre_setup]
> 0-THIS->name:
> mkdir() failed on path
>   >
> /var/run/gluster/shared_storage/nfs-ganesha, [No
> such file or directory]
>   >
>   > - Then I found out that the command
> connects to
> the arbiter node as
>   > well, but obviously I don't want to set up
> NFS-Ganesha there. Is it
>   > actually possible to setup NFS-Ganesha
> HA with an
> arbiter node? If it's
>   > possible, is there any documentation on
> how to do
> that?
>   >
>
>   Please send the
> /etc/ganesha/ganesha-ha.conf file
> you're using.
>
>   Probably you have included the arbiter in
> your HA
> config; that would be
>   a mistake.
>
>   --
>
>   Kaleb
>
>
>
>
> --
> Tiemen Ruiten
> Systems Engineer
> R&D Media
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
> --
> Tiemen Ruiten
> Systems Engineer
> R&D Media
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
> --
> Tiemen Ruiten
> Systems Engineer
> R&D Media
>
>
>
> --
> Tiemen Ruiten
> Systems Engineer
> R&D Media
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
> --
> Tiemen Ruiten
> Systems Engineer
> R&D Media