[Gluster-users] Fwd: nfs-ganesha HA with arbiter volume

Soumya Koduri skoduri at redhat.com
Tue Sep 22 17:20:28 UTC 2015



On 09/22/2015 02:35 PM, Tiemen Ruiten wrote:
> I had indeed missed setting up passwordless SSH auth for the root user.
> However, adding it did not make a difference:
>
> After verifying the prerequisites, I issued 'gluster nfs-ganesha enable' on node
> cobalt:
>
> Sep 22 10:19:56 cobalt systemd: Starting Preprocess NFS configuration...
> Sep 22 10:19:56 cobalt systemd: Starting RPC Port Mapper.
> Sep 22 10:19:56 cobalt systemd: Reached target RPC Port Mapper.
> Sep 22 10:19:56 cobalt systemd: Starting Host and Network Name Lookups.
> Sep 22 10:19:56 cobalt systemd: Reached target Host and Network Name
> Lookups.
> Sep 22 10:19:56 cobalt systemd: Starting RPC bind service...
> Sep 22 10:19:56 cobalt systemd: Started Preprocess NFS configuration.
> Sep 22 10:19:56 cobalt systemd: Started RPC bind service.
> Sep 22 10:19:56 cobalt systemd: Starting NFS status monitor for NFSv2/3
> locking....
> Sep 22 10:19:56 cobalt rpc.statd[2666]: Version 1.3.0 starting
> Sep 22 10:19:56 cobalt rpc.statd[2666]: Flags: TI-RPC
> Sep 22 10:19:56 cobalt systemd: Started NFS status monitor for NFSv2/3
> locking..
> Sep 22 10:19:56 cobalt systemd: Starting NFS-Ganesha file server...
> Sep 22 10:19:56 cobalt systemd: Started NFS-Ganesha file server.
> Sep 22 10:19:56 cobalt kernel: warning: `ganesha.nfsd' uses 32-bit
> capabilities (legacy support in use)
> Sep 22 10:19:56 cobalt rpc.statd[2666]: Received SM_UNMON_ALL request
> from cobalt.int.rdmedia.com while not
> monitoring any hosts
> Sep 22 10:19:56 cobalt logger: setting up rd-ganesha-ha
> Sep 22 10:19:56 cobalt logger: setting up cluster rd-ganesha-ha with the
> following cobalt iron
> Sep 22 10:19:57 cobalt systemd: Stopped Pacemaker High Availability
> Cluster Manager.
> Sep 22 10:19:57 cobalt systemd: Stopped Corosync Cluster Engine.
> Sep 22 10:19:57 cobalt systemd: Reloading.
> Sep 22 10:19:57 cobalt systemd:
> [/usr/lib/systemd/system/dm-event.socket:10] Unknown lvalue
> 'RemoveOnStop' in section 'Socket'
> Sep 22 10:19:57 cobalt systemd:
> [/usr/lib/systemd/system/lvm2-lvmetad.socket:9] Unknown lvalue
> 'RemoveOnStop' in section 'Socket'
> Sep 22 10:19:57 cobalt systemd: Reloading.
> Sep 22 10:19:57 cobalt systemd:
> [/usr/lib/systemd/system/dm-event.socket:10] Unknown lvalue
> 'RemoveOnStop' in section 'Socket'
> Sep 22 10:19:57 cobalt systemd:
> [/usr/lib/systemd/system/lvm2-lvmetad.socket:9] Unknown lvalue
> 'RemoveOnStop' in section 'Socket'
> Sep 22 10:19:57 cobalt systemd: Starting Corosync Cluster Engine...
> Sep 22 10:19:57 cobalt corosync[2815]: [MAIN  ] Corosync Cluster Engine
> ('2.3.4'): started and ready to provide service.
> Sep 22 10:19:57 cobalt corosync[2815]: [MAIN  ] Corosync built-in
> features: dbus systemd xmlconf snmp pie relro bindnow
> Sep 22 10:19:57 cobalt corosync[2816]: [TOTEM ] Initializing transport
> (UDP/IP Unicast).
> Sep 22 10:19:57 cobalt corosync[2816]: [TOTEM ] Initializing
> transmit/receive security (NSS) crypto: none hash: none
> Sep 22 10:19:58 cobalt corosync[2816]: [TOTEM ] The network interface
> [10.100.30.37] is now up.
> Sep 22 10:19:58 cobalt corosync[2816]: [SERV  ] Service engine loaded:
> corosync configuration map access [0]
> Sep 22 10:19:58 cobalt corosync[2816]: [QB    ] server name: cmap
> Sep 22 10:19:58 cobalt corosync[2816]: [SERV  ] Service engine loaded:
> corosync configuration service [1]
> Sep 22 10:19:58 cobalt corosync[2816]: [QB    ] server name: cfg
> Sep 22 10:19:58 cobalt corosync[2816]: [SERV  ] Service engine loaded:
> corosync cluster closed process group service v1.01 [2]
> Sep 22 10:19:58 cobalt corosync[2816]: [QB    ] server name: cpg
> Sep 22 10:19:58 cobalt corosync[2816]: [SERV  ] Service engine loaded:
> corosync profile loading service [4]
> Sep 22 10:19:58 cobalt corosync[2816]: [QUORUM] Using quorum provider
> corosync_votequorum
> Sep 22 10:19:58 cobalt corosync[2816]: [VOTEQ ] Waiting for all cluster
> members. Current votes: 1 expected_votes: 2
> Sep 22 10:19:58 cobalt corosync[2816]: [SERV  ] Service engine loaded:
> corosync vote quorum service v1.0 [5]
> Sep 22 10:19:58 cobalt corosync[2816]: [QB    ] server name: votequorum
> Sep 22 10:19:58 cobalt corosync[2816]: [SERV  ] Service engine loaded:
> corosync cluster quorum service v0.1 [3]
> Sep 22 10:19:58 cobalt corosync[2816]: [QB    ] server name: quorum
> Sep 22 10:19:58 cobalt corosync[2816]: [TOTEM ] adding new UDPU member
> {10.100.30.37}
> Sep 22 10:19:58 cobalt corosync[2816]: [TOTEM ] adding new UDPU member
> {10.100.30.38}
> Sep 22 10:19:58 cobalt corosync[2816]: [TOTEM ] A new membership
> (10.100.30.37:140) was formed. Members joined: 1
> Sep 22 10:19:58 cobalt corosync[2816]: [TOTEM ] A new membership
> (10.100.30.37:148) was formed. Members joined: 1
> Sep 22 10:19:58 cobalt corosync[2816]: [VOTEQ ] Waiting for all cluster
> members. Current votes: 1 expected_votes: 2
> Sep 22 10:19:58 cobalt corosync[2816]: [VOTEQ ] Waiting for all cluster
> members. Current votes: 1 expected_votes: 2
> Sep 22 10:19:58 cobalt corosync[2816]: [QUORUM] Members[0]:
> Sep 22 10:19:58 cobalt corosync[2816]: [MAIN  ] Completed service
> synchronization, ready to provide service.
> *Sep 22 10:21:27 cobalt systemd: corosync.service operation timed out.
> Terminating.*
> *Sep 22 10:21:27 cobalt corosync: Starting Corosync Cluster Engine
> (corosync):*
> *Sep 22 10:21:27 cobalt systemd: Failed to start Corosync Cluster Engine.*
> *Sep 22 10:21:27 cobalt systemd: Unit corosync.service entered failed
> state.*
> Sep 22 10:21:32 cobalt logger: warning: pcs property set
> no-quorum-policy=ignore failed
> Sep 22 10:21:32 cobalt logger: warning: pcs property set
> stonith-enabled=false failed
> Sep 22 10:21:32 cobalt logger: warning: pcs resource create nfs_start
> ganesha_nfsd ha_vol_mnt=/var/run/gluster/shared_storage --clone failed
> Sep 22 10:21:33 cobalt logger: warning: pcs resource delete
> nfs_start-clone failed
> Sep 22 10:21:33 cobalt logger: warning: pcs resource create nfs-mon
> ganesha_mon --clone failed
> Sep 22 10:21:33 cobalt logger: warning: pcs resource create nfs-grace
> ganesha_grace --clone failed
> Sep 22 10:21:34 cobalt logger: warning pcs resource create
> cobalt-cluster_ip-1 ocf:heartbeat:IPaddr ip=10.100.30.101
> cidr_netmask=32 op monitor interval=15s failed
> Sep 22 10:21:34 cobalt logger: warning: pcs resource create
> cobalt-trigger_ip-1 ocf:heartbeat:Dummy failed
> Sep 22 10:21:34 cobalt logger: warning: pcs constraint colocation add
> cobalt-cluster_ip-1 with cobalt-trigger_ip-1 failed
> Sep 22 10:21:34 cobalt logger: warning: pcs constraint order
> cobalt-trigger_ip-1 then nfs-grace-clone failed
> Sep 22 10:21:34 cobalt logger: warning: pcs constraint order
> nfs-grace-clone then cobalt-cluster_ip-1 failed
> Sep 22 10:21:34 cobalt logger: warning pcs resource create
> iron-cluster_ip-1 ocf:heartbeat:IPaddr ip=10.100.30.102 cidr_netmask=32
> op monitor interval=15s failed
> Sep 22 10:21:34 cobalt logger: warning: pcs resource create
> iron-trigger_ip-1 ocf:heartbeat:Dummy failed
> Sep 22 10:21:34 cobalt logger: warning: pcs constraint colocation add
> iron-cluster_ip-1 with iron-trigger_ip-1 failed
> Sep 22 10:21:34 cobalt logger: warning: pcs constraint order
> iron-trigger_ip-1 then nfs-grace-clone failed
> Sep 22 10:21:35 cobalt logger: warning: pcs constraint order
> nfs-grace-clone then iron-cluster_ip-1 failed
> Sep 22 10:21:35 cobalt logger: warning: pcs constraint location
> cobalt-cluster_ip-1 rule score=-INFINITY ganesha-active ne 1 failed
> Sep 22 10:21:35 cobalt logger: warning: pcs constraint location
> cobalt-cluster_ip-1 prefers iron=1000 failed
> Sep 22 10:21:35 cobalt logger: warning: pcs constraint location
> cobalt-cluster_ip-1 prefers cobalt=2000 failed
> Sep 22 10:21:35 cobalt logger: warning: pcs constraint location
> iron-cluster_ip-1 rule score=-INFINITY ganesha-active ne 1 failed
> Sep 22 10:21:35 cobalt logger: warning: pcs constraint location
> iron-cluster_ip-1 prefers cobalt=1000 failed
> Sep 22 10:21:35 cobalt logger: warning: pcs constraint location
> iron-cluster_ip-1 prefers iron=2000 failed
> Sep 22 10:21:35 cobalt logger: warning pcs cluster cib-push
> /tmp/tmp.yqLT4m75WG failed
>
> Notice the failed corosync service in bold. I can't find any logs
> pointing to a reason. Starting it manually is not a problem:
>
> Sep 22 10:35:06 cobalt corosync: Starting Corosync Cluster Engine
> (corosync): [  OK  ]
>
> Then I noticed pacemaker was not running on both nodes. Started it
> manually and saw the following in /var/log/messages on the other node:
>
> Sep 22 10:36:43 iron cibadmin[4654]: notice: Invoked: /usr/sbin/cibadmin
> --replace -o configuration -V --xml-pipe
> Sep 22 10:36:43 iron crmd[4617]: notice: State transition S_IDLE ->
> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL
> origin=abort_transition_graph ]
> Sep 22 10:36:44 iron pengine[4616]: notice: On loss of CCM Quorum: Ignore
> Sep 22 10:36:44 iron pengine[4616]: error: Resource start-up disabled
> since no STONITH resources have been defined
> Sep 22 10:36:44 iron pengine[4616]: error: Either configure some or
> disable STONITH with the stonith-enabled option
> Sep 22 10:36:44 iron pengine[4616]: error: NOTE: Clusters with shared
> data need STONITH to ensure data integrity
> Sep 22 10:36:44 iron pengine[4616]: notice: Delaying fencing operations
> until there are resources to manage
> Sep 22 10:36:44 iron pengine[4616]: warning: Node iron is unclean!
> Sep 22 10:36:44 iron pengine[4616]: notice: Cannot fence unclean nodes
> until quorum is attained (or no-quorum-policy is set to ignore)
> Sep 22 10:36:44 iron pengine[4616]: warning: Calculated Transition 2:
> /var/lib/pacemaker/pengine/pe-warn-20.bz2
> Sep 22 10:36:44 iron pengine[4616]: notice: Configuration ERRORs found
> during PE processing.  Please run "crm_verify -L" to identify issues.
> Sep 22 10:36:44 iron crmd[4617]: notice: Transition 2 (Complete=0,
> Pending=0, Fired=0, Skipped=0, Incomplete=0,
> Source=/var/lib/pacemaker/pengine/pe-warn-20.bz2): Complete
> Sep 22 10:36:44 iron crmd[4617]: notice: State transition
> S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL
> origin=notify_crmd ]
>
> I'm starting to think there is some leftover config somewhere from all
> these attempts. Is there a way to completely reset all config related to
> NFS-Ganesha and start over?
>
>
If you disable nfs-ganesha, that should do the cleanup as well:
# gluster nfs-ganesha disable

If you are still in doubt and want to be safe, after disabling nfs-ganesha,
run the cleanup option of the HA script:
# /usr/libexec/ganesha/ganesha-ha.sh --cleanup /etc/ganesha
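If corosync or pacemaker still hold leftover state after that, something
along these lines on each of the two ganesha nodes should give you a clean
slate (a rough sketch, not a verified procedure; note that 'pcs cluster
destroy' wipes the local pacemaker/corosync configuration on the node it is
run on, so run it only on the NFS-Ganesha nodes):

# systemctl stop nfs-ganesha
# pcs cluster destroy
# rm -f /etc/corosync/corosync.conf     (only if anything is left behind)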

Thanks,
Soumya


>
> On 22 September 2015 at 09:04, Soumya Koduri <skoduri at redhat.com> wrote:
>
>     Hi Tiemen,
>
>     I have added the steps to configure HA NFS in the doc below. Please
>     verify that you have all the pre-requisites done and the steps performed right.
>
>     https://github.com/soumyakoduri/glusterdocs/blob/ha_guide/Administrator%20Guide/Configuring%20HA%20NFS%20Server.md
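
Roughly, the pre-requisite checks on each node before re-running
'gluster nfs-ganesha enable' look something like the following (a sketch
only; the host names are the ones from this thread and the rest is generic,
so adjust to your setup):

# systemctl enable pcsd && systemctl start pcsd     (pcsd running on all HA nodes)
# echo <password> | passwd --stdin hacluster        (same hacluster password on all HA nodes)
# pcs cluster auth cobalt iron -u hacluster         (run once from one node)
# ssh root@iron true                                (passwordless root SSH in both directions)
# gluster volume set <volname> nfs.disable on       (Gluster-NFS off for the volumes to be exported)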
>
>     Thanks,
>     Soumya
>
>     On 09/21/2015 09:21 PM, Tiemen Ruiten wrote:
>
>         Whoops, replied off-list.
>
>         Additionally I noticed that the generated corosync config is not
>         valid,
>         as there is no interface section:
>
>         /etc/corosync/corosync.conf
>
>         totem {
>         version: 2
>         secauth: off
>         cluster_name: rd-ganesha-ha
>         transport: udpu
>         }
>
>         nodelist {
>           node {
>                 ring0_addr: cobalt
>                 nodeid: 1
>                }
>           node {
>                 ring0_addr: iron
>                 nodeid: 2
>                }
>         }
>
>         quorum {
>         provider: corosync_votequorum
>         two_node: 1
>         }
>
>         logging {
>         to_syslog: yes
>         }
>
>
>
>
>         ---------- Forwarded message ----------
>         From: *Tiemen Ruiten* <t.ruiten at rdmedia.com>
>         Date: 21 September 2015 at 17:16
>         Subject: Re: [Gluster-users] nfs-ganesha HA with arbiter volume
>         To: Jiffin Tony Thottan <jthottan at redhat.com>
>
>
>         Could you point me to the latest documentation? I've been
>         struggling to
>         find something up-to-date. I believe I have all the prerequisites:
>
>         - shared storage volume exists and is mounted
>         - all nodes in hosts files
>         - Gluster-NFS disabled
>         - corosync, pacemaker and nfs-ganesha RPMs installed
>
>         Anything I missed?
>
>         Everything has been installed from RPMs, so it's in the default locations:
>         /usr/libexec/ganesha/ganesha-ha.sh
>         /etc/ganesha/ganesha.conf (empty)
>         /etc/ganesha/ganesha-ha.conf
>
>         After I started the pcsd service manually, nfs-ganesha could be
>         enabled
>         successfully, but there was no virtual IP present on the
>         interfaces, and
>         looking at the system log I noticed corosync had failed to start:
>
>         - on the host where I issued the gluster nfs-ganesha enable command:
>
>         Sep 21 17:07:18 iron systemd: Starting NFS-Ganesha file server...
>         Sep 21 17:07:19 iron systemd: Started NFS-Ganesha file server.
>         Sep 21 17:07:19 iron rpc.statd[2409]: Received SM_UNMON_ALL
>         request from
>         iron.int.rdmedia.com while not monitoring
>         any hosts
>         Sep 21 17:07:20 iron systemd: Starting Corosync Cluster Engine...
>         Sep 21 17:07:20 iron corosync[3426]: [MAIN  ] Corosync Cluster
>         Engine
>         ('2.3.4'): started and ready to provide service.
>         Sep 21 17:07:20 iron corosync[3426]: [MAIN  ] Corosync built-in
>         features: dbus systemd xmlconf snmp pie relro bindnow
>         Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] Initializing transport
>         (UDP/IP Unicast).
>         Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] Initializing
>         transmit/receive security (NSS) crypto: none hash: none
>         Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] The network interface
>         [10.100.30.38] is now up.
>         Sep 21 17:07:20 iron corosync[3427]: [SERV  ] Service engine
>         loaded:
>         corosync configuration map access [0]
>         Sep 21 17:07:20 iron corosync[3427]: [QB    ] server name: cmap
>         Sep 21 17:07:20 iron corosync[3427]: [SERV  ] Service engine
>         loaded:
>         corosync configuration service [1]
>         Sep 21 17:07:20 iron corosync[3427]: [QB    ] server name: cfg
>         Sep 21 17:07:20 iron corosync[3427]: [SERV  ] Service engine
>         loaded:
>         corosync cluster closed process group service v1.01 [2]
>         Sep 21 17:07:20 iron corosync[3427]: [QB    ] server name: cpg
>         Sep 21 17:07:20 iron corosync[3427]: [SERV  ] Service engine
>         loaded:
>         corosync profile loading service [4]
>         Sep 21 17:07:20 iron corosync[3427]: [QUORUM] Using quorum provider
>         corosync_votequorum
>         Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all
>         cluster
>         members. Current votes: 1 expected_votes: 2
>         Sep 21 17:07:20 iron corosync[3427]: [SERV  ] Service engine
>         loaded:
>         corosync vote quorum service v1.0 [5]
>         Sep 21 17:07:20 iron corosync[3427]: [QB    ] server name:
>         votequorum
>         Sep 21 17:07:20 iron corosync[3427]: [SERV  ] Service engine
>         loaded:
>         corosync cluster quorum service v0.1 [3]
>         Sep 21 17:07:20 iron corosync[3427]: [QB    ] server name: quorum
>         Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] adding new UDPU member
>         {10.100.30.38}
>         Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] adding new UDPU member
>         {10.100.30.37}
>         Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] A new membership
>         (10.100.30.38:104) was formed. Members joined: 1
>         Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all
>         cluster
>         members. Current votes: 1 expected_votes: 2
>         Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all
>         cluster
>         members. Current votes: 1 expected_votes: 2
>         Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all
>         cluster
>         members. Current votes: 1 expected_votes: 2
>         Sep 21 17:07:20 iron corosync[3427]: [QUORUM] Members[1]: 1
>         Sep 21 17:07:20 iron corosync[3427]: [MAIN  ] Completed service
>         synchronization, ready to provide service.
>         Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] A new membership
>         (10.100.30.37:108) was formed. Members joined: 1
>
>         Sep 21 17:08:21 iron corosync: Starting Corosync Cluster Engine
>         (corosync): [FAILED]
>         Sep 21 17:08:21 iron systemd: corosync.service: control process
>         exited,
>         code=exited status=1
>         Sep 21 17:08:21 iron systemd: Failed to start Corosync Cluster
>         Engine.
>         Sep 21 17:08:21 iron systemd: Unit corosync.service entered
>         failed state.
>
>
>         - on the other host:
>
>         Sep 21 17:07:19 cobalt systemd: Starting Preprocess NFS
>         configuration...
>         Sep 21 17:07:19 cobalt systemd: Starting RPC Port Mapper.
>         Sep 21 17:07:19 cobalt systemd: Reached target RPC Port Mapper.
>         Sep 21 17:07:19 cobalt systemd: Starting Host and Network Name
>         Lookups.
>         Sep 21 17:07:19 cobalt systemd: Reached target Host and Network Name
>         Lookups.
>         Sep 21 17:07:19 cobalt systemd: Starting RPC bind service...
>         Sep 21 17:07:19 cobalt systemd: Started Preprocess NFS
>         configuration.
>         Sep 21 17:07:19 cobalt systemd: Started RPC bind service.
>         Sep 21 17:07:19 cobalt systemd: Starting NFS status monitor for
>         NFSv2/3
>         locking....
>         Sep 21 17:07:19 cobalt rpc.statd[2662]: Version 1.3.0 starting
>         Sep 21 17:07:19 cobalt rpc.statd[2662]: Flags: TI-RPC
>         Sep 21 17:07:19 cobalt systemd: Started NFS status monitor for
>         NFSv2/3
>         locking..
>         Sep 21 17:07:19 cobalt systemd: Starting NFS-Ganesha file server...
>         Sep 21 17:07:19 cobalt systemd: Started NFS-Ganesha file server.
>         Sep 21 17:07:19 cobalt kernel: warning: `ganesha.nfsd' uses 32-bit
>         capabilities (legacy support in use)
>         Sep 21 17:07:19 cobalt logger: setting up rd-ganesha-ha
>         Sep 21 17:07:19 cobalt rpc.statd[2662]: Received SM_UNMON_ALL
>         request
>         from cobalt.int.rdmedia.com while not
>         monitoring any hosts
>         Sep 21 17:07:19 cobalt logger: setting up cluster rd-ganesha-ha
>         with the
>         following cobalt iron
>         Sep 21 17:07:20 cobalt systemd: Stopped Pacemaker High Availability
>         Cluster Manager.
>         Sep 21 17:07:20 cobalt systemd: Stopped Corosync Cluster Engine.
>         Sep 21 17:07:20 cobalt systemd: Reloading.
>         Sep 21 17:07:20 cobalt systemd:
>         [/usr/lib/systemd/system/dm-event.socket:10] Unknown lvalue
>         'RemoveOnStop' in section 'Socket'
>         Sep 21 17:07:20 cobalt systemd:
>         [/usr/lib/systemd/system/lvm2-lvmetad.socket:9] Unknown lvalue
>         'RemoveOnStop' in section 'Socket'
>         Sep 21 17:07:20 cobalt systemd: Reloading.
>         Sep 21 17:07:20 cobalt systemd:
>         [/usr/lib/systemd/system/dm-event.socket:10] Unknown lvalue
>         'RemoveOnStop' in section 'Socket'
>         Sep 21 17:07:20 cobalt systemd:
>         [/usr/lib/systemd/system/lvm2-lvmetad.socket:9] Unknown lvalue
>         'RemoveOnStop' in section 'Socket'
>         Sep 21 17:07:20 cobalt systemd: Starting Corosync Cluster Engine...
>         Sep 21 17:07:20 cobalt corosync[2816]: [MAIN  ] Corosync
>         Cluster Engine
>         ('2.3.4'): started and ready to provide service.
>         Sep 21 17:07:20 cobalt corosync[2816]: [MAIN  ] Corosync built-in
>         features: dbus systemd xmlconf snmp pie relro bindnow
>         Sep 21 17:07:20 cobalt corosync[2817]: [TOTEM ] Initializing
>         transport
>         (UDP/IP Unicast).
>         Sep 21 17:07:20 cobalt corosync[2817]: [TOTEM ] Initializing
>         transmit/receive security (NSS) crypto: none hash: none
>         Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] The network
>         interface
>         [10.100.30.37] is now up.
>         Sep 21 17:07:21 cobalt corosync[2817]: [SERV  ] Service engine
>         loaded:
>         corosync configuration map access [0]
>         Sep 21 17:07:21 cobalt corosync[2817]: [QB    ] server name: cmap
>         Sep 21 17:07:21 cobalt corosync[2817]: [SERV  ] Service engine
>         loaded:
>         corosync configuration service [1]
>         Sep 21 17:07:21 cobalt corosync[2817]: [QB    ] server name: cfg
>         Sep 21 17:07:21 cobalt corosync[2817]: [SERV  ] Service engine
>         loaded:
>         corosync cluster closed process group service v1.01 [2]
>         Sep 21 17:07:21 cobalt corosync[2817]: [QB    ] server name: cpg
>         Sep 21 17:07:21 cobalt corosync[2817]: [SERV  ] Service engine
>         loaded:
>         corosync profile loading service [4]
>         Sep 21 17:07:21 cobalt corosync[2817]: [QUORUM] Using quorum
>         provider
>         corosync_votequorum
>         Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all
>         cluster
>         members. Current votes: 1 expected_votes: 2
>         Sep 21 17:07:21 cobalt corosync[2817]: [SERV  ] Service engine
>         loaded:
>         corosync vote quorum service v1.0 [5]
>         Sep 21 17:07:21 cobalt corosync[2817]: [QB    ] server name:
>         votequorum
>         Sep 21 17:07:21 cobalt corosync[2817]: [SERV  ] Service engine
>         loaded:
>         corosync cluster quorum service v0.1 [3]
>         Sep 21 17:07:21 cobalt corosync[2817]: [QB    ] server name:
>         quorum
>         Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] adding new UDPU
>         member
>         {10.100.30.37}
>         Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] adding new UDPU
>         member
>         {10.100.30.38}
>         Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] A new membership
>         (10.100.30.37:100) was formed. Members joined: 1
>         Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all
>         cluster
>         members. Current votes: 1 expected_votes: 2
>         Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all
>         cluster
>         members. Current votes: 1 expected_votes: 2
>         Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all
>         cluster
>         members. Current votes: 1 expected_votes: 2
>         Sep 21 17:07:21 cobalt corosync[2817]: [QUORUM] Members[1]: 1
>         Sep 21 17:07:21 cobalt corosync[2817]: [MAIN  ] Completed service
>         synchronization, ready to provide service.
>         Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] A new membership
>         (10.100.30.37:108) was formed. Members joined: 1
>         Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all
>         cluster
>         members. Current votes: 1 expected_votes: 2
>         Sep 21 17:07:21 cobalt corosync[2817]: [QUORUM] Members[1]: 1
>         Sep 21 17:07:21 cobalt corosync[2817]: [MAIN  ] Completed service
>         synchronization, ready to provide service.
>         Sep 21 17:08:50 cobalt systemd: corosync.service operation timed
>         out.
>         Terminating.
>         Sep 21 17:08:50 cobalt corosync: Starting Corosync Cluster Engine
>         (corosync):
>         Sep 21 17:08:50 cobalt systemd: Failed to start Corosync Cluster
>         Engine.
>         Sep 21 17:08:50 cobalt systemd: Unit corosync.service entered
>         failed state.
>         Sep 21 17:08:55 cobalt logger: warning: pcs property set
>         no-quorum-policy=ignore failed
>         Sep 21 17:08:55 cobalt logger: warning: pcs property set
>         stonith-enabled=false failed
>         Sep 21 17:08:55 cobalt logger: warning: pcs resource create
>         nfs_start
>         ganesha_nfsd ha_vol_mnt=/var/run/gluster/shared_storage --clone
>         failed
>         Sep 21 17:08:56 cobalt logger: warning: pcs resource delete
>         nfs_start-clone failed
>         Sep 21 17:08:56 cobalt logger: warning: pcs resource create nfs-mon
>         ganesha_mon --clone failed
>         Sep 21 17:08:56 cobalt logger: warning: pcs resource create
>         nfs-grace
>         ganesha_grace --clone failed
>         Sep 21 17:08:57 cobalt logger: warning pcs resource create
>         cobalt-cluster_ip-1 ocf:heartbeat:IPaddr ip= cidr_netmask=32 op
>         monitor
>         interval=15s failed
>         Sep 21 17:08:57 cobalt logger: warning: pcs resource create
>         cobalt-trigger_ip-1 ocf:heartbeat:Dummy failed
>         Sep 21 17:08:57 cobalt logger: warning: pcs constraint
>         colocation add
>         cobalt-cluster_ip-1 with cobalt-trigger_ip-1 failed
>         Sep 21 17:08:57 cobalt logger: warning: pcs constraint order
>         cobalt-trigger_ip-1 then nfs-grace-clone failed
>         Sep 21 17:08:57 cobalt logger: warning: pcs constraint order
>         nfs-grace-clone then cobalt-cluster_ip-1 failed
>         Sep 21 17:08:57 cobalt logger: warning pcs resource create
>         iron-cluster_ip-1 ocf:heartbeat:IPaddr ip= cidr_netmask=32 op
>         monitor
>         interval=15s failed
>         Sep 21 17:08:57 cobalt logger: warning: pcs resource create
>         iron-trigger_ip-1 ocf:heartbeat:Dummy failed
>         Sep 21 17:08:57 cobalt logger: warning: pcs constraint
>         colocation add
>         iron-cluster_ip-1 with iron-trigger_ip-1 failed
>         Sep 21 17:08:57 cobalt logger: warning: pcs constraint order
>         iron-trigger_ip-1 then nfs-grace-clone failed
>         Sep 21 17:08:58 cobalt logger: warning: pcs constraint order
>         nfs-grace-clone then iron-cluster_ip-1 failed
>         Sep 21 17:08:58 cobalt logger: warning: pcs constraint location
>         cobalt-cluster_ip-1 rule score=-INFINITY ganesha-active ne 1 failed
>         Sep 21 17:08:58 cobalt logger: warning: pcs constraint location
>         cobalt-cluster_ip-1 prefers iron=1000 failed
>         Sep 21 17:08:58 cobalt logger: warning: pcs constraint location
>         cobalt-cluster_ip-1 prefers cobalt=2000 failed
>         Sep 21 17:08:58 cobalt logger: warning: pcs constraint location
>         iron-cluster_ip-1 rule score=-INFINITY ganesha-active ne 1 failed
>         Sep 21 17:08:58 cobalt logger: warning: pcs constraint location
>         iron-cluster_ip-1 prefers cobalt=1000 failed
>         Sep 21 17:08:58 cobalt logger: warning: pcs constraint location
>         iron-cluster_ip-1 prefers iron=2000 failed
>         Sep 21 17:08:58 cobalt logger: warning pcs cluster cib-push
>         /tmp/tmp.nXTfyA1GMR failed
>         Sep 21 17:08:58 cobalt logger: warning: scp ganesha-ha.conf to
>         cobalt failed
>
>         BTW, I'm using CentOS 7. There are multiple network interfaces
>         on the
>         servers, could that be a problem?
>
>
>
>
>         On 21 September 2015 at 11:48, Jiffin Tony Thottan
>         <jthottan at redhat.com> wrote:
>
>
>
>              On 21/09/15 13:56, Tiemen Ruiten wrote:
>
>                  Hello Soumya, Kaleb, list,
>
>                  This Friday I created the gluster_shared_storage volume
>             manually;
>                  I just tried it with the command you supplied, but both
>             approaches have the
>                  same result:
>
>                  from etc-glusterfs-glusterd.vol.log on the node where I
>             issued the
>                  command:
>
>                  [2015-09-21 07:59:47.756845] I [MSGID: 106474]
>                  [glusterd-ganesha.c:403:check_host_list] 0-management:
>             ganesha
>                  host found Hostname is cobalt
>                  [2015-09-21 07:59:48.071755] I [MSGID: 106474]
>                  [glusterd-ganesha.c:349:is_ganesha_host] 0-management:
>             ganesha
>                  host found Hostname is cobalt
>                  [2015-09-21 07:59:48.653879] E [MSGID: 106470]
>                  [glusterd-ganesha.c:264:glusterd_op_set_ganesha]
>             0-management:
>                  Initial NFS-Ganesha set up failed
>
>
>              As far as I understand from the logs, it called
>              setup_cluster() [which calls the `ganesha-ha.sh` script], but the
>              script failed.
>              Can you please provide the following details:
>              - Location of the ganesha-ha.sh file?
>              - Location of the ganesha-ha.conf and ganesha.conf files?
>
>
>              Also, can you cross-check whether all the prerequisites for the HA
>              setup are satisfied?
>
>              --
>              With Regards,
>              Jiffin
>
>
>                  [2015-09-21 07:59:48.653912] E [MSGID: 106123]
>                  [glusterd-syncop.c:1404:gd_commit_op_phase]
>             0-management: Commit
>                  of operation 'Volume (null)' failed on localhost :
>             Failed to set
>                  up HA config for NFS-Ganesha. Please check the log file
>             for details
>                  [2015-09-21 07:59:45.402458] I [MSGID: 106006]
>                  [glusterd-svc-mgmt.c:323:glusterd_svc_common_rpc_notify]
>                  0-management: nfs has disconnected from glusterd.
>                  [2015-09-21 07:59:48.071578] I [MSGID: 106474]
>                  [glusterd-ganesha.c:403:check_host_list] 0-management:
>             ganesha
>                  host found Hostname is cobalt
>
>                  from etc-glusterfs-glusterd.vol.log on the other node:
>
>                  [2015-09-21 08:12:50.111877] E [MSGID: 106062]
>                  [glusterd-op-sm.c:3698:glusterd_op_ac_unlock]
>             0-management: Unable
>                  to acquire volname
>                  [2015-09-21 08:14:50.548087] E [MSGID: 106062]
>                  [glusterd-op-sm.c:3635:glusterd_op_ac_lock]
>             0-management: Unable
>                  to acquire volname
>                  [2015-09-21 08:14:50.654746] I [MSGID: 106132]
>                  [glusterd-proc-mgmt.c:83:glusterd_proc_stop]
>             0-management: nfs
>                  already stopped
>                  [2015-09-21 08:14:50.655095] I [MSGID: 106474]
>                  [glusterd-ganesha.c:403:check_host_list] 0-management:
>             ganesha
>                  host found Hostname is cobalt
>                  [2015-09-21 08:14:51.287156] E [MSGID: 106062]
>                  [glusterd-op-sm.c:3698:glusterd_op_ac_unlock]
>             0-management: Unable
>                  to acquire volname
>
>
>                  from etc-glusterfs-glusterd.vol.log on the arbiter node:
>
>                  [2015-09-21 08:18:50.934713] E [MSGID: 101075]
>                  [common-utils.c:3127:gf_is_local_addr] 0-management:
>             error in
>                  getaddrinfo: Name or service not known
>                  [2015-09-21 08:18:51.504694] E [MSGID: 106062]
>                  [glusterd-op-sm.c:3698:glusterd_op_ac_unlock]
>             0-management: Unable
>                  to acquire volname
>
>                  I have put the hostnames of all servers in my
>             /etc/hosts file,
>                  including the arbiter node.
>
>
>                  On 18 September 2015 at 16:52, Soumya Koduri
>             <skoduri at redhat.com> wrote:
>
>                      Hi Tiemen,
>
>                      One of the pre-requisites before setting up
>             nfs-ganesha HA is
>                      to create and mount the shared_storage volume. Use
>             the below CLI for that:
>
>                      "gluster volume set all
>             cluster.enable-shared-storage enable"
>
>                      It shall create the volume and mount it on all the nodes
>                      (including the arbiter node). Note this volume will be
>                      mounted on all the nodes of the Gluster storage
>             pool (even though
>                      in this case not all of them may be part of the nfs-ganesha
>             cluster).
>
>                      So instead of manually creating those directory
>             paths, please
>                      use the above CLI and try re-configuring the setup.
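
A quick way to double-check that on each node (a sketch, using the default
volume name and the mount point mentioned earlier in this thread):

# gluster volume info gluster_shared_storage
# df -h /var/run/gluster/shared_storage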
>
>                      Thanks,
>                      Soumya
>
>                      On 09/18/2015 07:29 PM, Tiemen Ruiten wrote:
>
>                          Hello Kaleb,
>
>                          I don't:
>
>                          # Name of the HA cluster created.
>                          # must be unique within the subnet
>                          HA_NAME="rd-ganesha-ha"
>                          #
>                          # The gluster server from which to mount the
>             shared data
>                          volume.
>                          HA_VOL_SERVER="iron"
>                          #
>                          # N.B. you may use short names or long names;
>             you may not
>                          use IP addrs.
>                          # Once you select one, stay with it as it will
>             be mildly
>                          unpleasant to
>                          # clean up if you switch later on. Ensure that
>             all names -
>                          short and/or
>                          # long - are in DNS or /etc/hosts on all
>             machines in the
>                          cluster.
>                          #
>                          # The subset of nodes of the Gluster Trusted
>             Pool that
>                          form the ganesha
>                          # HA cluster. Hostname is specified.
>                          HA_CLUSTER_NODES="cobalt,iron"
>                          #HA_CLUSTER_NODES="server1.lab.redhat.com,server2.lab.redhat.com,..."
>                          #
>                          # Virtual IPs for each of the nodes specified
>             above.
>                          VIP_server1="10.100.30.101"
>                          VIP_server2="10.100.30.102"
>                          #VIP_server1_lab_redhat_com="10.0.2.1"
>                          #VIP_server2_lab_redhat_com="10.0.2.2"
>
>                          Hosts cobalt & iron are the data nodes; the arbiter
>                          IP/hostname (neon)
>                          isn't mentioned anywhere in this config file.
>
>
>                          On 18 September 2015 at 15:56, Kaleb S. KEITHLEY
>                          <kkeithle at redhat.com> wrote:
>
>                              On 09/18/2015 09:46 AM, Tiemen Ruiten wrote:
>                              > Hello,
>                              >
>                              > I have a Gluster cluster with a single replica 3,
>                              > arbiter 1 volume (so two nodes with actual data,
>                              > one arbiter node). I would like to set up
>                              > NFS-Ganesha HA for this volume but I'm having
>                              > some difficulties.
>                              >
>                              > - I needed to create a directory
>                              > /var/run/gluster/shared_storage manually on all
>                              > nodes, or the command 'gluster nfs-ganesha enable'
>                              > would fail with the following error:
>                              > [2015-09-18 13:13:34.690416] E [MSGID: 106032]
>                              > [glusterd-ganesha.c:708:pre_setup] 0-THIS->name:
>                              > mkdir() failed on path
>                              > /var/run/gluster/shared_storage/nfs-ganesha, [No
>                              > such file or directory]
>                              >
>                              > - Then I found out that the command connects to
>                              > the arbiter node as well, but obviously I don't
>                              > want to set up NFS-Ganesha there. Is it actually
>                              > possible to set up NFS-Ganesha HA with an arbiter
>                              > node? If it's possible, is there any documentation
>                              > on how to do that?
>                              >
>
>                              Please send the /etc/ganesha/ganesha-ha.conf file
>                              you're using.
>
>                              Probably you have included the arbiter in your HA
>                              config; that would be a mistake.
>
>                              --
>
>                              Kaleb
>
>
>
>
>                          --
>                          Tiemen Ruiten
>                          Systems Engineer
>                          R&D Media
>
>
>                          _______________________________________________
>                          Gluster-users mailing list
>             Gluster-users at gluster.org
>             http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
>                  --
>                  Tiemen Ruiten
>                  Systems Engineer
>                  R&D Media
>
>
>                  _______________________________________________
>                  Gluster-users mailing list
>             Gluster-users at gluster.org
>             http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
>              _______________________________________________
>              Gluster-users mailing list
>         Gluster-users at gluster.org
>         http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
>         --
>         Tiemen Ruiten
>         Systems Engineer
>         R&D Media
>
>
>
>         --
>         Tiemen Ruiten
>         Systems Engineer
>         R&D Media
>
>
>         _______________________________________________
>         Gluster-users mailing list
>         Gluster-users at gluster.org
>         http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
> --
> Tiemen Ruiten
> Systems Engineer
> R&D Media

