[Gluster-users] Questions on ganesha HA and shared storage size
Alessandro De Salvo
Alessandro.DeSalvo at roma1.infn.it
Tue Jun 9 10:27:41 UTC 2015
Hi,
> On 9 Jun 2015, at 11:46, Soumya Koduri <skoduri at redhat.com> wrote:
>
>
>
> On 06/09/2015 02:48 PM, Alessandro De Salvo wrote:
>> Hi,
>> OK, the problem with the VIPs not starting is due to the ganesha_mon
>> heartbeat script looking for a pid file called
>> /var/run/ganesha.nfsd.pid, while by default ganesha.nfsd v.2.2.0 is
>> creating /var/run/ganesha.pid, this needs to be corrected. The file is
>> in glusterfs-ganesha-3.7.1-1.el7.x86_64, in my case.
>> For the moment I have created a symlink in this way and it works:
>>
>> ln -s /var/run/ganesha.pid /var/run/ganesha.nfsd.pid
>>
> Thanks. Please update this as well in the bug.
Done :-)
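A side note on the pid-file workaround above: /var/run is a tmpfs, so the symlink disappears at reboot. One way to recreate it at boot is a systemd-tmpfiles entry. This is only a sketch, assuming systemd-tmpfiles is in use; the file name is arbitrary, and it is written to /tmp here purely for illustration:

```shell
# tmpfiles.d(5) 'L' entries create symlinks at boot.
# Hypothetical drop-in; the real install path would be /etc/tmpfiles.d/.
cat > /tmp/ganesha-pidfile.conf <<'EOF'
# type  path                   mode uid gid age argument
L       /run/ganesha.nfsd.pid  -    -   -   -   /run/ganesha.pid
EOF
cat /tmp/ganesha-pidfile.conf
```

Installed under /etc/tmpfiles.d/, it would be applied with `systemd-tmpfiles --create` or at the next boot.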
>
>> So far so good, the VIPs are up and pingable, but still there is the
>> problem of the hanging showmount (i.e. hanging RPC).
>> Still, I see a lot of errors like this in /var/log/messages:
>>
>> Jun 9 11:15:20 atlas-node1 lrmd[31221]: notice: operation_finished:
>> nfs-mon_monitor_10000:29292:stderr [ Error: Resource does not exist. ]
>>
>> While ganesha.log shows the server is not in grace:
>>
>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29964[main] main :MAIN :EVENT :ganesha.nfsd Starting:
>> Ganesha Version /builddir/build/BUILD/nfs-ganesha-2.2.0/src, built at
>> May 18 2015 14:17:18 on buildhw-09.phx2.fedoraproject.org
>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] nfs_set_param_from_conf :NFS STARTUP :EVENT
>> :Configuration file successfully parsed
>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] init_server_pkgs :NFS STARTUP :EVENT
>> :Initializing ID Mapper.
>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper
>> successfully initialized.
>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] main :NFS STARTUP :WARN :No export entries
>> found in configuration file !!!
>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] config_errs_to_log :CONFIG :WARN :Config File
>> ((null):0): Empty configuration file
>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] lower_my_caps :NFS STARTUP :EVENT
>> :CAP_SYS_RESOURCE was successfully removed for proper quota management
>> in FSAL
>> 09/06/2015 11:16:20 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] lower_my_caps :NFS STARTUP :EVENT :currenty set
>> capabilities are: =
>> cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap+ep
>> 09/06/2015 11:16:21 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] nfs_Init_svc :DISP :CRIT :Cannot acquire
>> credentials for principal nfs
>> 09/06/2015 11:16:21 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] nfs_Init_admin_thread :NFS CB :EVENT :Admin
>> thread initialized
>> 09/06/2015 11:16:21 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] nfs4_start_grace :STATE :EVENT :NFS Server Now
>> IN GRACE, duration 60
>> 09/06/2015 11:16:21 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] nfs_rpc_cb_init_ccache :NFS STARTUP :EVENT
>> :Callback creds directory (/var/run/ganesha) already exists
>> 09/06/2015 11:16:21 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] nfs_rpc_cb_init_ccache :NFS STARTUP :WARN
>> :gssd_refresh_krb5_machine_credential failed (2:2)
>> 09/06/2015 11:16:21 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] nfs_Start_threads :THREAD :EVENT :Starting
>> delayed executor.
>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] nfs_Start_threads :THREAD :EVENT :9P/TCP
>> dispatcher thread was started successfully
>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[_9p_disp] _9p_dispatcher_thread :9P DISP :EVENT :9P
>> dispatcher started
>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] nfs_Start_threads :THREAD :EVENT
>> :gsh_dbusthread was started successfully
>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] nfs_Start_threads :THREAD :EVENT :admin thread
>> was started successfully
>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] nfs_Start_threads :THREAD :EVENT :reaper thread
>> was started successfully
>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[reaper] nfs_in_grace :STATE :EVENT :NFS Server Now IN
>> GRACE
>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] nfs_Start_threads :THREAD :EVENT :General
>> fridge was started successfully
>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] nfs_start :NFS STARTUP :EVENT
>> :-------------------------------------------------
>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] nfs_start :NFS STARTUP :EVENT : NFS
>> SERVER INITIALIZED
>> 09/06/2015 11:16:22 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[main] nfs_start :NFS STARTUP :EVENT
>> :-------------------------------------------------
>> 09/06/2015 11:17:22 : epoch 5576aee4 : atlas-node1 :
>> ganesha.nfsd-29965[reaper] nfs_in_grace :STATE :EVENT :NFS Server Now
>> NOT IN GRACE
>>
>>
> Please check the status of nfs-ganesha
> $service nfs-ganesha status
It’s fine:
# service nfs-ganesha status
Redirecting to /bin/systemctl status nfs-ganesha.service
nfs-ganesha.service - NFS-Ganesha file server
Loaded: loaded (/usr/lib/systemd/system/nfs-ganesha.service; enabled)
Active: active (running) since Tue 2015-06-09 11:54:39 CEST; 32min ago
Docs: http://github.com/nfs-ganesha/nfs-ganesha/wiki
Process: 28081 ExecStop=/bin/dbus-send --system --dest=org.ganesha.nfsd --type=method_call /org/ganesha/nfsd/admin org.ganesha.nfsd.admin.shutdown (code=exited, status=0/SUCCESS)
Process: 28425 ExecStartPost=/bin/bash -c prlimit --pid $MAINPID --nofile=$NOFILE:$NOFILE (code=exited, status=0/SUCCESS)
Process: 28423 ExecStart=/usr/bin/ganesha.nfsd $OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 28424 (ganesha.nfsd)
CGroup: /system.slice/nfs-ganesha.service
└─28424 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT -p /var/run/ganesha.nfsd.pid
>
> Could you try taking a packet trace (during showmount or mount) and check the server responses.
The problem is that the portmapper part works, but then nothing happens: the V3 EXPORT call is ACKed at the TCP level yet never answered, and the client gives up after 20 seconds:
3785 0.652843 x.x.x.2 -> x.x.x.1 Portmap 98 V2 GETPORT Call MOUNT(100005) V:3 TCP
3788 0.653339 x.x.x.1 -> x.x.x.2 Portmap 70 V2 GETPORT Reply (Call In 3785) Port:33645
3789 0.653756 x.x.x.2 -> x.x.x.1 TCP 74 50774 > 33645 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=73312128 TSecr=0 WS=128
3790 0.653784 x.x.x.1 -> x.x.x.2 TCP 74 33645 > 50774 [SYN, ACK] Seq=0 Ack=1 Win=14480 Len=0 MSS=1460 SACK_PERM=1 TSval=132248576 TSecr=73312128 WS=128
3791 0.654004 x.x.x.2 -> x.x.x.1 TCP 66 50774 > 33645 [ACK] Seq=1 Ack=1 Win=29312 Len=0 TSval=73312128 TSecr=132248576
3793 0.654174 x.x.x.2 -> x.x.x.1 MOUNT 158 V3 EXPORT Call
3794 0.654184 x.x.x.1 -> x.x.x.2 TCP 66 33645 > 50774 [ACK] Seq=1 Ack=93 Win=14592 Len=0 TSval=132248576 TSecr=73312129
86065 20.674219 x.x.x.2 -> x.x.x.1 TCP 66 50774 > 33645 [FIN, ACK] Seq=93 Ack=1 Win=29312 Len=0 TSval=73332149 TSecr=132248576
86247 20.713745 x.x.x.1 -> x.x.x.2 TCP 66 33645 > 50774 [ACK] Seq=1 Ack=94 Win=14592 Len=0 TSval=132268636 TSecr=73332149
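A generic way to spot this kind of hang in a capture: export the frames to text and look for RPC Calls that are never referenced by a matching Reply. A toy sketch over an abridged copy of the trace above (frame numbers are from the paste; the file path is arbitrary):

```shell
# Save an abridged text export of the capture.
cat > /tmp/trace.txt <<'EOF'
3785 Portmap V2 GETPORT Call
3788 Portmap V2 GETPORT Reply (Call In 3785)
3793 MOUNT V3 EXPORT Call
EOF
# Print Calls that no Reply line points back to ("Call In <frame>").
while read -r frame rest; do
    case "$rest" in
        *"Call") grep -q "Call In $frame" /tmp/trace.txt \
                     || echo "unanswered: $frame $rest" ;;
    esac
done < /tmp/trace.txt
# prints "unanswered: 3793 MOUNT V3 EXPORT Call"
```

Here the GETPORT call has its reply, while the EXPORT call does not, which matches the hanging showmount.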
Cheers,
Alessandro
>
> Thanks,
> Soumya
>
>> Cheers,
>>
>> Alessandro
>>
>>
>>> On 9 Jun 2015, at 10:36, Alessandro De Salvo
>>> <alessandro.desalvo at roma1.infn.it
>>> <mailto:alessandro.desalvo at roma1.infn.it>> wrote:
>>>
>>> Hi Soumya,
>>>
>>>> On 9 Jun 2015, at 08:06, Soumya Koduri
>>>> <skoduri at redhat.com <mailto:skoduri at redhat.com>> wrote:
>>>>
>>>>
>>>>
>>>> On 06/09/2015 01:31 AM, Alessandro De Salvo wrote:
>>>>> OK, I found at least one of the bugs.
>>>>> The /usr/libexec/ganesha/ganesha.sh has the following lines:
>>>>>
>>>>> if [ -e /etc/os-release ]; then
>>>>> RHEL6_PCS_CNAME_OPTION=""
>>>>> fi
>>>>>
>>>>> This is OK for RHEL < 7, but does not work for >= 7. I have changed
>>>>> it to the following, to make it working:
>>>>>
>>>>> if [ -e /etc/os-release ]; then
>>>>> eval $(grep -F "REDHAT_SUPPORT_PRODUCT=" /etc/os-release)
>>>>> [ "$REDHAT_SUPPORT_PRODUCT" == "Fedora" ] &&
>>>>> RHEL6_PCS_CNAME_OPTION=""
>>>>> fi
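A variant of the quoted fix, sketched with the standard os-release(5) key `ID` instead of the Red Hat-specific `REDHAT_SUPPORT_PRODUCT` one. This mirrors the same logic (clear the option only on Fedora) and is only a sketch, not the shipped script:

```shell
# Use the os-release(5) ID key to decide whether pcs takes the
# RHEL6-style --name option; only the Fedora case is cleared here,
# matching the fix quoted above.
if [ -e /etc/os-release ]; then
    . /etc/os-release
    case "${ID}" in
        fedora) RHEL6_PCS_CNAME_OPTION="" ;;
    esac
fi
echo "cname option: '${RHEL6_PCS_CNAME_OPTION-unset}'"
```

Sourcing /etc/os-release gives `ID`, `VERSION_ID`, etc. directly, avoiding the grep/eval round-trip.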
>>>>>
>>>> Oh..Thanks for the fix. Could you please file a bug for the same (and
>>>> probably submit your fix as well). We shall have it corrected.
>>>
>>> Just did it, https://bugzilla.redhat.com/show_bug.cgi?id=1229601
>>>
>>>>
>>>>> Apart from that, the VIP_<node> names I was using were wrong: I
>>>>> should have converted all the "-" to underscores. Maybe this could
>>>>> be mentioned in the documentation once you have it ready.
>>>>> Now, the cluster starts, but the VIPs apparently not:
>>>>>
>>>> Sure. Thanks again for pointing it out. We shall make a note of it.
>>>>
>>>>> Online: [ atlas-node1 atlas-node2 ]
>>>>>
>>>>> Full list of resources:
>>>>>
>>>>> Clone Set: nfs-mon-clone [nfs-mon]
>>>>> Started: [ atlas-node1 atlas-node2 ]
>>>>> Clone Set: nfs-grace-clone [nfs-grace]
>>>>> Started: [ atlas-node1 atlas-node2 ]
>>>>> atlas-node1-cluster_ip-1 (ocf::heartbeat:IPaddr): Stopped
>>>>> atlas-node1-trigger_ip-1 (ocf::heartbeat:Dummy): Started atlas-node1
>>>>> atlas-node2-cluster_ip-1 (ocf::heartbeat:IPaddr): Stopped
>>>>> atlas-node2-trigger_ip-1 (ocf::heartbeat:Dummy): Started atlas-node2
>>>>> atlas-node1-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node1
>>>>> atlas-node2-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node2
>>>>>
>>>>> PCSD Status:
>>>>> atlas-node1: Online
>>>>> atlas-node2: Online
>>>>>
>>>>> Daemon Status:
>>>>> corosync: active/disabled
>>>>> pacemaker: active/disabled
>>>>> pcsd: active/enabled
>>>>>
>>>>>
>>>> Here corosync and pacemaker shows 'disabled' state. Can you check the
>>>> status of their services. They should be running prior to cluster
>>>> creation. We need to include that step in document as well.
>>>
>>> Ah, OK, you’re right, I have added it to my puppet modules (we install
>>> and configure ganesha via puppet, I’ll put the module on puppetforge
>>> soon, in case anyone is interested).
>>>
>>>>
>>>>> But the issue that is puzzling me more is the following:
>>>>>
>>>>> # showmount -e localhost
>>>>> rpc mount export: RPC: Timed out
>>>>>
>>>>> And when I try to enable the ganesha exports on a volume I get this
>>>>> error:
>>>>>
>>>>> # gluster volume set atlas-home-01 ganesha.enable on
>>>>> volume set: failed: Failed to create NFS-Ganesha export config file.
>>>>>
>>>>> But I see the file created in /etc/ganesha/exports/*.conf
>>>>> Still, showmount hangs and times out.
>>>>> Any help?
>>>>> Thanks,
>>>>>
>>>> Hmm, that's strange. We have seen such issues when proper cleanup was
>>>> not done before re-creating the cluster.
>>>>
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1227709
>>>>
>>>> http://review.gluster.org/#/c/11093/
>>>>
>>>> Can you please unexport all the volumes, teardown the cluster using
>>>> 'gluster vol set <volname> ganesha.enable off’
>>>
>>> OK:
>>>
>>> # gluster vol set atlas-home-01 ganesha.enable off
>>> volume set: failed: ganesha.enable is already 'off'.
>>>
>>> # gluster vol set atlas-data-01 ganesha.enable off
>>> volume set: failed: ganesha.enable is already 'off'.
>>>
>>>
>>>> 'gluster ganesha disable' command.
>>>
>>> I’m assuming you wanted to write nfs-ganesha instead?
>>>
>>> # gluster nfs-ganesha disable
>>> ganesha enable : success
>>>
>>>
>>> A side note (not really important): it’s strange that when I do a
>>> disable the message is “ganesha enable” :-)
>>>
>>>>
>>>> Verify if the following files have been deleted on all the nodes-
>>>> '/etc/cluster/cluster.conf’
>>>
>>> this file is not present at all, I think it’s not needed in CentOS 7
>>>
>>>> '/etc/ganesha/ganesha.conf’,
>>>
>>> it’s still there, but empty, and I guess it should be OK, right?
>>>
>>>> '/etc/ganesha/exports/*’
>>>
>>> no more files there
>>>
>>>> '/var/lib/pacemaker/cib’
>>>
>>> it’s empty
>>>
>>>>
>>>> Verify if the ganesha service is stopped on all the nodes.
>>>
>>> nope, it’s still running, I will stop it.
>>>
>>>>
>>>> start/restart the services - corosync, pcs.
>>>
>>> On the node where I issued the nfs-ganesha disable there is no longer
>>> an /etc/corosync/corosync.conf, so corosync won’t start. Strangely,
>>> the other node still has the file.
>>>
>>>>
>>>> And re-try the HA cluster creation
>>>> 'gluster ganesha enable’
>>>
>>> This time (repeated twice) it did not work at all:
>>>
>>> # pcs status
>>> Cluster name: ATLAS_GANESHA_01
>>> Last updated: Tue Jun 9 10:13:43 2015
>>> Last change: Tue Jun 9 10:13:22 2015
>>> Stack: corosync
>>> Current DC: atlas-node1 (1) - partition with quorum
>>> Version: 1.1.12-a14efad
>>> 2 Nodes configured
>>> 6 Resources configured
>>>
>>>
>>> Online: [ atlas-node1 atlas-node2 ]
>>>
>>> Full list of resources:
>>>
>>> Clone Set: nfs-mon-clone [nfs-mon]
>>> Started: [ atlas-node1 atlas-node2 ]
>>> Clone Set: nfs-grace-clone [nfs-grace]
>>> Started: [ atlas-node1 atlas-node2 ]
>>> atlas-node2-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node1
>>> atlas-node1-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node2
>>>
>>> PCSD Status:
>>> atlas-node1: Online
>>> atlas-node2: Online
>>>
>>> Daemon Status:
>>> corosync: active/enabled
>>> pacemaker: active/enabled
>>> pcsd: active/enabled
>>>
>>>
>>>
>>> I tried then "pcs cluster destroy" on both nodes, and then again
>>> nfs-ganesha enable, but now I’m back to the old problem:
>>>
>>> # pcs status
>>> Cluster name: ATLAS_GANESHA_01
>>> Last updated: Tue Jun 9 10:22:27 2015
>>> Last change: Tue Jun 9 10:17:00 2015
>>> Stack: corosync
>>> Current DC: atlas-node2 (2) - partition with quorum
>>> Version: 1.1.12-a14efad
>>> 2 Nodes configured
>>> 10 Resources configured
>>>
>>>
>>> Online: [ atlas-node1 atlas-node2 ]
>>>
>>> Full list of resources:
>>>
>>> Clone Set: nfs-mon-clone [nfs-mon]
>>> Started: [ atlas-node1 atlas-node2 ]
>>> Clone Set: nfs-grace-clone [nfs-grace]
>>> Started: [ atlas-node1 atlas-node2 ]
>>> atlas-node1-cluster_ip-1 (ocf::heartbeat:IPaddr): Stopped
>>> atlas-node1-trigger_ip-1 (ocf::heartbeat:Dummy): Started atlas-node1
>>> atlas-node2-cluster_ip-1 (ocf::heartbeat:IPaddr): Stopped
>>> atlas-node2-trigger_ip-1 (ocf::heartbeat:Dummy): Started atlas-node2
>>> atlas-node1-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node1
>>> atlas-node2-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node2
>>>
>>> PCSD Status:
>>> atlas-node1: Online
>>> atlas-node2: Online
>>>
>>> Daemon Status:
>>> corosync: active/enabled
>>> pacemaker: active/enabled
>>> pcsd: active/enabled
>>>
>>>
>>> Cheers,
>>>
>>> Alessandro
>>>
>>>>
>>>>
>>>> Thanks,
>>>> Soumya
>>>>
>>>>> Alessandro
>>>>>
>>>>>> On 8 Jun 2015, at 20:00, Alessandro De Salvo
>>>>>> <Alessandro.DeSalvo at roma1.infn.it
>>>>>> <mailto:Alessandro.DeSalvo at roma1.infn.it>> wrote:
>>>>>>
>>>>>> Hi,
>>>>>> indeed, it does not work :-)
>>>>>> OK, this is what I did, with 2 machines, running CentOS 7.1,
>>>>>> Glusterfs 3.7.1 and nfs-ganesha 2.2.0:
>>>>>>
>>>>>> 1) ensured that the machines are able to resolve their IPs (but
>>>>>> this was already true since they were in the DNS);
>>>>>> 2) disabled NetworkManager and enabled network on both machines;
>>>>>> 3) created a gluster shared volume 'gluster_shared_storage' and
>>>>>> mounted it on '/run/gluster/shared_storage' on all the cluster
>>>>>> nodes using glusterfs native mount (on CentOS 7.1 there is a link
>>>>>> by default /var/run -> ../run)
>>>>>> 4) created an empty /etc/ganesha/ganesha.conf;
>>>>>> 5) installed pacemaker pcs resource-agents corosync on all cluster
>>>>>> machines;
>>>>>> 6) set the ‘hacluster’ user the same password on all machines;
>>>>>> 7) pcs cluster auth <hostname> -u hacluster -p <pass> on all the
>>>>>> nodes (on both nodes I issued the commands for both nodes)
>>>>>> 8) IPv6 is configured by default on all nodes, although the
>>>>>> infrastructure is not ready for IPv6
>>>>>> 9) enabled pcsd and started it on all nodes
>>>>>> 10) populated /etc/ganesha/ganesha-ha.conf with the following
>>>>>> contents, one per machine:
>>>>>>
>>>>>>
>>>>>> ===> atlas-node1
>>>>>> # Name of the HA cluster created.
>>>>>> HA_NAME="ATLAS_GANESHA_01"
>>>>>> # The server from which you intend to mount
>>>>>> # the shared volume.
>>>>>> HA_VOL_SERVER="atlas-node1"
>>>>>> # The subset of nodes of the Gluster Trusted Pool
>>>>>> # that forms the ganesha HA cluster. IP/Hostname
>>>>>> # is specified.
>>>>>> HA_CLUSTER_NODES="atlas-node1,atlas-node2"
>>>>>> # Virtual IPs of each of the nodes specified above.
>>>>>> VIP_atlas-node1="x.x.x.1"
>>>>>> VIP_atlas-node2="x.x.x.2"
>>>>>>
>>>>>> ===> atlas-node2
>>>>>> # Name of the HA cluster created.
>>>>>> HA_NAME="ATLAS_GANESHA_01"
>>>>>> # The server from which you intend to mount
>>>>>> # the shared volume.
>>>>>> HA_VOL_SERVER="atlas-node2"
>>>>>> # The subset of nodes of the Gluster Trusted Pool
>>>>>> # that forms the ganesha HA cluster. IP/Hostname
>>>>>> # is specified.
>>>>>> HA_CLUSTER_NODES="atlas-node1,atlas-node2"
>>>>>> # Virtual IPs of each of the nodes specified above.
>>>>>> VIP_atlas-node1="x.x.x.1"
>>>>>> VIP_atlas-node2="x.x.x.2"
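Tying the config above to the dash-to-underscore finding earlier in the thread: only the `VIP_*` variable names need the hostnames' dashes converted, not the values. A sketch, written to /tmp and sourced purely to illustrate (the real file is /etc/ganesha/ganesha-ha.conf):

```shell
# ganesha-ha.conf with the dash-to-underscore fix applied to the
# VIP_* variable names (values keep the real hostnames/IPs).
cat > /tmp/ganesha-ha.conf <<'EOF'
HA_NAME="ATLAS_GANESHA_01"
HA_VOL_SERVER="atlas-node1"
HA_CLUSTER_NODES="atlas-node1,atlas-node2"
# Hostname dashes become underscores in the variable names only:
VIP_atlas_node1="x.x.x.1"
VIP_atlas_node2="x.x.x.2"
EOF
. /tmp/ganesha-ha.conf
echo "$VIP_atlas_node1"   # prints x.x.x.1
```

Shell variable names cannot contain `-`, which is presumably why the underscore form is required.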
>>>>>>
>>>>>> 11) issued gluster nfs-ganesha enable, but it fails with a cryptic
>>>>>> message:
>>>>>>
>>>>>> # gluster nfs-ganesha enable
>>>>>> Enabling NFS-Ganesha requires Gluster-NFS to be disabled across the
>>>>>> trusted pool. Do you still want to continue? (y/n) y
>>>>>> nfs-ganesha: failed: Failed to set up HA config for NFS-Ganesha.
>>>>>> Please check the log file for details
>>>>>>
>>>>>> Looking at the logs I found nothing really special but this:
>>>>>>
>>>>>> ==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==
>>>>>> [2015-06-08 17:57:15.672844] I [MSGID: 106132]
>>>>>> [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs
>>>>>> already stopped
>>>>>> [2015-06-08 17:57:15.675395] I
>>>>>> [glusterd-ganesha.c:386:check_host_list] 0-management: ganesha host
>>>>>> found Hostname is atlas-node2
>>>>>> [2015-06-08 17:57:15.720692] I
>>>>>> [glusterd-ganesha.c:386:check_host_list] 0-management: ganesha host
>>>>>> found Hostname is atlas-node2
>>>>>> [2015-06-08 17:57:15.721161] I
>>>>>> [glusterd-ganesha.c:335:is_ganesha_host] 0-management: ganesha host
>>>>>> found Hostname is atlas-node2
>>>>>> [2015-06-08 17:57:16.633048] E
>>>>>> [glusterd-ganesha.c:254:glusterd_op_set_ganesha] 0-management:
>>>>>> Initial NFS-Ganesha set up failed
>>>>>> [2015-06-08 17:57:16.641563] E
>>>>>> [glusterd-syncop.c:1396:gd_commit_op_phase] 0-management: Commit of
>>>>>> operation 'Volume (null)' failed on localhost : Failed to set up HA
>>>>>> config for NFS-Ganesha. Please check the log file for details
>>>>>>
>>>>>> ==> /var/log/glusterfs/cmd_history.log <==
>>>>>> [2015-06-08 17:57:16.643615] : nfs-ganesha enable : FAILED :
>>>>>> Failed to set up HA config for NFS-Ganesha. Please check the log
>>>>>> file for details
>>>>>>
>>>>>> ==> /var/log/glusterfs/cli.log <==
>>>>>> [2015-06-08 17:57:16.643839] I [input.c:36:cli_batch] 0-: Exiting
>>>>>> with: -1
>>>>>>
>>>>>>
>>>>>> Also, pcs seems to be fine for the auth part, although it obviously
>>>>>> tells me the cluster is not running.
>>>>>>
>>>>>> I, [2015-06-08T19:57:16.305323 #7223] INFO -- : Running:
>>>>>> /usr/sbin/corosync-cmapctl totem.cluster_name
>>>>>> I, [2015-06-08T19:57:16.345457 #7223] INFO -- : Running:
>>>>>> /usr/sbin/pcs cluster token-nodes
>>>>>> ::ffff:141.108.38.46 - - [08/Jun/2015 19:57:16] "GET
>>>>>> /remote/check_auth HTTP/1.1" 200 68 0.1919
>>>>>> ::ffff:141.108.38.46 - - [08/Jun/2015 19:57:16] "GET
>>>>>> /remote/check_auth HTTP/1.1" 200 68 0.1920
>>>>>> atlas-node1.mydomain - - [08/Jun/2015:19:57:16 CEST] "GET
>>>>>> /remote/check_auth HTTP/1.1" 200 68
>>>>>> - -> /remote/check_auth
>>>>>>
>>>>>>
>>>>>> What am I doing wrong?
>>>>>> Thanks,
>>>>>>
>>>>>> Alessandro
>>>>>>
>>>>>>> On 8 Jun 2015, at 19:30, Soumya Koduri
>>>>>>> <skoduri at redhat.com <mailto:skoduri at redhat.com>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 06/08/2015 08:20 PM, Alessandro De Salvo wrote:
>>>>>>>> Sorry, just another question:
>>>>>>>>
>>>>>>>> - in my installation of gluster 3.7.1 the command gluster
>>>>>>>> features.ganesha enable does not work:
>>>>>>>>
>>>>>>>> # gluster features.ganesha enable
>>>>>>>> unrecognized word: features.ganesha (position 0)
>>>>>>>>
>>>>>>>> Which version has full support for it?
>>>>>>>
>>>>>>> Sorry. This option has recently been changed. It is now
>>>>>>>
>>>>>>> $ gluster nfs-ganesha enable
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> - in the documentation the ccs and cman packages are required,
>>>>>>>> but they seem not to be available anymore on CentOS 7 and
>>>>>>>> similar; I guess they are not really required anymore, as pcs
>>>>>>>> should do the full job
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Alessandro
>>>>>>>
>>>>>>> Looks like so from http://clusterlabs.org/quickstart-redhat.html.
>>>>>>> Let us know if it doesn't work.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Soumya
>>>>>>>
>>>>>>>>
>>>>>>>>> On 8 Jun 2015, at 15:09, Alessandro De Salvo
>>>>>>>>> <alessandro.desalvo at roma1.infn.it
>>>>>>>>> <mailto:alessandro.desalvo at roma1.infn.it>> wrote:
>>>>>>>>>
>>>>>>>>> Great, many thanks Soumya!
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>> Alessandro
>>>>>>>>>
>>>>>>>>>> On 8 Jun 2015, at 13:53, Soumya Koduri
>>>>>>>>>> <skoduri at redhat.com <mailto:skoduri at redhat.com>> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Please find the slides of the demo video at [1]
>>>>>>>>>>
>>>>>>>>>> We recommend a distributed replicated volume as the shared
>>>>>>>>>> volume for better data availability.
>>>>>>>>>>
>>>>>>>>>> The size of the volume depends on your workload. Since it is
>>>>>>>>>> used to maintain the state of NLM/NFSv4 clients, you may
>>>>>>>>>> estimate the minimum size as the aggregate of
>>>>>>>>>> (typical_size_of'/var/lib/nfs'_directory +
>>>>>>>>>> ~4k*no_of_clients_connected_to_each_of_the_nfs_servers_at_any_point)
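A quick worked example of the sizing rule quoted above. All numbers here are assumptions for illustration, not measurements:

```shell
# Assume: 2 NFS servers, ~200 clients each, /var/lib/nfs ~ 1 MiB per
# server, ~4 KiB of state per connected client.
servers=2
clients_per_server=200
statedir_kib=1024
per_client_kib=4
total_kib=$(( servers * (statedir_kib + per_client_kib * clients_per_server) ))
echo "${total_kib} KiB minimum"   # prints "3648 KiB minimum"
```

Even at this scale the state volume stays in the low megabytes, so data availability (replication) matters far more than raw size.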
>>>>>>>>>>
>>>>>>>>>> We shall document this feature soon in the gluster docs
>>>>>>>>>> as well.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Soumya
>>>>>>>>>>
>>>>>>>>>> [1] - http://www.slideshare.net/SoumyaKoduri/high-49117846
>>>>>>>>>>
>>>>>>>>>> On 06/08/2015 04:34 PM, Alessandro De Salvo wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>> I have seen the demo video on ganesha HA,
>>>>>>>>>>> https://www.youtube.com/watch?v=Z4mvTQC-efM
>>>>>>>>>>> However there is no advice on the appropriate size of the
>>>>>>>>>>> shared volume. How is it really used, and what should be a
>>>>>>>>>>> reasonable size for it?
>>>>>>>>>>> Also, are the slides from the video available somewhere, as
>>>>>>>>>>> well as a documentation on all this? I did not manage to find
>>>>>>>>>>> them.
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Alessandro
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>>> Gluster-users at gluster.org <mailto:Gluster-users at gluster.org>
>>>>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Gluster-users mailing list
>>>>>> Gluster-users at gluster.org <mailto:Gluster-users at gluster.org>
>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org <mailto:Gluster-users at gluster.org>
>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>