[Gluster-users] [Nfs-ganesha-devel] Questions on ganesha HA and shared storage size

Mon Jun 15 16:47:11 UTC 2015

We do run ganesha on RHEL7.0 (same as CentOS7.0), and I don't think 7.1
would be much different. We do run GPFS FSAL only (no VFS_FSAL).

Regards, Malahal.

Alessandro De Salvo [Alessandro.DeSalvo at roma1.infn.it] wrote:
> Hi,
> any news on this? Did you have the chance to look into that?
> I'd also be curious to know if anyone tried nfs ganesha on CentOS 7.1
> and if it was really working, as I also tried on a standalone, clean
> machine, and I see the very same behavior, even without gluster.
> Thanks,
> 
> 	Alessandro
> 
> On Fri, 2015-06-12 at 14:34 +0200, Alessandro De Salvo wrote:
> > Hi,
> > looking at the code and having recompiled with some more debug output, I
> > might be wrong, but it seems that in nfs_rpc_dispatcher_thread.c,
> > function nfs_rpc_dequeue_req, the threads enter the while (!(wqe->flags &
> > Wqe_LFlag_SyncDone)) loop and never exit from it.
> > I do not know whether that is normal, as I need to read the code more carefully.
> > Cheers,
> > 
> > 	Alessandro
> > 
> > On Fri, 2015-06-12 at 09:35 +0200, Alessandro De Salvo wrote:
> > > Hi Malahal,
> > > 
> > > 
> > > > On 12 Jun 2015, at 01:23, Malahal Naineni <malahal at us.ibm.com> wrote:
> > > > 
> > > > The logs indicate that ganesha was started successfully without any
> > > > exports.  gstack output seemed normal as well -- threads were waiting to
> > > > serve requests.
> > > 
> > > Yes, no exports as it was the default config before enabling Ganesha on any gluster volume.
> > > 
> > > > 
> > > > Assuming that you are running "showmount -e" on the same system, there
> > > > shouldn't be any firewall coming into the picture.
> > > 
> > > Yes it was the case in my last attempt, from the same machine. I also tried from another machine, but the result was the same. The firewall (firewalld, as it's a CentOS 7.1) is disabled anyways.
> > > 
> > > > If you are running
> > > > "showmount" from some other system, make sure there is no firewall
> > > > dropping the packets.
> > > > 
> > > > I think you need tcpdump trace to figure out the problem. My wireshark
> > > > trace showed two requests from the client to complete the "showmount -e"
> > > > command:
> > > > 
> > > > 1. Client sent "GETPORT" call to port 111 (rpcbind) to get the port number
> > > >   of MOUNT.
> > > > 2. Then it sent "EXPORT" call to mountd port (port it got in response to #1).
> > > 
> > > Yes, I did it already, and indeed it showed the two requests, so the portmapper works fine, but it hangs on the second request.
> > > Also "rpcinfo -t localhost portmapper" returns successfully, while "rpcinfo -t localhost nfs" hangs.
> > > The output of rpcinfo -p is the following:
> > > 
> > >     program vers proto   port  service
> > >     100000    4   tcp    111  portmapper
> > >     100000    3   tcp    111  portmapper
> > >     100000    2   tcp    111  portmapper
> > >     100000    4   udp    111  portmapper
> > >     100000    3   udp    111  portmapper
> > >     100000    2   udp    111  portmapper
> > >     100024    1   udp  56082  status
> > >     100024    1   tcp  41858  status
> > >     100003    3   udp   2049  nfs
> > >     100003    3   tcp   2049  nfs
> > >     100003    4   udp   2049  nfs
> > >     100003    4   tcp   2049  nfs
> > >     100005    1   udp  45611  mountd
> > >     100005    1   tcp  55915  mountd
> > >     100005    3   udp  45611  mountd
> > >     100005    3   tcp  55915  mountd
> > >     100021    4   udp  48775  nlockmgr
> > >     100021    4   tcp  51621  nlockmgr
> > >     100011    1   udp   4501  rquotad
> > >     100011    1   tcp   4501  rquotad
> > >     100011    2   udp   4501  rquotad
> > >     100011    2   tcp   4501  rquotad
> > > 
> > > > 
> > > > What does "rpcinfo -p <server-ip>" show?
> > > > 
> > > > Do you have selinux enabled? I am not sure if that is playing any role
> > > > here...
> > > 
> > > Nope, it's disabled:
> > > 
> > > # uname -a
> > > Linux node2 3.10.0-229.4.2.el7.x86_64 #1 SMP Wed May 13 10:06:09 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
> > > 
> > > 
> > > Thanks for the help,
> > > 
> > >     Alessandro
> > > 
> > > > 
> > > > Regards, Malahal.
> > > > 
> > > > Alessandro De Salvo [Alessandro.DeSalvo at roma1.infn.it] wrote:
> > > >> Hi,
> > > >> this was an extract from the old logs, before Soumya's suggestion of
> > > >> changing the rquota port in the conf file. The new logs are attached
> > > >> (ganesha-20150611.log.gz) as well as the gstack of the ganesha process
> > > >> while I was executing the hanging showmount
> > > >> (ganesha-20150611.gstack.gz).
> > > >> Thanks,
> > > >> 
> > > >>    Alessandro
> > > >> 
> > > >> 
> > > >> 
> > > >>> On Thu, 2015-06-11 at 11:37 -0500, Malahal Naineni wrote:
> > > >>> Soumya Koduri [skoduri at redhat.com] wrote:
> > > > >>>> CCing ganesha-devel to get more input.
> > > >>>> 
> > > > >>>> When IPv6 is enabled, NFS-Ganesha uses only the v6 interfaces.
> > > >>> 
> > > > >>> I am not a network expert but I have seen IPv4 traffic over an IPv6
> > > > >>> interface while fixing a few things before. This may be normal.
> > > >>> 
> > > > >>>> Commit d7e8f255 (git show 'd7e8f255'), which was added in v2.2, has more details.
> > > >>>> 
> > > > >>>>> # netstat -ltaupn | grep 2049
> > > > >>>>> tcp6       4      0 :::2049                 :::*                    LISTEN      32080/ganesha.nfsd
> > > > >>>>> tcp6       1      0 x.x.x.2:2049            x.x.x.2:33285           CLOSE_WAIT  -
> > > > >>>>> tcp6       1      0 127.0.0.1:2049          127.0.0.1:39555         CLOSE_WAIT  -
> > > > >>>>> udp6       0      0 :::2049                 :::*                                32080/ganesha.nfsd
> > > >>>> 
> > > > >>>>>>> I have enabled full debug already, but I see nothing special. Before exporting any volume the log shows no error, even when I do a showmount (the log is attached, ganesha.log.gz). If I do the same after exporting a volume, nfs-ganesha does not even start, complaining that it cannot bind the IPv6 rquota socket, but in fact nothing is listening on that port over IPv6, so it should not happen:
> > > >>>>>>> 
> > > >>>>>>> tcp6       0      0 :::111                  :::*                    LISTEN      7433/rpcbind
> > > >>>>>>> tcp6       0      0 :::2224                 :::*                    LISTEN      9054/ruby
> > > >>>>>>> tcp6       0      0 :::22                   :::*                    LISTEN      1248/sshd
> > > >>>>>>> udp6       0      0 :::111                  :::*                                7433/rpcbind
> > > >>>>>>> udp6       0      0 fe80::8c2:27ff:fef2:123 :::*                                31238/ntpd
> > > >>>>>>> udp6       0      0 fe80::230:48ff:fed2:123 :::*                                31238/ntpd
> > > >>>>>>> udp6       0      0 fe80::230:48ff:fed2:123 :::*                                31238/ntpd
> > > >>>>>>> udp6       0      0 fe80::230:48ff:fed2:123 :::*                                31238/ntpd
> > > >>>>>>> udp6       0      0 ::1:123                 :::*                                31238/ntpd
> > > >>>>>>> udp6       0      0 fe80::5484:7aff:fef:123 :::*                                31238/ntpd
> > > >>>>>>> udp6       0      0 :::123                  :::*                                31238/ntpd
> > > >>>>>>> udp6       0      0 :::824                  :::*                                7433/rpcbind
> > > >>>>>>> 
> > > >>>>>>> The error, as shown in the attached ganesha-after-export.log.gz logfile, is the following:
> > > >>>>>>> 
> > > >>>>>>> 
> > > >>>>>>> 10/06/2015 02:07:47 : epoch 55777fb5 : node2 : ganesha.nfsd-26195[main] Bind_sockets_V6 :DISP :WARN :Cannot bind RQUOTA tcp6 socket, error 98 (Address already in use)
> > > >>>>>>> 10/06/2015 02:07:47 : epoch 55777fb5 : node2 : ganesha.nfsd-26195[main] Bind_sockets :DISP :FATAL :Error binding to V6 interface. Cannot continue.
> > > >>>>>>> 10/06/2015 02:07:48 : epoch 55777fb5 : node2 : ganesha.nfsd-26195[main] glusterfs_unload :FSAL :DEBUG :FSAL Gluster unloaded
> > > >>> 
> > > >>> The above messages indicate that someone tried to restart ganesha. But
> > > >>> ganesha failed to come up because RQUOTA port (default is 875) is
> > > >>> already in use by an old ganesha instance or some other program holding
> > > >>> it. The new instance of ganesha will die, but if you are using systemd,
> > > >>> it will try to restart automatically. We have disabled systemd auto
> > > >>> restart in our environment as it was causing issues for debugging.
> > > >>> 
> > > >>> What version of ganesha is this?
> > > >>> 
> > > >>> Regards, Malahal.
> > > > 
> > > > 
> > > > 
> > > _______________________________________________
> > > Gluster-users mailing list
> > > Gluster-users at gluster.org
> > > http://www.gluster.org/mailman/listinfo/gluster-users
> 
>
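
The "showmount -e" exchange described in the thread (GETPORT to rpcbind on port 111, then EXPORT to the port returned for mountd) can be partly checked from a script. The following is only a minimal sketch in Python, not something taken from ganesha or gluster: the names RpcService, parse_rpcinfo, ports_for and tcp_connects are made up for illustration, and SAMPLE is a shortened copy of the "rpcinfo -p" output quoted above. It parses that output to find the registered ports and can then try a plain TCP connect to one of them. Note that a successful connect only shows that something accepts connections on the port; it does not show that the RPC call itself is answered, which is the part that appears to hang in this thread.

```python
import socket
from typing import List, NamedTuple


class RpcService(NamedTuple):
    """One data row of `rpcinfo -p` output (hypothetical helper type)."""
    program: int
    version: int
    proto: str
    port: int
    name: str


# Shortened copy of the rpcinfo -p output quoted earlier in the thread.
SAMPLE = """\
    program vers proto   port  service
    100000    4   tcp    111  portmapper
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100005    1   udp  45611  mountd
    100005    1   tcp  55915  mountd
    100005    3   tcp  55915  mountd
    100011    1   tcp   4501  rquotad
"""


def parse_rpcinfo(text: str) -> List[RpcService]:
    """Parse `rpcinfo -p` text into a list of RpcService rows."""
    services = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a 5-column data row.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        program, version, proto, port, name = fields
        services.append(
            RpcService(int(program), int(version), proto, int(port), name)
        )
    return services


def ports_for(services: List[RpcService], name: str, proto: str = "tcp") -> List[int]:
    """Return the distinct ports registered for a service name and protocol."""
    return sorted({s.port for s in services if s.name == name and s.proto == proto})


def tcp_connects(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port is accepted within timeout.

    This only proves a listener exists; it does not send any RPC request.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    services = parse_rpcinfo(SAMPLE)
    for port in ports_for(services, "mountd"):
        print("mountd tcp port", port, "accepts connections:",
              tcp_connects("localhost", port))
```

In practice the text passed to parse_rpcinfo would come from running "rpcinfo -p <server-ip>" on the client; the host name "localhost" above is just a placeholder.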