[Gluster-users] NFS not start on localhost

Gene Liverman gliverma at westga.edu
Thu Oct 23 17:03:57 UTC 2014


Could you also provide the output of this command:
$ mount | column -t

--
Gene Liverman
Systems Administrator
Information Technology Services
University of West Georgia
gliverma at westga.edu

ITS: Making Technology Work for You!

This e-mail and any attachments may contain confidential and privileged
information. If you are not the intended recipient, please notify the
sender immediately by return mail, delete this message, and destroy any
copies. Any dissemination or use of this information by a person other than
the intended recipient is unauthorized and may be illegal or actionable by
law.

On Oct 23, 2014 10:07 AM, "Niels de Vos" <ndevos at redhat.com> wrote:

> The only way I can manage to hit this issue too is by mounting an
> NFS-export on the Gluster server that starts the Gluster/NFS process.
> There is no crash happening on my side; Gluster/NFS just fails to
> start.
>
> Steps to reproduce:
> 1. mount -t nfs nas.example.net:/export /mnt
> 2. systemctl start glusterd
>
> After this, the error about being unable to register NLM4 is in
> /var/log/glusterfs/nfs.log.
>
> This is expected, because the Linux kernel NFS-server requires an NLM
> service in portmap/rpcbind (nlockmgr). You can verify what process
> occupies the service slot in rpcbind like this:
>
> 1. list the rpc-programs and their port numbers
>
>     # rpcinfo -p
>
> 2. check the process that listens on the TCP-port for nlockmgr (port
>    32770 was returned by the command from point 1)
>
>     # netstat -nlpt | grep -w 32770
>
> If the right column in the output lists 'glusterfs', then the
> Gluster/NFS process could register successfully and is handling the NLM4
> calls. However, if the right column contains a single '-', the Linux
> kernel module 'lockd' is handling the NLM4 calls. Gluster/NFS cannot
> work together with the Linux kernel NFS-client (mountpoint) or the Linux
> kernel NFS-server.
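The two checks above can be strung together in a small shell sketch. This is only an illustration: the `rpcinfo -p` and `netstat -nlpt` output below is invented sample data (the port and PID are made up), so on a live system substitute the real command output.

```shell
#!/bin/sh
# Sketch: determine which process owns the nlockmgr (NLM4) registration.
# Sample data stands in for live output of 'rpcinfo -p' and 'netstat -nlpt'.

rpcinfo_out='   100021    4   tcp  32770  nlockmgr
   100003    3   tcp   2049  nfs'

netstat_out='tcp  0  0 0.0.0.0:32770  0.0.0.0:*  LISTEN  3286/glusterfs
tcp  0  0 0.0.0.0:2049   0.0.0.0:*  LISTEN  -'

# Step 1: TCP port registered for nlockmgr
# (rpcinfo -p columns: program vers proto port service)
port=$(printf '%s\n' "$rpcinfo_out" | awk '$3 == "tcp" && $5 == "nlockmgr" {print $4}')

# Step 2: owner of that listening port; '-' means the kernel (lockd) holds it
owner=$(printf '%s\n' "$netstat_out" | grep -F ":$port " | awk '{print $NF}')
case $owner in
    */glusterfs) echo "Gluster/NFS owns NLM4 (ok)" ;;
    -)           echo "kernel lockd owns NLM4 (conflict)" ;;
    *)           echo "unexpected owner: $owner" ;;
esac
```

With the sample data, the glusterfs process holds the port, so the check reports the healthy case.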
>
> Does this help? If something is unclear, post the output of the above
> commands and tell us what further details you want to see clarified.
>
> Cheers,
> Niels
>
>
> On Mon, Oct 20, 2014 at 12:53:46PM +0200, Demeter Tibor wrote:
> >
> > Hi,
> >
> > Thank you for your reply.
> >
> > I followed your recommendations, but nothing changed.
> >
> > There is nothing new in the nfs.log.
> >
> >
> > [root at node0 glusterfs]# reboot
> > Connection to 172.16.0.10 closed by remote host.
> > Connection to 172.16.0.10 closed.
> > [tdemeter at sirius-31 ~]$ ssh root at 172.16.0.10
> > root at 172.16.0.10's password:
> > Last login: Mon Oct 20 11:02:13 2014 from 192.168.133.106
> > [root at node0 ~]# systemctl status nfs.target
> > nfs.target - Network File System Server
> >    Loaded: loaded (/usr/lib/systemd/system/nfs.target; disabled)
> >    Active: inactive (dead)
> >
> > [root at node0 ~]# gluster volume status engine
> > Status of volume: engine
> > Gluster process                                Port   Online  Pid
> > ------------------------------------------------------------------------------
> > Brick gs00.itsmart.cloud:/gluster/engine0      50160  Y       3271
> > Brick gs01.itsmart.cloud:/gluster/engine1      50160  Y       595
> > NFS Server on localhost                        N/A    N       N/A
> > Self-heal Daemon on localhost                  N/A    Y       3286
> > NFS Server on gs01.itsmart.cloud               2049   Y       6951
> > Self-heal Daemon on gs01.itsmart.cloud         N/A    Y       6958
> >
> > Task Status of Volume engine
> > ------------------------------------------------------------------------------
> > There are no active volume tasks
> >
> > [root at node0 ~]# systemctl status
> > Display all 262 possibilities? (y or n)
> > [root at node0 ~]# systemctl status nfs-lock
> > nfs-lock.service - NFS file locking service.
> >    Loaded: loaded (/usr/lib/systemd/system/nfs-lock.service; enabled)
> >    Active: inactive (dead)
> >
> > [root at node0 ~]# systemctl stop nfs-lock
> > [root at node0 ~]# systemctl restart gluster
> > glusterd.service    glusterfsd.service  gluster.mount
> > [root at node0 ~]# systemctl restart gluster
> > glusterd.service    glusterfsd.service  gluster.mount
> > [root at node0 ~]# systemctl restart glusterfsd.service
> > [root at node0 ~]# systemctl restart glusterd.service
> > [root at node0 ~]# gluster volume status engine
> > Status of volume: engine
> > Gluster process                                Port   Online  Pid
> > ------------------------------------------------------------------------------
> > Brick gs00.itsmart.cloud:/gluster/engine0      50160  Y       5140
> > Brick gs01.itsmart.cloud:/gluster/engine1      50160  Y       2037
> > NFS Server on localhost                        N/A    N       N/A
> > Self-heal Daemon on localhost                  N/A    N       N/A
> > NFS Server on gs01.itsmart.cloud               2049   Y       6951
> > Self-heal Daemon on gs01.itsmart.cloud         N/A    Y       6958
> >
> >
> > Any other idea?
> >
> > Tibor
> >
> >
> >
> >
> >
> >
> >
> >
> > ----- Original message -----
> > > On Mon, Oct 20, 2014 at 09:04:28AM +0200, Demeter Tibor wrote:
> > > > Hi,
> > > >
> > > > This is the full nfs.log after delete & reboot.
> > > > It refers to a portmap registration problem.
> > > >
> > > > [root at node0 glusterfs]# cat nfs.log
> > > > [2014-10-20 06:48:43.221136] I [glusterfsd.c:1959:main]
> > > > 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.2
> > > > (/usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p
> > > > /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S
> > > > /var/run/567e0bba7ad7102eae3049e2ad6c3ed7.socket)
> > > > [2014-10-20 06:48:43.224444] I [socket.c:3561:socket_init]
> > > > 0-socket.glusterfsd: SSL support is NOT enabled
> > > > [2014-10-20 06:48:43.224475] I [socket.c:3576:socket_init]
> > > > 0-socket.glusterfsd: using system polling thread
> > > > [2014-10-20 06:48:43.224654] I [socket.c:3561:socket_init] 0-glusterfs:
> > > > SSL support is NOT enabled
> > > > [2014-10-20 06:48:43.224667] I [socket.c:3576:socket_init] 0-glusterfs:
> > > > using system polling thread
> > > > [2014-10-20 06:48:43.235876] I
> > > > [rpcsvc.c:2127:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service:
> > > > Configured rpc.outstanding-rpc-limit with value 16
> > > > [2014-10-20 06:48:43.254087] I [socket.c:3561:socket_init]
> > > > 0-socket.nfs-server: SSL support is NOT enabled
> > > > [2014-10-20 06:48:43.254116] I [socket.c:3576:socket_init]
> > > > 0-socket.nfs-server: using system polling thread
> > > > [2014-10-20 06:48:43.255241] I [socket.c:3561:socket_init]
> > > > 0-socket.nfs-server: SSL support is NOT enabled
> > > > [2014-10-20 06:48:43.255264] I [socket.c:3576:socket_init]
> > > > 0-socket.nfs-server: using system polling thread
> > > > [2014-10-20 06:48:43.257279] I [socket.c:3561:socket_init]
> > > > 0-socket.nfs-server: SSL support is NOT enabled
> > > > [2014-10-20 06:48:43.257315] I [socket.c:3576:socket_init]
> > > > 0-socket.nfs-server: using system polling thread
> > > > [2014-10-20 06:48:43.258135] I [socket.c:3561:socket_init] 0-socket.NLM:
> > > > SSL support is NOT enabled
> > > > [2014-10-20 06:48:43.258157] I [socket.c:3576:socket_init] 0-socket.NLM:
> > > > using system polling thread
> > > > [2014-10-20 06:48:43.293724] E
> > > > [rpcsvc.c:1314:rpcsvc_program_register_portmap] 0-rpc-service: Could not
> > > > register with portmap
> > > > [2014-10-20 06:48:43.293760] E [nfs.c:332:nfs_init_versions] 0-nfs:
> > > > Program NLM4 registration failed
> > >
> > > The above line suggests that there already is a service registered at
> > > portmapper for the NLM4 program/service. This happens when the kernel
> > > module 'lockd' is loaded. The kernel NFS-client and NFS-server depend
> on
> > > this, but unfortunately it conflicts with the Gluster/nfs server.
> > >
> > > Could you verify that the module is loaded?
> > >  - use 'lsmod | grep lockd' to check the modules
> > >  - use 'rpcinfo | grep nlockmgr' to check the rpcbind registrations
> > >
> > > Make sure that you do not mount any NFS exports on the Gluster server.
> > > Unmount all NFS mounts.
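To see whether any NFS mounts are still present, the mount table can be filtered on the nfs/nfs4 filesystem types. A small sketch of that check; the /proc/mounts lines below are invented sample data, so on a live host read /proc/mounts itself:

```shell
#!/bin/sh
# Sketch: list NFS mounts that keep the kernel 'lockd' module registered.
# Sample data stands in for the real contents of /proc/mounts.
mounts='nas.example.net:/export /mnt nfs rw,vers=3,addr=192.0.2.10 0 0
/dev/sda1 / xfs rw,relatime 0 0'

# /proc/mounts fields: device mountpoint fstype options dump pass
nfs_mounts=$(printf '%s\n' "$mounts" | awk '$3 == "nfs" || $3 == "nfs4" {print $2}')

if [ -n "$nfs_mounts" ]; then
    echo "NFS mounts to unmount first: $nfs_mounts"
fi
```

Any path the check prints must be unmounted before Gluster/NFS can take over the NLM4 registration.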
> > >
> > > You mentioned you are running CentOS-7, which is systemd based. You
> > > should be able to stop any conflicting NFS services like this:
> > >
> > >  # systemctl stop nfs-lock.service
> > >  # systemctl stop nfs.target
> > >  # systemctl disable nfs.target
> > >
> > > If all these services clean up after themselves, you should be able to
> > > start the Gluster/nfs service:
> > >
> > >   # systemctl restart glusterd.service
> > >
> > > In case some bits are still lingering around, it might be easier to
> > > reboot after disabling the 'nfs.target'.
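Put together, the whole sequence could be scripted roughly as follows. This is only a sketch of the steps above: it prints the commands by default (a dry run) and executes them only when invoked with --run as root; the service names are the ones used in this thread and may differ on other distributions.

```shell
#!/bin/sh
# Dry-run sketch of the cleanup sequence described above (CentOS-7/systemd).
run=false
[ "$1" = "--run" ] && run=true

# Ordered plan: unmount kernel NFS mounts, stop/disable the kernel NFS
# services, then restart glusterd so Gluster/NFS can register NLM4.
plan='umount -a -t nfs,nfs4
systemctl stop nfs-lock.service
systemctl stop nfs.target
systemctl disable nfs.target
systemctl restart glusterd.service'

printf '%s\n' "$plan" | while IFS= read -r cmd; do
    if $run; then
        $cmd                      # execute for real (needs root)
    else
        echo "would run: $cmd"    # dry run: just show the plan
    fi
done
```

Running it without arguments is safe and simply prints the five commands in order.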
> > >
> > > > [2014-10-20 06:48:43.293771] E [nfs.c:1312:init] 0-nfs: Failed to
> > > > initialize protocols
> > > > [2014-10-20 06:48:43.293777] E [xlator.c:403:xlator_init] 0-nfs-server:
> > > > Initialization of volume 'nfs-server' failed, review your volfile again
> > > > [2014-10-20 06:48:43.293783] E [graph.c:307:glusterfs_graph_init]
> > > > 0-nfs-server: initializing translator failed
> > > > [2014-10-20 06:48:43.293789] E [graph.c:502:glusterfs_graph_activate]
> > > > 0-graph: init failed
> > > > pending frames:
> > > > frame : type(0) op(0)
> > > >
> > > > patchset: git://git.gluster.com/glusterfs.git
> > > > signal received: 11
> > > > time of crash: 2014-10-20 06:48:43configuration details:
> > > > argp 1
> > > > backtrace 1
> > > > dlfcn 1
> > > > fdatasync 1
> > > > libpthread 1
> > > > llistxattr 1
> > > > setfsid 1
> > > > spinlock 1
> > > > epoll.h 1
> > > > xattr.h 1
> > > > st_atim.tv_nsec 1
> > > > package-string: glusterfs 3.5.2
> > > > [root at node0 glusterfs]# systemctl status portma
> > > > portma.service
> > > >    Loaded: not-found (Reason: No such file or directory)
> > > >    Active: inactive (dead)
> > > >
> > > >
> > > >
> > > > Also I have checked the rpcbind service.
> > > >
> > > > [root at node0 glusterfs]# systemctl status rpcbind.service
> > > > rpcbind.service - RPC bind service
> > > >    Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled)
> > > >    Active: active (running) since Mon 2014-10-20 08:48:39 CEST; 2min 52s ago
> > > >   Process: 1940 ExecStart=/sbin/rpcbind -w ${RPCBIND_ARGS} (code=exited,
> > > >   status=0/SUCCESS)
> > > >  Main PID: 1946 (rpcbind)
> > > >    CGroup: /system.slice/rpcbind.service
> > > >            └─1946 /sbin/rpcbind -w
> > > >
> > > > Oct 20 08:48:39 node0.itsmart.cloud systemd[1]: Starting RPC bind
> > > > service...
> > > > Oct 20 08:48:39 node0.itsmart.cloud systemd[1]: Started RPC bind service.
> > > >
> > > > The restart does not solve this problem.
> > > >
> > > >
> > > > I think this is the problem. Why is the portmap status "exited"?
> > >
> > > The 'portmap' service has been replaced with 'rpcbind' since RHEL-6.
> > > They have the same functionality; 'rpcbind' just happens to be the newer
> > > version.
> > >
> > > Did you file a bug for this already? As Vijay mentions, this crash seems
> > > to happen because the Gluster/nfs service fails to initialize correctly
> > > and then fails to clean up correctly. The cleanup should get fixed, and
> > > we should also give an easier-to-understand error message.
> > >
> > > Thanks,
> > > Niels
> > >
> > > >
> > > >
> > > > On node1 is ok:
> > > >
> > > > [root at node1 ~]# systemctl status rpcbind.service
> > > > rpcbind.service - RPC bind service
> > > >    Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled)
> > > >    Active: active (running) since Fri 2014-10-17 19:15:21 CEST; 2 days ago
> > > >  Main PID: 1963 (rpcbind)
> > > >    CGroup: /system.slice/rpcbind.service
> > > >            └─1963 /sbin/rpcbind -w
> > > >
> > > > Oct 17 19:15:21 node1.itsmart.cloud systemd[1]: Starting RPC bind
> > > > service...
> > > > Oct 17 19:15:21 node1.itsmart.cloud systemd[1]: Started RPC bind service.
> > > >
> > > >
> > > >
> > > > Thanks in advance
> > > >
> > > > Tibor
> > > >
> > > >
> > > >
> > > > ----- Original message -----
> > > > > On 10/19/2014 06:56 PM, Niels de Vos wrote:
> > > > > > On Sat, Oct 18, 2014 at 01:24:12PM +0200, Demeter Tibor wrote:
> > > > > >> Hi,
> > > > > >>
> > > > > >> [root at node0 ~]# tail -n 20 /var/log/glusterfs/nfs.log
> > > > > >> [2014-10-18 07:41:06.136035] E [graph.c:307:glusterfs_graph_init]
> > > > > >> 0-nfs-server: initializing translator failed
> > > > > >> [2014-10-18 07:41:06.136040] E [graph.c:502:glusterfs_graph_activate]
> > > > > >> 0-graph: init failed
> > > > > >> pending frames:
> > > > > >> frame : type(0) op(0)
> > > > > >>
> > > > > >> patchset: git://git.gluster.com/glusterfs.git
> > > > > >> signal received: 11
> > > > > >> time of crash: 2014-10-18 07:41:06configuration details:
> > > > > >> argp 1
> > > > > >> backtrace 1
> > > > > >> dlfcn 1
> > > > > >> fdatasync 1
> > > > > >> libpthread 1
> > > > > >> llistxattr 1
> > > > > >> setfsid 1
> > > > > >> spinlock 1
> > > > > >> epoll.h 1
> > > > > >> xattr.h 1
> > > > > >> st_atim.tv_nsec 1
> > > > > >> package-string: glusterfs 3.5.2
> > > > > >
> > > > > > This definitely is a gluster/nfs issue. For whatever reason, the
> > > > > > gluster/nfs server crashes :-/ The log does not show enough details;
> > > > > > some more lines before this are needed.
> > > > > >
> > > > >
> > > > > I wonder if the crash is due to a cleanup after the translator
> > > > > initialization failure. The complete logs might help in understanding
> > > > > why the initialization failed.
> > > > >
> > > > > -Vijay
> > > > >
> > > > >
> > >
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users

