[Gluster-users] NFS not start on localhost
Niels de Vos
ndevos at redhat.com
Sat Nov 8 02:32:11 UTC 2014
On Fri, Nov 07, 2014 at 07:51:47PM -0500, Jason Russler wrote:
> I've run into this as well, after installing hosted-engine for oVirt
> on a gluster volume. The only way to get things working again for me
> was to manually de-register (rpcinfo -d ...) nlockmgr from the
> portmapper and then restart glusterd. Then gluster's NFS successfully
> registers. I don't really get what's going on though.
Is this on RHEL/CentOS 7? A couple of days back someone on IRC had an
issue with this as well. We found out that "rpcbind.service" uses the
"-w" option by default (for warm-restart). Registered services are
written to a cache file, and upon reboot those services get
re-registered automatically, even when not running.
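A quick way to spot this is to compare what rpcbind advertises with what
is actually listening; for example (a sketch, using the stale nlockmgr
TCP port from the rpcinfo output quoted below):

  # rpcinfo -p | grep nlockmgr
  # ss -tln | grep 54017

If rpcbind lists a port that nothing listens on, the registration most
likely came from the warm-start cache.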
The solution was something like this:
# cp /usr/lib/systemd/system/rpcbind.service /etc/systemd/system/
* edit /etc/systemd/system/rpcbind.service and remove the "-w" option
# systemctl daemon-reload
# systemctl restart rpcbind.service
# systemctl restart glusterd.service
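Afterwards the stale entries should be gone and Gluster/NFS should manage
to register; something like this should confirm it (ports can differ per
setup):

  # rpcinfo -p | grep -E 'nfs|nlockmgr'
  # gluster volume status | grep 'NFS Server'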
I am not sure why "-w" was added by default, but it does not seem to
play nice with Gluster/NFS. Gluster/NFS does not want to break other
registered services, so it bails out when something is registered
already.
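If you would rather not edit the unit file, the manual de-registration
that Jason describes should work as well; a sketch, assuming nlockmgr
(program 100021) is registered for versions 1, 3 and 4 as in the rpcinfo
output below:

  # rpcinfo -d 100021 1
  # rpcinfo -d 100021 3
  # rpcinfo -d 100021 4
  # systemctl restart glusterd.service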
HTH,
Niels
>
> ----- Original Message -----
> From: "Sven Achtelik" <Sven.Achtelik at mailpool.us>
> To: gluster-users at gluster.org
> Sent: Friday, November 7, 2014 5:28:32 PM
> Subject: Re: [Gluster-users] NFS not start on localhost
>
> Hi everyone,
>
> I'm facing the exact same issue on my installation. The nfs.log entries
> indicate that something is blocking the Gluster NFS server from
> registering with rpcbind.
>
> [root@ovirt-one ~]# rpcinfo -p
>    program vers proto   port  service
>     100000    4   tcp    111  portmapper
>     100000    3   tcp    111  portmapper
>     100000    2   tcp    111  portmapper
>     100000    4   udp    111  portmapper
>     100000    3   udp    111  portmapper
>     100000    2   udp    111  portmapper
>     100005    3   tcp  38465  mountd
>     100005    1   tcp  38466  mountd
>     100003    3   tcp   2049  nfs
>     100227    3   tcp   2049  nfs_acl
>     100021    3   udp  34343  nlockmgr
>     100021    4   udp  34343  nlockmgr
>     100021    3   tcp  54017  nlockmgr
>     100021    4   tcp  54017  nlockmgr
>     100024    1   udp  39097  status
>     100024    1   tcp  53471  status
>     100021    1   udp    715  nlockmgr
>
> I'm sure that I'm not using the system NFS server, and I didn't mount
> any NFS share.
>
> @Tibor: Did you solve that issue somehow?
>
> Best,
>
> Sven
>
> Hi,
> Thank you for your reply.
> I followed your recommendations, but there are no changes.
> There is nothing new in the nfs.log.
> [root@node0 glusterfs]# reboot
> Connection to 172.16.0.10 closed by remote host.
> Connection to 172.16.0.10 closed.
> [tdemeter@sirius-31 ~]$ ssh root@172.16.0.10
> root@172.16.0.10's password:
> Last login: Mon Oct 20 11:02:13 2014 from 192.168.133.106
> [root@node0 ~]# systemctl status nfs.target
> nfs.target - Network File System Server
> Loaded: loaded (/usr/lib/systemd/system/nfs.target; disabled)
> Active: inactive (dead)
> [root@node0 ~]# gluster volume status engine
> Status of volume: engine
> Gluster process                                 Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick gs00.itsmart.cloud:/gluster/engine0       50160   Y       3271
> Brick gs01.itsmart.cloud:/gluster/engine1       50160   Y       595
> NFS Server on localhost                         N/A     N       N/A
> Self-heal Daemon on localhost                   N/A     Y       3286
> NFS Server on gs01.itsmart.cloud                2049    Y       6951
> Self-heal Daemon on gs01.itsmart.cloud          N/A     Y       6958
>
> Task Status of Volume engine
> ------------------------------------------------------------------------------
> There are no active volume tasks
> [root@node0 ~]# systemctl status
> Display all 262 possibilities? (y or n)
> [root@node0 ~]# systemctl status nfs-lock
> nfs-lock.service - NFS file locking service.
> Loaded: loaded (/usr/lib/systemd/system/nfs-lock.service; enabled)
> Active: inactive (dead)
> [root@node0 ~]# systemctl stop nfs-lock
> [root@node0 ~]# systemctl restart gluster
> glusterd.service glusterfsd.service gluster.mount
> [root@node0 ~]# systemctl restart gluster
> glusterd.service glusterfsd.service gluster.mount
> [root@node0 ~]# systemctl restart glusterfsd.service
> [root@node0 ~]# systemctl restart glusterd.service
> [root@node0 ~]# gluster volume status engine
> Status of volume: engine
> Gluster process                                 Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick gs00.itsmart.cloud:/gluster/engine0       50160   Y       5140
> Brick gs01.itsmart.cloud:/gluster/engine1       50160   Y       2037
> NFS Server on localhost                         N/A     N       N/A
> Self-heal Daemon on localhost                   N/A     N       N/A
> NFS Server on gs01.itsmart.cloud                2049    Y       6951
> Self-heal Daemon on gs01.itsmart.cloud          N/A     Y       6958
> Any other ideas?
> Tibor
> ----- Original Message -----
> > On Mon, Oct 20, 2014 at 09:04:28AM +0200, Demeter Tibor wrote:
> > > Hi,
> > >
> > > This is the full nfs.log after delete & reboot.
> > > It refers to a portmap registration problem.
> > >
> > > [root@node0 glusterfs]# cat nfs.log
> > > [2014-10-20 06:48:43.221136] I [glusterfsd.c:1959:main]
> > > 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.2
> > > (/usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p
> > > /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S
> > > /var/run/567e0bba7ad7102eae3049e2ad6c3ed7.socket)
> > > [2014-10-20 06:48:43.224444] I [socket.c:3561:socket_init]
> > > 0-socket.glusterfsd: SSL support is NOT enabled
> > > [2014-10-20 06:48:43.224475] I [socket.c:3576:socket_init]
> > > 0-socket.glusterfsd: using system polling thread
> > > [2014-10-20 06:48:43.224654] I [socket.c:3561:socket_init] 0-glusterfs: SSL
> > > support is NOT enabled
> > > [2014-10-20 06:48:43.224667] I [socket.c:3576:socket_init] 0-glusterfs:
> > > using system polling thread
> > > [2014-10-20 06:48:43.235876] I
> > > [rpcsvc.c:2127:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured
> > > rpc.outstanding-rpc-limit with value 16
> > > [2014-10-20 06:48:43.254087] I [socket.c:3561:socket_init]
> > > 0-socket.nfs-server: SSL support is NOT enabled
> > > [2014-10-20 06:48:43.254116] I [socket.c:3576:socket_init]
> > > 0-socket.nfs-server: using system polling thread
> > > [2014-10-20 06:48:43.255241] I [socket.c:3561:socket_init]
> > > 0-socket.nfs-server: SSL support is NOT enabled
> > > [2014-10-20 06:48:43.255264] I [socket.c:3576:socket_init]
> > > 0-socket.nfs-server: using system polling thread
> > > [2014-10-20 06:48:43.257279] I [socket.c:3561:socket_init]
> > > 0-socket.nfs-server: SSL support is NOT enabled
> > > [2014-10-20 06:48:43.257315] I [socket.c:3576:socket_init]
> > > 0-socket.nfs-server: using system polling thread
> > > [2014-10-20 06:48:43.258135] I [socket.c:3561:socket_init] 0-socket.NLM:
> > > SSL support is NOT enabled
> > > [2014-10-20 06:48:43.258157] I [socket.c:3576:socket_init] 0-socket.NLM:
> > > using system polling thread
> > > [2014-10-20 06:48:43.293724] E
> > > [rpcsvc.c:1314:rpcsvc_program_register_portmap] 0-rpc-service: Could not
> > > register with portmap
> > > [2014-10-20 06:48:43.293760] E [nfs.c:332:nfs_init_versions] 0-nfs: Program
> > > NLM4 registration failed
> >
> > The above line suggests that there is already a service registered with
> > the portmapper for the NLM4 program/service. This happens when the kernel
> > module 'lockd' is loaded. The kernel NFS-client and NFS-server depend on
> > this, but unfortunately it conflicts with the Gluster/nfs server.
> >
> > Could you verify that the module is loaded?
> > - use 'lsmod | grep lockd' to check the modules
> > - use 'rpcinfo | grep nlockmgr' to check the rpcbind registrations
> >
> > Make sure that you do not mount any NFS exports on the Gluster server.
> > Unmount all NFS mounts.
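> >
> > For example (a sketch; 'modprobe -r' only succeeds once no NFS client
> > or server is using the lockd module anymore):
> >
> > # umount -a -t nfs,nfs4
> > # modprobe -r lockd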
> >
> > You mentioned you are running CentOS-7, which is systemd based. You
> > should be able to stop any conflicting NFS services like this:
> >
> > # systemctl stop nfs-lock.service
> > # systemctl stop nfs.target
> > # systemctl disable nfs.target
> >
> > If all these services clean up after themselves, you should be able to start
> > the Gluster/nfs service:
> >
> > # systemctl restart glusterd.service
> >
> > In case some bits are still lingering around, it might be easier to
> > reboot after disabling the 'nfs.target'.
> >
> > > [2014-10-20 06:48:43.293771] E [nfs.c:1312:init] 0-nfs: Failed to
> > > initialize protocols
> > > [2014-10-20 06:48:43.293777] E [xlator.c:403:xlator_init] 0-nfs-server:
> > > Initialization of volume 'nfs-server' failed, review your volfile again
> > > [2014-10-20 06:48:43.293783] E [graph.c:307:glusterfs_graph_init]
> > > 0-nfs-server: initializing translator failed
> > > [2014-10-20 06:48:43.293789] E [graph.c:502:glusterfs_graph_activate]
> > > 0-graph: init failed
> > > pending frames:
> > > frame : type(0) op(0)
> > >
> > > patchset: git://git.gluster.com/glusterfs.git
> > > signal received: 11
> > > time of crash: 2014-10-20 06:48:43
> > > configuration details:
> > > argp 1
> > > backtrace 1
> > > dlfcn 1
> > > fdatasync 1
> > > libpthread 1
> > > llistxattr 1
> > > setfsid 1
> > > spinlock 1
> > > epoll.h 1
> > > xattr.h 1
> > > st_atim.tv_nsec 1
> > > package-string: glusterfs 3.5.2
> > > [root@node0 glusterfs]# systemctl status portma
> > > portma.service
> > > Loaded: not-found (Reason: No such file or directory)
> > > Active: inactive (dead)
> > >
> > >
> > >
> > > Also I have checked the rpcbind service.
> > >
> > > [root@node0 glusterfs]# systemctl status rpcbind.service
> > > rpcbind.service - RPC bind service
> > > Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled)
> > > Active: active (running) since Mon 2014-10-20 08:48:39 CEST; 2min 52s ago
> > > Process: 1940 ExecStart=/sbin/rpcbind -w ${RPCBIND_ARGS} (code=exited,
> > > status=0/SUCCESS)
> > > Main PID: 1946 (rpcbind)
> > > CGroup: /system.slice/rpcbind.service
> > > └─1946 /sbin/rpcbind -w
> > >
> > > Oct 20 08:48:39 node0.itsmart.cloud systemd[1]: Starting RPC bind
> > > service...
> > > Oct 20 08:48:39 node0.itsmart.cloud systemd[1]: Started RPC bind service.
> > >
> > > The restart does not solve this problem.
> > >
> > >
> > > I think this is the problem. Why is the portmap status "exited"?
> >
> > The 'portmap' service has been replaced with 'rpcbind' since RHEL-6.
> > They have the same functionality, 'rpcbind' just happens to be the newer
> > version.
> >
> > Did you file a bug for this already? As Vijay mentions, this crash seems
> > to happen because the Gluster/nfs service fails to initialize correctly
> > and then fails to clean up correctly. The cleanup should get fixed, and
> > we should also give an easier-to-understand error message.
> >
> > Thanks,
> > Niels
> >
> > >
> > >
> > > On node1 it is OK:
> > >
> > > [root@node1 ~]# systemctl status rpcbind.service
> > > rpcbind.service - RPC bind service
> > > Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled)
> > > Active: active (running) since Fri 2014-10-17 19:15:21 CEST; 2 days ago
> > > Main PID: 1963 (rpcbind)
> > > CGroup: /system.slice/rpcbind.service
> > > └─1963 /sbin/rpcbind -w
> > >
> > > Oct 17 19:15:21 node1.itsmart.cloud systemd[1]: Starting RPC bind
> > > service...
> > > Oct 17 19:15:21 node1.itsmart.cloud systemd[1]: Started RPC bind service.
> > >
> > >
> > >
> > > Thanks in advance
> > >
> > > Tibor
> > >
> > >
> > >
> > > ----- Original Message -----
> > > > On 10/19/2014 06:56 PM, Niels de Vos wrote:
> > > > > On Sat, Oct 18, 2014 at 01:24:12PM +0200, Demeter Tibor wrote:
> > > > >> Hi,
> > > > >>
> > > > >> [root@node0 ~]# tail -n 20 /var/log/glusterfs/nfs.log
> > > > >> [2014-10-18 07:41:06.136035] E [graph.c:307:glusterfs_graph_init]
> > > > >> 0-nfs-server: initializing translator failed
> > > > >> [2014-10-18 07:41:06.136040] E [graph.c:502:glusterfs_graph_activate]
> > > > >> 0-graph: init failed
> > > > >> pending frames:
> > > > >> frame : type(0) op(0)
> > > > >>
> > > > >> patchset: git://git.gluster.com/glusterfs.git
> > > > >> signal received: 11
> > > > >> time of crash: 2014-10-18 07:41:06
> > > > >> configuration details:
> > > > >> argp 1
> > > > >> backtrace 1
> > > > >> dlfcn 1
> > > > >> fdatasync 1
> > > > >> libpthread 1
> > > > >> llistxattr 1
> > > > >> setfsid 1
> > > > >> spinlock 1
> > > > >> epoll.h 1
> > > > >> xattr.h 1
> > > > >> st_atim.tv_nsec 1
> > > > >> package-string: glusterfs 3.5.2
> > > > >
> > > > > This definitely is a gluster/nfs issue. For whatever reason, the
> > > > > gluster/nfs server crashes :-/ The log does not show enough details;
> > > > > some more lines before this are needed.
> > > > >
> > > >
> > > > I wonder if the crash is due to a cleanup after the translator
> > > > initialization failure. The complete logs might help in understanding
> > > > why the initialization failed.
> > > >
> > > > -Vijay
> > > >
> > > >
> >
>
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users