[Gluster-users] NFS not start on localhost
Niels de Vos
ndevos at redhat.com
Sat Nov 8 02:32:11 UTC 2014
On Fri, Nov 07, 2014 at 07:51:47PM -0500, Jason Russler wrote:
> I've run into this as well, after installing hosted-engine for oVirt
> on a gluster volume. The only way to get things working again for me
> was to manually de-register (rpcinfo -d ...) nlockmgr from the
> portmapper and then restart glusterd. Then gluster's NFS successfully
> registers. I don't really get what's going on though.
Is this on RHEL/CentOS 7? A couple of days back someone on IRC had an
issue with this as well. We found out that "rpcbind.service" uses the
"-w" option by default (for warm-restart). Registered services are
written to a cache file, and upon reboot those services get
re-registered automatically, even when not running.
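A quick way to spot this is to compare what rpcbind advertises with what
is actually listening; for example (a sketch, using the stale nlockmgr
TCP port from the rpcinfo output quoted below):

  # rpcinfo -p | grep nlockmgr
  # ss -tln | grep 54017

If rpcbind lists a port that nothing listens on, the registration most
likely came from the warm-start cache.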
The solution was something like this:
# cp /usr/lib/systemd/system/rpcbind.service /etc/systemd/system/
* edit /etc/systemd/system/rpcbind.service and remove the "-w" option
# systemctl daemon-reload
# systemctl restart rpcbind.service
# systemctl restart glusterd.service
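Afterwards the stale entries should be gone and Gluster/NFS should manage
to register; something like this should confirm it (ports can differ per
setup):

  # rpcinfo -p | grep -E 'nfs|nlockmgr'
  # gluster volume status | grep 'NFS Server'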
I am not sure why "-w" was added by default, but it does not seem to
play nice with Gluster/NFS. Gluster/NFS does not want to break other
registered services, so it bails out when something is registered
already.
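If you would rather not edit the unit file, the manual de-registration
that Jason describes should work as well; a sketch, assuming nlockmgr
(program 100021) is registered for versions 1, 3 and 4 as in the rpcinfo
output below:

  # rpcinfo -d 100021 1
  # rpcinfo -d 100021 3
  # rpcinfo -d 100021 4
  # systemctl restart glusterd.service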
HTH,
Niels
>
> ----- Original Message -----
> From: "Sven Achtelik" <Sven.Achtelik at mailpool.us>
> To: gluster-users at gluster.org
> Sent: Friday, November 7, 2014 5:28:32 PM
> Subject: Re: [Gluster-users] NFS not start on localhost
>
> Hi everyone,
>
> I'm facing the exact same issue on my installation. The nfs.log entries
> indicate that something is blocking the Gluster NFS server from
> registering with rpcbind.
>
> [root@ovirt-one ~]# rpcinfo -p
>    program vers proto   port  service
>     100000    4   tcp    111  portmapper
>     100000    3   tcp    111  portmapper
>     100000    2   tcp    111  portmapper
>     100000    4   udp    111  portmapper
>     100000    3   udp    111  portmapper
>     100000    2   udp    111  portmapper
>     100005    3   tcp  38465  mountd
>     100005    1   tcp  38466  mountd
>     100003    3   tcp   2049  nfs
>     100227    3   tcp   2049  nfs_acl
>     100021    3   udp  34343  nlockmgr
>     100021    4   udp  34343  nlockmgr
>     100021    3   tcp  54017  nlockmgr
>     100021    4   tcp  54017  nlockmgr
>     100024    1   udp  39097  status
>     100024    1   tcp  53471  status
>     100021    1   udp    715  nlockmgr
>
> I'm sure that I'm not using the system NFS server, and I didn't mount
> any NFS share.
>
> @Tibor: Did you solve that issue somehow?
>
> Best,
>
> Sven
>
> Hi,
> Thank you for your reply.
> I followed your recommendations, but there are no changes.
> There is nothing new in the nfs.log.
> [root@node0 glusterfs]# reboot
> Connection to 172.16.0.10 closed by remote host.
> Connection to 172.16.0.10 closed.
> [tdemeter@sirius-31 ~]$ ssh root@172.16.0.10
> root@172.16.0.10's password:
> Last login: Mon Oct 20 11:02:13 2014 from 192.168.133.106
> [root@node0 ~]# systemctl status nfs.target
> nfs.target - Network File System Server
> Loaded: loaded (/usr/lib/systemd/system/nfs.target; disabled)
> Active: inactive (dead)
> [root@node0 ~]# gluster volume status engine
> Status of volume: engine
> Gluster process                                 Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick gs00.itsmart.cloud:/gluster/engine0       50160   Y       3271
> Brick gs01.itsmart.cloud:/gluster/engine1       50160   Y       595
> NFS Server on localhost                         N/A     N       N/A
> Self-heal Daemon on localhost                   N/A     Y       3286
> NFS Server on gs01.itsmart.cloud                2049    Y       6951
> Self-heal Daemon on gs01.itsmart.cloud          N/A     Y       6958
>
> Task Status of Volume engine
> ------------------------------------------------------------------------------
> There are no active volume tasks
> [root@node0 ~]# systemctl status
> Display all 262 possibilities? (y or n)
> [root@node0 ~]# systemctl status nfs-lock
> nfs-lock.service - NFS file locking service.
> Loaded: loaded (/usr/lib/systemd/system/nfs-lock.service; enabled)
> Active: inactive (dead)
> [root@node0 ~]# systemctl stop nfs-lock
> [root@node0 ~]# systemctl restart gluster
> glusterd.service glusterfsd.service gluster.mount
> [root@node0 ~]# systemctl restart gluster
> glusterd.service glusterfsd.service gluster.mount
> [root@node0 ~]# systemctl restart glusterfsd.service
> [root@node0 ~]# systemctl restart glusterd.service
> [root@node0 ~]# gluster volume status engine
> Status of volume: engine
> Gluster process                                 Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick gs00.itsmart.cloud:/gluster/engine0       50160   Y       5140
> Brick gs01.itsmart.cloud:/gluster/engine1       50160   Y       2037
> NFS Server on localhost                         N/A     N       N/A
> Self-heal Daemon on localhost                   N/A     N       N/A
> NFS Server on gs01.itsmart.cloud                2049    Y       6951
> Self-heal Daemon on gs01.itsmart.cloud          N/A     Y       6958
> Any other ideas?
> Tibor
> ----- Original Message -----
> > On Mon, Oct 20, 2014 at 09:04:28AM +0200, Demeter Tibor wrote:
> > > Hi,
> > >
> > > This is the full nfs.log after delete & reboot.
> > > It refers to a portmap registration problem.
> > >
> > > [root@node0 glusterfs]# cat nfs.log
> > > [2014-10-20 06:48:43.221136] I [glusterfsd.c:1959:main]
> > > 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.2
> > > (/usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p
> > > /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S
> > > /var/run/567e0bba7ad7102eae3049e2ad6c3ed7.socket)
> > > [2014-10-20 06:48:43.224444] I [socket.c:3561:socket_init]
> > > 0-socket.glusterfsd: SSL support is NOT enabled
> > > [2014-10-20 06:48:43.224475] I [socket.c:3576:socket_init]
> > > 0-socket.glusterfsd: using system polling thread
> > > [2014-10-20 06:48:43.224654] I [socket.c:3561:socket_init] 0-glusterfs: SSL
> > > support is NOT enabled
> > > [2014-10-20 06:48:43.224667] I [socket.c:3576:socket_init] 0-glusterfs:
> > > using system polling thread
> > > [2014-10-20 06:48:43.235876] I
> > > [rpcsvc.c:2127:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured
> > > rpc.outstanding-rpc-limit with value 16
> > > [2014-10-20 06:48:43.254087] I [socket.c:3561:socket_init]
> > > 0-socket.nfs-server: SSL support is NOT enabled
> > > [2014-10-20 06:48:43.254116] I [socket.c:3576:socket_init]
> > > 0-socket.nfs-server: using system polling thread
> > > [2014-10-20 06:48:43.255241] I [socket.c:3561:socket_init]
> > > 0-socket.nfs-server: SSL support is NOT enabled
> > > [2014-10-20 06:48:43.255264] I [socket.c:3576:socket_init]
> > > 0-socket.nfs-server: using system polling thread
> > > [2014-10-20 06:48:43.257279] I [socket.c:3561:socket_init]
> > > 0-socket.nfs-server: SSL support is NOT enabled
> > > [2014-10-20 06:48:43.257315] I [socket.c:3576:socket_init]
> > > 0-socket.nfs-server: using system polling thread
> > > [2014-10-20 06:48:43.258135] I [socket.c:3561:socket_init] 0-socket.NLM:
> > > SSL support is NOT enabled
> > > [2014-10-20 06:48:43.258157] I [socket.c:3576:socket_init] 0-socket.NLM:
> > > using system polling thread
> > > [2014-10-20 06:48:43.293724] E
> > > [rpcsvc.c:1314:rpcsvc_program_register_portmap] 0-rpc-service: Could not
> > > register with portmap
> > > [2014-10-20 06:48:43.293760] E [nfs.c:332:nfs_init_versions] 0-nfs: Program
> > > NLM4 registration failed
> >
> > The above line suggests that there is already a service registered with
> > the portmapper for the NLM4 program/service. This happens when the kernel
> > module 'lockd' is loaded. The kernel NFS-client and NFS-server depend on
> > this, but unfortunately it conflicts with the Gluster/nfs server.
> >
> > Could you verify that the module is loaded?
> > - use 'lsmod | grep lockd' to check the modules
> > - use 'rpcinfo | grep nlockmgr' to check the rpcbind registrations
> >
> > Make sure that you do not mount any NFS exports on the Gluster server.
> > Unmount all NFS mounts.
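> >
> > For example (a sketch; 'modprobe -r' only succeeds once no NFS client
> > or server is using the lockd module anymore):
> >
> > # umount -a -t nfs,nfs4
> > # modprobe -r lockd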
> >
> > You mentioned you are running CentOS-7, which is systemd based. You
> > should be able to stop any conflicting NFS services like this:
> >
> > # systemctl stop nfs-lock.service
> > # systemctl stop nfs.target
> > # systemctl disable nfs.target
> >
> > If all these services clean up after themselves, you should be able to start
> > the Gluster/nfs service:
> >
> > # systemctl restart glusterd.service
> >
> > In case some bits are still lingering around, it might be easier to
> > reboot after disabling the 'nfs.target'.
> >
> > > [2014-10-20 06:48:43.293771] E [nfs.c:1312:init] 0-nfs: Failed to
> > > initialize protocols
> > > [2014-10-20 06:48:43.293777] E [xlator.c:403:xlator_init] 0-nfs-server:
> > > Initialization of volume 'nfs-server' failed, review your volfile again
> > > [2014-10-20 06:48:43.293783] E [graph.c:307:glusterfs_graph_init]
> > > 0-nfs-server: initializing translator failed
> > > [2014-10-20 06:48:43.293789] E [graph.c:502:glusterfs_graph_activate]
> > > 0-graph: init failed
> > > pending frames:
> > > frame : type(0) op(0)
> > >
> > > patchset: git://git.gluster.com/glusterfs.git
> > > signal received: 11
> > > time of crash: 2014-10-20 06:48:43
> > > configuration details:
> > > argp 1
> > > backtrace 1
> > > dlfcn 1
> > > fdatasync 1
> > > libpthread 1
> > > llistxattr 1
> > > setfsid 1
> > > spinlock 1
> > > epoll.h 1
> > > xattr.h 1
> > > st_atim.tv_nsec 1
> > > package-string: glusterfs 3.5.2
> > > [root@node0 glusterfs]# systemctl status portma
> > > portma.service
> > > Loaded: not-found (Reason: No such file or directory)
> > > Active: inactive (dead)
> > >
> > >
> > >
> > > Also I have checked the rpcbind service.
> > >
> > > [root@node0 glusterfs]# systemctl status rpcbind.service
> > > rpcbind.service - RPC bind service
> > > Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled)
> > > Active: active (running) since Mon 2014-10-20 08:48:39 CEST; 2min 52s ago
> > > Process: 1940 ExecStart=/sbin/rpcbind -w ${RPCBIND_ARGS} (code=exited,
> > > status=0/SUCCESS)
> > > Main PID: 1946 (rpcbind)
> > > CGroup: /system.slice/rpcbind.service
> > > └─1946 /sbin/rpcbind -w
> > >
> > > Oct 20 08:48:39 node0.itsmart.cloud systemd[1]: Starting RPC bind
> > > service...
> > > Oct 20 08:48:39 node0.itsmart.cloud systemd[1]: Started RPC bind service.
> > >
> > > The restart does not solve this problem.
> > >
> > >
> > > I think this is the problem. Why is the portmap status "exited"?
> >
> > The 'portmap' service has been replaced with 'rpcbind' since RHEL-6.
> > They have the same functionality, 'rpcbind' just happens to be the newer
> > version.
> >
> > Did you file a bug for this already? As Vijay mentions, this crash seems
> > to happen because the Gluster/nfs service fails to initialize correctly
> > and then fails to clean up correctly. The cleanup should get fixed, and
> > we should also give an easier-to-understand error message.
> >
> > Thanks,
> > Niels
> >
> > >
> > >
> > > On node1 it is OK:
> > >
> > > [root@node1 ~]# systemctl status rpcbind.service
> > > rpcbind.service - RPC bind service
> > > Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled)
> > > Active: active (running) since Fri 2014-10-17 19:15:21 CEST; 2 days ago
> > > Main PID: 1963 (rpcbind)
> > > CGroup: /system.slice/rpcbind.service
> > > └─1963 /sbin/rpcbind -w
> > >
> > > Oct 17 19:15:21 node1.itsmart.cloud systemd[1]: Starting RPC bind
> > > service...
> > > Oct 17 19:15:21 node1.itsmart.cloud systemd[1]: Started RPC bind service.
> > >
> > >
> > >
> > > Thanks in advance
> > >
> > > Tibor
> > >
> > >
> > >
> > > ----- Original Message -----
> > > > On 10/19/2014 06:56 PM, Niels de Vos wrote:
> > > > > On Sat, Oct 18, 2014 at 01:24:12PM +0200, Demeter Tibor wrote:
> > > > >> Hi,
> > > > >>
> > > > >> [root@node0 ~]# tail -n 20 /var/log/glusterfs/nfs.log
> > > > >> [2014-10-18 07:41:06.136035] E [graph.c:307:glusterfs_graph_init]
> > > > >> 0-nfs-server: initializing translator failed
> > > > >> [2014-10-18 07:41:06.136040] E [graph.c:502:glusterfs_graph_activate]
> > > > >> 0-graph: init failed
> > > > >> pending frames:
> > > > >> frame : type(0) op(0)
> > > > >>
> > > > >> patchset: git://git.gluster.com/glusterfs.git
> > > > >> signal received: 11
> > > > >> time of crash: 2014-10-18 07:41:06
> > > > >> configuration details:
> > > > >> argp 1
> > > > >> backtrace 1
> > > > >> dlfcn 1
> > > > >> fdatasync 1
> > > > >> libpthread 1
> > > > >> llistxattr 1
> > > > >> setfsid 1
> > > > >> spinlock 1
> > > > >> epoll.h 1
> > > > >> xattr.h 1
> > > > >> st_atim.tv_nsec 1
> > > > >> package-string: glusterfs 3.5.2
> > > > >
> > > > > This definitely is a gluster/nfs issue. For whatever reason, the
> > > > > gluster/nfs server crashes :-/ The log does not show enough details;
> > > > > some more lines before this are needed.
> > > > >
> > > >
> > > > I wonder if the crash is due to a cleanup after the translator
> > > > initialization failure. The complete logs might help in understanding
> > > > why the initialization failed.
> > > >
> > > > -Vijay
> > > >
> > > >
> >
>
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users