[Gluster-users] NFS not start on localhost
Demeter Tibor
tdemeter at itsmart.hu
Mon Oct 20 10:53:46 UTC 2014
Hi,
Thank you for your reply.
I followed your recommendations, but there is no change.
There is nothing new in the nfs.log.
[root at node0 glusterfs]# reboot
Connection to 172.16.0.10 closed by remote host.
Connection to 172.16.0.10 closed.
[tdemeter at sirius-31 ~]$ ssh root at 172.16.0.10
root at 172.16.0.10's password:
Last login: Mon Oct 20 11:02:13 2014 from 192.168.133.106
[root at node0 ~]# systemctl status nfs.target
nfs.target - Network File System Server
Loaded: loaded (/usr/lib/systemd/system/nfs.target; disabled)
Active: inactive (dead)
[root at node0 ~]# gluster volume status engine
Status of volume: engine
Gluster process                                 Port   Online  Pid
------------------------------------------------------------------------------
Brick gs00.itsmart.cloud:/gluster/engine0       50160  Y       3271
Brick gs01.itsmart.cloud:/gluster/engine1       50160  Y       595
NFS Server on localhost                         N/A    N       N/A
Self-heal Daemon on localhost                   N/A    Y       3286
NFS Server on gs01.itsmart.cloud                2049   Y       6951
Self-heal Daemon on gs01.itsmart.cloud          N/A    Y       6958
Task Status of Volume engine
------------------------------------------------------------------------------
There are no active volume tasks
[root at node0 ~]# systemctl status
Display all 262 possibilities? (y or n)
[root at node0 ~]# systemctl status nfs-lock
nfs-lock.service - NFS file locking service.
Loaded: loaded (/usr/lib/systemd/system/nfs-lock.service; enabled)
Active: inactive (dead)
[root at node0 ~]# systemctl stop nfs-lock
[root at node0 ~]# systemctl restart gluster
glusterd.service glusterfsd.service gluster.mount
[root at node0 ~]# systemctl restart gluster
glusterd.service glusterfsd.service gluster.mount
[root at node0 ~]# systemctl restart glusterfsd.service
[root at node0 ~]# systemctl restart glusterd.service
[root at node0 ~]# gluster volume status engine
Status of volume: engine
Gluster process                                 Port   Online  Pid
------------------------------------------------------------------------------
Brick gs00.itsmart.cloud:/gluster/engine0       50160  Y       5140
Brick gs01.itsmart.cloud:/gluster/engine1       50160  Y       2037
NFS Server on localhost                         N/A    N       N/A
Self-heal Daemon on localhost                   N/A    N       N/A
NFS Server on gs01.itsmart.cloud                2049   Y       6951
Self-heal Daemon on gs01.itsmart.cloud          N/A    Y       6958
Any other ideas?
Tibor
----- Original message -----
> On Mon, Oct 20, 2014 at 09:04:28AM +0200, Demeter Tibor wrote:
> > Hi,
> >
> > This is the full nfs.log after deleting it and rebooting.
> > It refers to a portmap registration problem.
> >
> > [root at node0 glusterfs]# cat nfs.log
> > [2014-10-20 06:48:43.221136] I [glusterfsd.c:1959:main]
> > 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.2
> > (/usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p
> > /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S
> > /var/run/567e0bba7ad7102eae3049e2ad6c3ed7.socket)
> > [2014-10-20 06:48:43.224444] I [socket.c:3561:socket_init]
> > 0-socket.glusterfsd: SSL support is NOT enabled
> > [2014-10-20 06:48:43.224475] I [socket.c:3576:socket_init]
> > 0-socket.glusterfsd: using system polling thread
> > [2014-10-20 06:48:43.224654] I [socket.c:3561:socket_init] 0-glusterfs: SSL
> > support is NOT enabled
> > [2014-10-20 06:48:43.224667] I [socket.c:3576:socket_init] 0-glusterfs:
> > using system polling thread
> > [2014-10-20 06:48:43.235876] I
> > [rpcsvc.c:2127:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured
> > rpc.outstanding-rpc-limit with value 16
> > [2014-10-20 06:48:43.254087] I [socket.c:3561:socket_init]
> > 0-socket.nfs-server: SSL support is NOT enabled
> > [2014-10-20 06:48:43.254116] I [socket.c:3576:socket_init]
> > 0-socket.nfs-server: using system polling thread
> > [2014-10-20 06:48:43.255241] I [socket.c:3561:socket_init]
> > 0-socket.nfs-server: SSL support is NOT enabled
> > [2014-10-20 06:48:43.255264] I [socket.c:3576:socket_init]
> > 0-socket.nfs-server: using system polling thread
> > [2014-10-20 06:48:43.257279] I [socket.c:3561:socket_init]
> > 0-socket.nfs-server: SSL support is NOT enabled
> > [2014-10-20 06:48:43.257315] I [socket.c:3576:socket_init]
> > 0-socket.nfs-server: using system polling thread
> > [2014-10-20 06:48:43.258135] I [socket.c:3561:socket_init] 0-socket.NLM:
> > SSL support is NOT enabled
> > [2014-10-20 06:48:43.258157] I [socket.c:3576:socket_init] 0-socket.NLM:
> > using system polling thread
> > [2014-10-20 06:48:43.293724] E
> > [rpcsvc.c:1314:rpcsvc_program_register_portmap] 0-rpc-service: Could not
> > register with portmap
> > [2014-10-20 06:48:43.293760] E [nfs.c:332:nfs_init_versions] 0-nfs: Program
> > NLM4 registration failed
>
> The above line suggests that there is already a service registered with
> the portmapper for the NLM4 program/service. This happens when the
> kernel module 'lockd' is loaded. The kernel NFS client and NFS server
> depend on it, but unfortunately it conflicts with the Gluster/nfs server.
>
> Could you verify that the module is loaded?
> - use 'lsmod | grep lockd' to check the modules
> - use 'rpcinfo | grep nlockmgr' to check the rpcbind registrations
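>
> If 'lockd' shows up in both, the kernel lock manager is holding the NLM4
> registration that Gluster/nfs needs. Once nothing uses the module any
> more (no NFS mounts, kernel NFS services stopped) it can be unloaded;
> this is only a sketch, modprobe will refuse while the module is in use:
>
> # modprobe -r lockd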
>
> Make sure that you do not mount any NFS exports on the Gluster server.
> Unmount all NFS mounts.
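>
> For example, to list the NFS mounts and unmount them (the mount point
> below is just a placeholder, use whatever the listing shows):
>
> # findmnt -t nfs,nfs4
> # umount /mnt/example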
>
> You mentioned you are running CentOS-7, which is systemd-based. You
> should be able to stop any conflicting NFS services like this:
>
> # systemctl stop nfs-lock.service
> # systemctl stop nfs.target
> # systemctl disable nfs.target
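>
> If some other unit keeps starting nfs-lock.service again, masking it
> should prevent that (optional, and reversible with 'systemctl unmask'):
>
> # systemctl mask nfs-lock.service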
>
> If all these services clean up after themselves, you should be able to start
> the Gluster/nfs service:
>
> # systemctl restart glusterd.service
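>
> After that, 'gluster volume status engine' should show "NFS Server on
> localhost" with Online "Y" and port 2049, just like it already does for
> gs01.itsmart.cloud. If it still shows "N", the tail of
> /var/log/glusterfs/nfs.log should contain the reason for the failure:
>
> # gluster volume status engine
> # tail -n 20 /var/log/glusterfs/nfs.log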
>
> In case some bits are still lingering around, it might be easier to
> reboot after disabling the 'nfs.target'.
>
> > [2014-10-20 06:48:43.293771] E [nfs.c:1312:init] 0-nfs: Failed to
> > initialize protocols
> > [2014-10-20 06:48:43.293777] E [xlator.c:403:xlator_init] 0-nfs-server:
> > Initialization of volume 'nfs-server' failed, review your volfile again
> > [2014-10-20 06:48:43.293783] E [graph.c:307:glusterfs_graph_init]
> > 0-nfs-server: initializing translator failed
> > [2014-10-20 06:48:43.293789] E [graph.c:502:glusterfs_graph_activate]
> > 0-graph: init failed
> > pending frames:
> > frame : type(0) op(0)
> >
> > patchset: git://git.gluster.com/glusterfs.git
> > signal received: 11
> > time of crash: 2014-10-20 06:48:43
> > configuration details:
> > argp 1
> > backtrace 1
> > dlfcn 1
> > fdatasync 1
> > libpthread 1
> > llistxattr 1
> > setfsid 1
> > spinlock 1
> > epoll.h 1
> > xattr.h 1
> > st_atim.tv_nsec 1
> > package-string: glusterfs 3.5.2
> > [root at node0 glusterfs]# systemctl status portma
> > portma.service
> > Loaded: not-found (Reason: No such file or directory)
> > Active: inactive (dead)
> >
> >
> >
> > Also I have checked the rpcbind service.
> >
> > [root at node0 glusterfs]# systemctl status rpcbind.service
> > rpcbind.service - RPC bind service
> > Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled)
> > Active: active (running) since h 2014-10-20 08:48:39 CEST; 2min 52s ago
> > Process: 1940 ExecStart=/sbin/rpcbind -w ${RPCBIND_ARGS} (code=exited,
> > status=0/SUCCESS)
> > Main PID: 1946 (rpcbind)
> > CGroup: /system.slice/rpcbind.service
> > └─1946 /sbin/rpcbind -w
> >
> > okt 20 08:48:39 node0.itsmart.cloud systemd[1]: Starting RPC bind
> > service...
> > okt 20 08:48:39 node0.itsmart.cloud systemd[1]: Started RPC bind service.
> >
> > The restart does not solve this problem.
> >
> >
> > I think this is the problem. Why does the portmap status show "exited"?
>
> The 'portmap' service has been replaced by 'rpcbind' since RHEL-6.
> They provide the same functionality; 'rpcbind' is just the newer
> version.
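>
> To see what is currently registered with rpcbind (the nlockmgr entry is
> the one that conflicts here), you can run:
>
> # rpcinfo -p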
>
> Did you file a bug for this already? As Vijay mentions, this crash seems
> to happen because the Gluster/nfs service fails to initialize correctly
> and then fails to clean up correctly. The cleanup should get fixed, and
> we should also provide an easier-to-understand error message.
>
> Thanks,
> Niels
>
> >
> >
> > On node1 is ok:
> >
> > [root at node1 ~]# systemctl status rpcbind.service
> > rpcbind.service - RPC bind service
> > Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled)
> > Active: active (running) since p 2014-10-17 19:15:21 CEST; 2 days ago
> > Main PID: 1963 (rpcbind)
> > CGroup: /system.slice/rpcbind.service
> > └─1963 /sbin/rpcbind -w
> >
> > okt 17 19:15:21 node1.itsmart.cloud systemd[1]: Starting RPC bind
> > service...
> > okt 17 19:15:21 node1.itsmart.cloud systemd[1]: Started RPC bind service.
> >
> >
> >
> > Thanks in advance
> >
> > Tibor
> >
> >
> >
> > ----- Original message -----
> > > On 10/19/2014 06:56 PM, Niels de Vos wrote:
> > > > On Sat, Oct 18, 2014 at 01:24:12PM +0200, Demeter Tibor wrote:
> > > >> Hi,
> > > >>
> > > >> [root at node0 ~]# tail -n 20 /var/log/glusterfs/nfs.log
> > > >> [2014-10-18 07:41:06.136035] E [graph.c:307:glusterfs_graph_init]
> > > >> 0-nfs-server: initializing translator failed
> > > >> [2014-10-18 07:41:06.136040] E [graph.c:502:glusterfs_graph_activate]
> > > >> 0-graph: init failed
> > > >> pending frames:
> > > >> frame : type(0) op(0)
> > > >>
> > > >> patchset: git://git.gluster.com/glusterfs.git
> > > >> signal received: 11
> > > >> time of crash: 2014-10-18 07:41:06
> > > >> configuration details:
> > > >> argp 1
> > > >> backtrace 1
> > > >> dlfcn 1
> > > >> fdatasync 1
> > > >> libpthread 1
> > > >> llistxattr 1
> > > >> setfsid 1
> > > >> spinlock 1
> > > >> epoll.h 1
> > > >> xattr.h 1
> > > >> st_atim.tv_nsec 1
> > > >> package-string: glusterfs 3.5.2
> > > >
> > > > This definitely is a gluster/nfs issue. For whatever reason, the
> > > > gluster/nfs server crashes :-/ The log does not show enough details,
> > > > some more lines before this are needed.
> > > >
> > >
> > > I wonder if the crash is due to a cleanup after the translator
> > > initialization failure. The complete logs might help in understanding
> > > why the initialization failed.
> > >
> > > -Vijay
> > >
> > >
>