[Gluster-users] [Gluster-devel] v3.6.2
David F. Robinson
david.robinson at corvidtec.com
Tue Jan 27 15:49:41 UTC 2015
I rebooted the machine to see if the problem would return and it does.
Same issue after a reboot.
Any suggestions?
One other thing I tested was to comment out the NFS mounts in
/etc/fstab:
# gfsib01bkp.corvidtec.com:/homegfs_bkp /backup_nfs/homegfs nfs vers=3,intr,bg,rsize=32768,wsize=32768 0 0
After the machine comes back up, I remove the comment and do a 'mount
-a'. The mount works fine.
It looks like a timing issue during startup. Is it trying to do
the NFS mount while glusterd is still starting up?
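
If the cause is ordering at boot, one possible workaround is to delay the NFS mount until an NFS server has registered with the portmapper. Below is a minimal sketch, not tested on this system: the function names, the `--mount` argument, the retry count and the delay are illustrative assumptions, and it only looks for RPC program 100003 (nfs) version 3 over TCP in the `rpcinfo -p` output.

```shell
#!/bin/sh
# Sketch: wait until an NFS service (RPC program 100003) is registered with
# the portmapper, then mount the NFS entries from /etc/fstab.
# All names below are hypothetical; nothing here comes from Gluster itself.

# Reads `rpcinfo -p` output on stdin; succeeds if NFS v3 over TCP is listed.
nfs_registered() {
    awk '$1 == 100003 && $2 == 3 && $3 == "tcp" { found = 1 } END { exit !found }'
}

# Polls `rpcinfo -p` up to $1 times, $2 seconds apart (arbitrary defaults).
wait_for_nfs() {
    tries=${1:-30}
    delay=${2:-2}
    while [ "$tries" -gt 0 ]; do
        if rpcinfo -p 2>/dev/null | nfs_registered; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}

# Only act when explicitly asked, so the file can be sourced safely.
if [ "${1:-}" = "--mount" ]; then
    wait_for_nfs && mount -a -t nfs
fi
```

Sourcing the file only defines the functions. Running it with `--mount` from a late boot script would wait and then run `mount -a -t nfs`.
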
David
------ Original Message ------
From: "Xavier Hernandez" <xhernandez at datalab.es>
To: "David F. Robinson" <david.robinson at corvidtec.com>; "Kaushal M"
<kshlmster at gmail.com>
Cc: "Gluster Users" <gluster-users at gluster.org>; "Gluster Devel"
<gluster-devel at gluster.org>
Sent: 1/27/2015 10:02:31 AM
Subject: Re: [Gluster-devel] [Gluster-users] v3.6.2
>Hi,
>
>I had a similar problem once. It happened after doing some unrelated
>tests with NFS. I thought it was a problem I generated doing weird
>things, so I didn't investigate the cause further.
>
>To see if this is the same case, try this:
>
>* Unmount all NFS mounts and stop all gluster volumes
>* Check that there are no gluster processes running (ps ax | grep
>gluster), especially any glusterfs. glusterd is ok.
>* Check that there are no NFS processes running (ps ax | grep nfs)
>* Check with 'rpcinfo -p' that there's no nfs service registered
>
>The output should be similar to this:
>
> program vers proto port service
> 100000 4 tcp 111 portmapper
> 100000 3 tcp 111 portmapper
> 100000 2 tcp 111 portmapper
> 100000 4 udp 111 portmapper
> 100000 3 udp 111 portmapper
> 100000 2 udp 111 portmapper
> 100024 1 udp 33482 status
> 100024 1 tcp 37034 status
>
>If there are more services registered, you can directly delete them or
>check if they correspond to an active process. For example, if the
>output is this:
>
> program vers proto port service
> 100000 4 tcp 111 portmapper
> 100000 3 tcp 111 portmapper
> 100000 2 tcp 111 portmapper
> 100000 4 udp 111 portmapper
> 100000 3 udp 111 portmapper
> 100000 2 udp 111 portmapper
> 100021 3 udp 39618 nlockmgr
> 100021 3 tcp 41067 nlockmgr
> 100024 1 udp 33482 status
> 100024 1 tcp 37034 status
>
>You can do a "netstat -anp | grep 39618" to see if there is some
>process really listening at the nlockmgr port. You can repeat this for
>port 41067. If there is some process, you should stop it. If there is
>no process listening on that port, you should remove it with a command
>like this:
>
> rpcinfo -d 100021 3
>
>You must execute this command for all stale ports for any services
>other than portmapper and status. Once done you should get the output
>shown before.
>
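
The check-and-delete procedure above can be sketched as a dry run. This is a minimal sketch under assumptions: the function names are hypothetical, it takes the text of `rpcinfo -p` and `netstat -an` as arguments, and it only prints the `rpcinfo -d` commands for review instead of running them.

```shell
#!/bin/sh
# Sketch: suggest `rpcinfo -d` commands for registrations other than
# portmapper (100000) and status (100024). Nothing is deleted here.

# Reads `rpcinfo -p` output on stdin; prints "program version port" for
# every registration that is neither portmapper nor status.
candidate_registrations() {
    awk '$1 ~ /^[0-9]+$/ && $1 != 100000 && $1 != 100024 { print $1, $2, $4 }'
}

# Reads `netstat -an` output on stdin; succeeds if any local address
# (fourth column) ends in ":<port>".
port_in_use() {
    awk -v port="$1" '$4 ~ (":" port "$") { found = 1 } END { exit !found }'
}

# $1: text of `rpcinfo -p`, $2: text of `netstat -an`.
cleanup_commands() {
    rpc_out=$1
    net_out=$2
    printf '%s\n' "$rpc_out" | candidate_registrations |
    while read -r prog vers port; do
        if printf '%s\n' "$net_out" | port_in_use "$port"; then
            echo "# port $port is in use; stop that process instead"
        else
            echo "rpcinfo -d $prog $vers"
        fi
    done | awk '!seen[$0]++'   # udp and tcp entries yield the same command
}
```

It could be called as `cleanup_commands "$(rpcinfo -p)" "$(netstat -an)"`. Review the printed commands before running any of them.
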
>After that, you can try to start your volume and see if everything is
>registered (rpcinfo -p) and if gluster has started the nfs server
>(gluster volume status).
>
>If everything is ok, you should be able to mount the volume using NFS.
>
>Xavi
>
>On 01/27/2015 03:18 PM, David F. Robinson wrote:
>>Turning off nfslock did not help. Also, still getting these messages
>>every 3 seconds:
>>
>>[2015-01-27 14:16:12.921880] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>[2015-01-27 14:16:15.922431] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>[2015-01-27 14:16:18.923080] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>[2015-01-27 14:16:21.923748] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>[2015-01-27 14:16:24.924472] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>[2015-01-27 14:16:27.925192] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>[2015-01-27 14:16:30.925895] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>[2015-01-27 14:16:33.926563] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>[2015-01-27 14:16:36.927248] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>------ Original Message ------
>>From: "Kaushal M" <kshlmster at gmail.com>
>>To: "David F. Robinson" <david.robinson at corvidtec.com>
>>Cc: "Joe Julian" <joe at julianfamily.org>; "Gluster Users"
>><gluster-users at gluster.org>; "Gluster Devel"
>><gluster-devel at gluster.org>
>>Sent: 1/27/2015 1:49:56 AM
>>Subject: Re: Re[2]: [Gluster-devel] [Gluster-users] v3.6.2
>>>Your nfs.log file has the following lines,
>>>```
>>>[2015-01-26 20:06:58.298078] E
>>>[rpcsvc.c:1303:rpcsvc_program_register_portmap] 0-rpc-service: Could
>>>not register with portmap 100021 4 38468
>>>[2015-01-26 20:06:58.298108] E [nfs.c:331:nfs_init_versions] 0-nfs:
>>>Program NLM4 registration failed
>>>```
>>>
>>>The Gluster NFS server has its own NLM (nlockmgr) implementation. You
>>>said that you have the nfslock service on. Can you turn that off and
>>>try again?
>>>
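
To list every registration that failed, not only NLM4, lines of this form can be filtered out of nfs.log. A minimal sketch, assuming the log lives at /var/log/glusterfs/nfs.log (adjust if yours differs) and that each entry sits on a single line in the file; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: extract "program version port" from portmap registration failures.

# Reads log text on stdin and prints one "program version port" triple per
# "Could not register with portmap" line.
failed_registrations() {
    sed -n 's/.*Could not register with portmap \([0-9][0-9]*\) \([0-9][0-9]*\) \([0-9][0-9]*\).*/\1 \2 \3/p'
}

# Possible use (path is an assumption):
#   failed_registrations < /var/log/glusterfs/nfs.log | sort -u
```

Each triple can then be compared with the `rpcinfo -p` output to see which service already holds that program number.
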
>>>~kaushal
>>>
>>>On Tue, Jan 27, 2015 at 11:21 AM, David F. Robinson
>>><david.robinson at corvidtec.com> wrote:
>>>
>>> A different system, where gluster-NFS (not kernel-nfs) is
>>> working properly, shows the following:
>>> [root at gfs01a glusterfs]# rpcinfo -p
>>> program vers proto port service
>>> 100000 4 tcp 111 portmapper
>>> 100000 3 tcp 111 portmapper
>>> 100000 2 tcp 111 portmapper
>>> 100000 4 udp 111 portmapper
>>> 100000 3 udp 111 portmapper
>>> 100000 2 udp 111 portmapper
>>> 100005 3 tcp 38465 mountd
>>> 100005 1 tcp 38466 mountd
>>> 100003 3 tcp 2049 nfs
>>> 100024 1 udp 42413 status
>>> 100024 1 tcp 35424 status
>>> 100021 4 tcp 38468 nlockmgr
>>> 100021 1 udp 801 nlockmgr
>>> 100227 3 tcp 2049 nfs_acl
>>> 100021 1 tcp 804 nlockmgr
>>>
>>> [root at gfs01a glusterfs]# /etc/init.d/nfs status
>>> rpc.svcgssd is stopped
>>> rpc.mountd is stopped
>>> nfsd is stopped
>>> rpc.rquotad is stopped
>>> ------ Original Message ------
>>> From: "Joe Julian" <joe at julianfamily.org>
>>> To: "Kaushal M" <kshlmster at gmail.com>; "David F. Robinson"
>>> <david.robinson at corvidtec.com>
>>> Cc: "Gluster Users" <gluster-users at gluster.org>; "Gluster Devel"
>>> <gluster-devel at gluster.org>
>>> Sent: 1/27/2015 12:48:49 AM
>>> Subject: Re: [Gluster-devel] [Gluster-users] v3.6.2
>>>> If that were true, wouldn't it not return "Connection refused",
>>>> since the kernel nfs would be listening?
>>>>
>>>> On January 26, 2015 9:43:34 PM PST, Kaushal M
>>>> <kshlmster at gmail.com> wrote:
>>>>
>>>> Seems like you have the kernel NFS server running. The
>>>> `rpcinfo -p` output you provided shows that there are other
>>>> mountd, nfs and nlockmgr services running on your system.
>>>> Gluster NFS server requires that the kernel nfs services be
>>>> disabled and not running.
>>>>
>>>> ~kaushal
>>>>
>>>> On Tue, Jan 27, 2015 at 10:56 AM, David F. Robinson
>>>> <david.robinson at corvidtec.com> wrote:
>>>>
>>>> [root at gfs01bkp ~]# gluster volume status homegfs_bkp
>>>> Status of volume: homegfs_bkp
>>>> Gluster process                                              Port   Online  Pid
>>>> ------------------------------------------------------------------------------
>>>> Brick gfsib01bkp.corvidtec.com:/data/brick01bkp/homegfs_bkp  49152  Y       4087
>>>> Brick gfsib01bkp.corvidtec.com:/data/brick02bkp/homegfs_bkp  49155  Y       4092
>>>> NFS Server on localhost                                      N/A    N       N/A
>>>>
>>>> Task Status of Volume homegfs_bkp
>>>> ------------------------------------------------------------------------------
>>>> Task   : Rebalance
>>>> ID     : 6d4c6c4e-16da-48c9-9019-dccb7d2cfd66
>>>> Status : completed
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> ------ Original Message ------
>>>> From: "Atin Mukherjee" <amukherj at redhat.com>
>>>> To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>; "Justin Clift"
>>>> <justin at gluster.org>; "David F. Robinson"
>>>> <david.robinson at corvidtec.com>
>>>> Cc: "Gluster Users" <gluster-users at gluster.org>; "Gluster Devel"
>>>> <gluster-devel at gluster.org>
>>>> Sent: 1/26/2015 11:51:13 PM
>>>> Subject: Re: [Gluster-devel] v3.6.2
>>>>
>>>>
>>>>
>>>> On 01/27/2015 07:33 AM, Pranith Kumar Karampuri wrote:
>>>>
>>>>
>>>> On 01/26/2015 09:41 PM, Justin Clift wrote:
>>>>
>>>> On 26 Jan 2015, at 14:50, David F. Robinson
>>>> <david.robinson at corvidtec.com> wrote:
>>>>
>>>> I have a server with v3.6.2 from which I
>>>> cannot mount using NFS. The FUSE mount
>>>> works; however, I cannot get the NFS
>>>> mount to work. From /var/log/messages:
>>>> Jan 26 09:27:28 gfs01bkp mount[2810]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying
>>>> Jan 26 09:27:53 gfs01bkp mount[4456]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying
>>>> Jan 26 09:29:28 gfs01bkp mount[2810]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying
>>>> Jan 26 09:29:53 gfs01bkp mount[4456]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying
>>>> Jan 26 09:31:28 gfs01bkp mount[2810]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying
>>>> Jan 26 09:31:53 gfs01bkp mount[4456]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying
>>>> Jan 26 09:33:28 gfs01bkp mount[2810]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying
>>>> Jan 26 09:33:53 gfs01bkp mount[4456]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying
>>>> Jan 26 09:35:28 gfs01bkp mount[2810]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying
>>>> Jan 26 09:35:53 gfs01bkp mount[4456]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying
>>>> I am also continually getting the following errors in
>>>> /var/log/glusterfs:
>>>> [root at gfs01bkp glusterfs]# tail -f etc-glusterfs-glusterd.vol.log
>>>> [2015-01-26 14:41:51.260827] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>>> [2015-01-26 14:41:54.261240] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>>> [2015-01-26 14:41:57.261642] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>>> [2015-01-26 14:42:00.262073] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>>> [2015-01-26 14:42:03.262504] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>>> [2015-01-26 14:42:06.262935] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>>> [2015-01-26 14:42:09.263334] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>>> [2015-01-26 14:42:12.263761] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>>> [2015-01-26 14:42:15.264177] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>>> [2015-01-26 14:42:18.264623] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>>> [2015-01-26 14:42:21.265053] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>>> [2015-01-26 14:42:24.265504] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
>>>>
>>>> I believe this error message comes when the
>>>> socket file is not present.
>>>> I see the following commit which changed the
>>>> location of the sockets.
>>>> Maybe Atin knows more about this: +Atin.
>>>>
>>>> Can we get the output of 'gluster volume status' for
>>>> the volume which
>>>> you are trying to mount?
>>>>
>>>> ~Atin
>>>>
>>>> Pranith
>>>>
>>>> ^C
>>>> Also, when I try to NFS mount my
>>>> gluster volume, I am getting
>>>>
>>>> Any chance there's a network or host based
>>>> firewall stopping some of
>>>> the ports?
>>>>
>>>> + Justin
>>>>
>>>> --
>>>> GlusterFS - http://www.gluster.org/
>>>>
>>>> An open source, distributed file system
>>>> scaling to several
>>>> petabytes, and handling thousands of
>>>>clients.
>>>>
>>>> My personal twitter:
>>>> twitter.com/realjustinclift
>>>>
>>>>
>>>>_________________________________________________
>>>> Gluster-devel mailing list
>>>> Gluster-devel at gluster.org
>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>>
>>>>
>>>>
>>>> _________________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Sent from my Android device with K-9 Mail. Please excuse my
>>>>brevity.
>>>
>>>
>>
>>
>>