[Gluster-users] GlusterFS (3.3.1) - issues with NLM - RPC Program procedure not available

Charles Cooke charles at coupa.com
Thu Jun 6 12:19:42 UTC 2013


Hi Michael,

Thanks for the information - you were right on both counts.   Disabling the
kernel NFS & mounting from a separate instance resolved the issue


On Wed, Jun 5, 2013 at 8:08 AM, Michael Brown <michael at netdirect.ca> wrote:

>  I found that when I was exporting NFS from gluster I had to be careful
> not to load the kernel nfs modules (such as lockd) or things would get
> messed up. I think that's happened to you.
>
> Is gluster lockd in use? Do this:
>
> [root at fearless1 ~]# rpcinfo -p | grep 100021
>
>     100021    4   tcp  38468  nlockmgr
>     100021    1   udp    872  nlockmgr
>     100021    1   tcp    874  nlockmgr
> [root at fearless1 ~]# ss -lp sport = :38468
> State      Recv-Q Send-Q      Local Address:Port          Peer Address:Port
> LISTEN     0      10                      *:38468
> *:*        users:(("glusterfs",28675,13))
>
> Yes it is!
>
> If not:
> * ensure no nfs mounts are on gluster servers
> * rmmod nfs
> * rmmod lockd
> * 'gluster volume volname start force' should ensure all the daemons are
> up.
>
>
> On 13-06-05 10:59 AM, Charles Cooke wrote:
>
> Hello,
>
>  We had some issues with context switches using the GlusterFS FUSE client
> - & we are processing a large number of small files.
>
>  We decided to switch to NFS to see if that would increase performance,
> but are having issues with the NLM locking functionality.
>
>  Using http://wiki.samba.org/index.php/Ping_pong - we are able to verify
> that a single F_SETLKW call is blocking indefinitely.
>
>  ping_pong -r glusterfs_test.txt 1
> Hangs on this line blocking for a wait lock:
>       return fcntl(fd,F_SETLKW,&lock);
>
>  The root cause appears to be this in the glusterfs/nfs.log:
> *RPC Program procedure not available for procedure 5 in NLM4*
> *
> *
> Is there any more information on this error, or how it can be resolved?
>
>  rpcinfo -p
>    program vers proto   port
>     100000    2   tcp    111  portmapper
>     100000    2   udp    111  portmapper
>      100021    3   udp  45739  nlockmgr
>     100021    3   tcp  58923  nlockmgr
>     100005    3   tcp  38465  mountd
>     100005    1   tcp  38466  mountd
>     100003    3   tcp  38467  nfs
>     100021    4   tcp  38468  nlockmgr
>     100024    1   udp    791  status
>      100024    1   tcp    794  status
>     100021    1   udp    783  nlockmgr
>     100021    1   tcp    785  nlockmgr
>
>
>  /var/log/messages
> Jun  5 10:53:48 ip-10-139-8-175 kernel: lockd: nlm_lookup_host(0a8b08af,
> p=6, v=4)
> Jun  5 14:53:48 ip-10-139-8-175 kernel: lockd: host garbage collection
> Jun  5 14:53:48 ip-10-139-8-175 kernel: lockd: nlmsvc_mark_resources
> Jun  5 10:53:48 ip-10-139-8-175 kernel: lockd: delete host 10.139.8.175
>  Jun  5 10:53:48 ip-10-139-8-175 kernel: lockd: nsm_unmonitor(10.139.8.175)
> Jun  5 10:53:48 ip-10-139-8-175 kernel: lockd: creating host entry
> Jun  5 10:53:48 ip-10-139-8-175 kernel: lockd: get host 10.139.8.175
> Jun  5 10:53:48 ip-10-139-8-175 kernel: lockd: nsm_monitor(10.139.8.175)
> Jun  5 10:53:48 ip-10-139-8-175 kernel: nsm: xdr_decode_stat_res status 0
> state 39
>  Jun  5 10:53:48 ip-10-139-8-175 kernel: lockd: call procedure 2 on
> 10.139.8.175
> Jun  5 10:53:48 ip-10-139-8-175 kernel: lockd: nlm_bind_host(0a8b08af)
> Jun  5 14:53:48 ip-10-139-8-175 kernel: lockd: server returns status
> 50331648
>
>  /var/log/glusterfs/nfs.log
> [2013-06-05 14:53:48.147647] T [nfs-fops.c:131:nfs_create_frame] 0-nfs:
> uid: 0, gid 0, gids: 7
>  [2013-06-05 14:53:48.147664] T [nfs-fops.c:133:nfs_create_frame] 0-nfs:
> gid: 0
> [2013-06-05 14:53:48.147680] T [nfs-fops.c:133:nfs_create_frame] 0-nfs:
> gid: 1
> [2013-06-05 14:53:48.147695] T [nfs-fops.c:133:nfs_create_frame] 0-nfs:
> gid: 2
> [2013-06-05 14:53:48.147710] T [nfs-fops.c:133:nfs_create_frame] 0-nfs:
> gid: 3
> [2013-06-05 14:53:48.147736] T [nfs-fops.c:133:nfs_create_frame] 0-nfs:
> gid: 4
>  [2013-06-05 14:53:48.147752] T [nfs-fops.c:133:nfs_create_frame] 0-nfs:
> gid: 6
> [2013-06-05 14:53:48.147767] T [nfs-fops.c:133:nfs_create_frame] 0-nfs:
> gid: 10
> [2013-06-05 14:53:48.147798] T [rpc-clnt.c:1308:rpc_clnt_record]
> 0-nfs_test-client-0: Auth Info: pid: 1, uid: 0, gid: 0, owner:
> 304069702d31302d-3133392d382d3137-35
> [2013-06-05 14:53:48.147821] T
> [rpc-clnt.c:1188:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen
> 196, payload: 88, rpc hdr: 108
> [2013-06-05 14:53:48.147861] T [rpc-clnt.c:1507:rpc_clnt_submit]
> 0-rpc-clnt: submitted request (XID: 0x27x Program: GlusterFS 3.1, ProgVers:
> 330, Proc: 26) to rpc-transport (nfs_test-client-0)
> [2013-06-05 14:53:48.148249] T [rpc-clnt.c:685:rpc_clnt_reply_init]
> 0-nfs_test-client-0: received rpc message (RPC XID: 0x27x Program:
> GlusterFS 3.1, ProgVers: 330, Proc: 26) from rpc-transport
> (nfs_test-client-0)
> [2013-06-05 14:53:48.148293] T [rpc-clnt.c:1308:rpc_clnt_record]
> 0-nfs_test-client-1: Auth Info: pid: 1, uid: 0, gid: 0, owner:
> 304069702d31302d-3133392d382d3137-35
> [2013-06-05 14:53:48.148314] T
> [rpc-clnt.c:1188:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen
> 196, payload: 88, rpc hdr: 108
> [2013-06-05 14:53:48.148349] T [rpc-clnt.c:1507:rpc_clnt_submit]
> 0-rpc-clnt: submitted request (XID: 0x48x Program: GlusterFS 3.1, ProgVers:
> 330, Proc: 26) to rpc-transport (nfs_test-client-1)
> [2013-06-05 14:53:48.149264] T [rpc-clnt.c:685:rpc_clnt_reply_init]
> 0-nfs_test-client-1: received rpc message (RPC XID: 0x48x Program:
> GlusterFS 3.1, ProgVers: 330, Proc: 26) from rpc-transport
> (nfs_test-client-1)
> [2013-06-05 14:53:48.149300] D [fd-lk.c:456:fd_lk_insert_and_merge]
> 0-fd-lk: new lock requrest: owner = 304069702d31302d-3133392d382d3137-35,
> fl_type = F_WRLCK, fs_start = 0, fs_end = 0, user_flock: l_type = F_WRLCK,
> l_start = 0, l_len = 1
> [2013-06-05 14:53:48.149320] D [fd-lk.c:419:print_lock_list] 0-fd-lk: lock
> list:
> [2013-06-05 14:53:48.149340] D [fd-lk.c:431:print_lock_list] 0-fd-lk:
> owner = 304069702d31302d-3133392d382d3137-35, cmd = F_SETLKW fl_type =
> F_WRLCK, fs_start = 0, fs_end = 0, user_flock: l_type = F_WRLCK, l_start =
> 0, l_len = 1,
> [2013-06-05 14:53:48.149371] T [rpc-clnt.c:1308:rpc_clnt_record]
> 0-NLM-client: Auth Info: pid: 1, uid: 0, gid: 0, owner:
> 304069702d31302d-3133392d382d3137-35
> [2013-06-05 14:53:48.149392] T
> [rpc-clnt.c:1188:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen
> 156, payload: 116, rpc hdr: 40
> [2013-06-05 14:53:48.149430] T [rpc-clnt.c:1507:rpc_clnt_submit]
> 0-rpc-clnt: submitted request (XID: 0x19x Program: NLMv4, ProgVers: 4,
> Proc: 5) to rpc-transport (NLM-client)
>  [2013-06-05 14:53:48.149452] T [nfs3.c:447:nfs3_call_state_wipe]
> 0-nfs-nfsv3: fd 0x1c9f53fc ref: 4
> [2013-06-05 14:53:48.149495] T [rpcsvc.c:470:rpcsvc_handle_rpc_call]
> 0-rpcsvc: Client port: 1009
> [2013-06-05 14:53:48.149516] T
> [rpcsvc-auth.c:305:rpcsvc_auth_request_init] 0-rpc-service: Auth handler:
> AUTH_NULL
> [2013-06-05 14:53:48.149533] T [rpcsvc.c:382:rpcsvc_request_create]
> 0-rpc-service: received rpc-message (XID: 0x13, Ver: 2, Program: 100021,
> ProgVers: 4, Proc: 5) from rpc-transport (socket.NLM)
> [2013-06-05 14:53:48.149551] E [rpcsvc.c:203:rpcsvc_program_actor]
> 0-rpc-service: RPC Program procedure not available for procedure 5 in NLM4
> [2013-06-05 14:53:48.149598] D [rpcsvc.c:1113:rpcsvc_error_reply]
> (-->/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x27) [0x2b1874ed5e17]
> (-->/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x16c) [0x2b1874ed442c]
> (-->/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x143)
> [0x2b1874ed40b3]))) 0-: sending a RPC error reply
>
>  --
>  Thank You,
>
>  Charles Cooke
> Senior Cloud Software Engineer
> Coupa Software | www.coupa.com
> charles.cooke at coupa.com
> Cell: 902.452.4858
>
>  <http://www.coupa.com/solutions/new-in-coupa-9>
>
>
>
> _______________________________________________
> Gluster-users mailing listGluster-users at gluster.orghttp://supercolony.gluster.org/mailman/listinfo/gluster-users
>
>
>
> --
> Michael Brown               | `One of the main causes of the fall of
> Systems Consultant          | the Roman Empire was that, lacking zero,
> Net Direct Inc.             | they had no way to indicate successful
> ☎: +1 519 883 1172 x5106    | termination of their C programs.' - Firth
>
>


-- 
Thank You,

Charles Cooke
Senior Cloud Software Engineer
Coupa Software | www.coupa.com
charles.cooke at coupa.com
Cell: 902.452.4858

<http://www.coupa.com/solutions/new-in-coupa-9>
<http://www.coupa.com/solutions/new-in-coupa-9>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20130606/d9ed0a86/attachment.html>


More information about the Gluster-users mailing list