[Gluster-users] NFS timeouts?

Yannick Perret yannick.perret at liris.cnrs.fr
Thu Dec 1 12:12:54 UTC 2016


Hello,
I have a client machine that mounts a replica 2 volume over NFS.
In practice this is configured with automount, using a map entry such as:
DIR-NAME -rw,soft,intr server1,server2:/VOLUME

The Gluster servers are running 3.6.7.
Sometimes NFS blocks on the client with
"server server2 not responding, timed out" (here it was connected to server2)
although network communication between the two machines is fine (they are
connected to the same switch, I can ssh to each, they ping each other…).

I can also see a few "xs_tcp_setup_socket: connect returned unhandled
error -107" messages on the client.
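
(For reference, the -107 in that message is a negated errno value. Below is
a minimal Python sketch for decoding it. It assumes it is run on a Linux
machine, where errno 107 is ENOTCONN; the helper name decode_kernel_errno
is purely illustrative.)

```python
import errno
import os


def decode_kernel_errno(value):
    """Return (symbolic name, description) for a negated errno such as -107."""
    code = abs(value)
    # errno.errorcode maps numeric codes to names for the platform running this script
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# Assumption: run on Linux, where 107 is ENOTCONN.
print(decode_kernel_errno(-107))
```
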
On the 'server2' side, I can see the following in the Gluster NFS log:

[2016-12-01 10:50:15.887927] W [rpcsvc.c:261:rpcsvc_program_actor] 0-rpc-service: RPC program version not available (req 100003 2)
[2016-12-01 10:50:15.887965] E [rpcsvc.c:544:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2016-12-01 10:50:15.901880] W [rpcsvc.c:261:rpcsvc_program_actor] 0-rpc-service: RPC program version not available (req 100003 4)
[2016-12-01 10:50:15.901900] E [rpcsvc.c:544:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2016-12-01 10:51:03.777145] W [rpcsvc.c:261:rpcsvc_program_actor] 0-rpc-service: RPC program version not available (req 100003 2)
[2016-12-01 10:51:03.777191] E [rpcsvc.c:544:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2016-12-01 10:51:03.790561] W [rpcsvc.c:261:rpcsvc_program_actor] 0-rpc-service: RPC program version not available (req 100003 4)
[2016-12-01 10:51:03.790580] E [rpcsvc.c:544:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully

at times that correspond to the NFS timeouts.
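
To make these warnings easier to read: as far as I understand, the two
numbers after "req" are the RPC program number and the requested version.
100003 is the registered program number for NFS, so the entries would
correspond to requests for NFS versions 2 and 4. A minimal Python sketch
for summarising such entries follows. The function name, the regular
expression, and the small lookup table are illustrative assumptions, not
part of Gluster.

```python
import re

# Well-known ONC RPC program number; 100003 is registered for NFS.
RPC_PROGRAMS = {100003: "nfs"}

# Assumed meaning of "(req A B)": A = RPC program number, B = requested version.
LOG_PATTERN = re.compile(
    r"RPC program version not available \(req (?P<prog>\d+) (?P<vers>\d+)\)"
)


def rejected_versions(log_text):
    """Return the sorted, de-duplicated (program, version) pairs that were rejected."""
    # Entries may be wrapped across several lines, so collapse whitespace first.
    flat = " ".join(log_text.split())
    found = set()
    for match in LOG_PATTERN.finditer(flat):
        prog = int(match.group("prog"))
        found.add((RPC_PROGRAMS.get(prog, str(prog)), int(match.group("vers"))))
    return sorted(found)


# Two of the warnings quoted above, wrapped as in the original mail.
sample = """
[2016-12-01 10:50:15.887927] W [rpcsvc.c:261:rpcsvc_program_actor]
0-rpc-service: RPC program version not available (req 100003 2)
[2016-12-01 10:50:15.901880] W [rpcsvc.c:261:rpcsvc_program_actor]
0-rpc-service: RPC program version not available (req 100003 4)
"""

print(rejected_versions(sample))  # [('nfs', 2), ('nfs', 4)]
```

Run against the full log, this would show whether anything other than
NFS versions 2 and 4 is being refused around the time of each timeout.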

This problem occurs often (at least once every day or two), and
neither the client nor the servers are under heavy load (memory and CPU
usage are far from saturated).

Any idea what the cause might be and how to prevent it?
I reduced the autofs timeout to limit the impact, but that is not a
very nice solution. Note: I can't use the native glusterfs client instead
of NFS because of the memory leaks that still exist in it.

Thanks.

Regards,
--
Y.
