[Gluster-users] Lose gnfs connection during test

Xie Changlong xiechanglong.d at gmail.com
Sun Jul 30 12:10:49 UTC 2017


On 7/30/2017 7:53 PM, Xie Changlong wrote:
> Hi all
> 
> I am running a performance test on gluster 3.8.4 with a 
> Distributed-Replicate (12 x 2 = 24) hot tier plus a 
> Distributed-Replicate (36 x (6 + 2) = 288) cold tier. When I set 
> client/server.event-threads to a small value, e.g. 2, it works fine. 
> But if I set client/server.event-threads to a large value, e.g. 32, 
> the network connections always become unavailable during the test, 
> with the following error messages on the stress machine.
> 
> 10712 18:53:28.873 /var/log/messages: Jul 30 18:53:02 localhost_10 
> kernel: nfs: server 10.147.4.99 not responding, still trying
> 10713 18:53:28.873 /var/log/messages: Jul 30 18:53:02 localhost_10 
> kernel: nfs: server 10.147.4.99 not responding, still trying
> 10714 18:53:28.873 /var/log/messages: Jul 30 18:53:02 localhost_10 
> kernel: nfs: server 10.147.4.99 not responding, still trying
> 10715 18:53:28.873 /var/log/messages: Jul 30 18:53:02 localhost_10 
> kernel: nfs: server 10.147.4.99 not responding, still trying
> 10716 18:53:28.873 /var/log/messages: Jul 30 18:53:02 localhost_10 
> kernel: nfs: server 10.147.4.99 not responding, still trying
> 10717 18:53:28.873 /var/log/messages: Jul 30 18:53:02 localhost_10 
> kernel: nfs: server 10.147.4.99 not responding, still trying
> 10718 18:53:28.873 /var/log/messages: Jul 30 18:53:02 localhost_10 
> kernel: nfs: server 10.147.4.99 not responding, still trying
> 10719 18:53:28.873 /var/log/messages: Jul 30 18:53:02 localhost_10 
> kernel: nfs: server 10.147.4.99 not responding, still trying
> 10720 18:53:28.873 /var/log/messages: Jul 30 18:53:02 localhost_10 
> kernel: nfs: server 10.147.4.99 not responding, still trying
> 10721 18:53:28.873 /var/log/messages: Jul 30 18:53:02 localhost_10 
> kernel: nfs: server 10.147.4.99 not responding, still trying
> 
> 
> Here is the error message in nfs.log for gluster:

Adding the logs that were missing from the previous mail:

[2017-07-30 18:57:22.734896] I [MSGID: 114047] 
[client-handshake.c:1226:client_setvolume_cbk] 0-ectest_vol-client-289: 
Server and Client lk-version numbers are not same, reopening the fds
[2017-07-30 18:57:22.735781] I [MSGID: 114035] 
[client-handshake.c:201:client_set_lk_version_cbk] 
0-ectest_vol-client-289: Server lk version = 1
[2017-07-30 18:57:22.752957] I [MSGID: 108031] 
[afr-common.c:2255:afr_local_discovery_cbk] 0-ectest_vol-replicate-3: 
selecting local read_child ectest_vol-client-304
[2017-07-30 18:57:22.753083] I [MSGID: 108031] 
[afr-common.c:2255:afr_local_discovery_cbk] 0-ectest_vol-replicate-7: 
selecting local read_child ectest_vol-client-296
[2017-07-30 18:57:22.753328] I [MSGID: 108031] 
[afr-common.c:2255:afr_local_discovery_cbk] 0-ectest_vol-replicate-11: 
selecting local read_child ectest_vol-client-288
[2017-07-30 19:00:26.683584] W [rpcsvc.c:273:rpcsvc_program_actor] 
0-rpc-service: RPC program version not available (req 100003 4) for 
10.147.4.62:874
[2017-07-30 19:00:26.683636] E 
[rpcsvc.c:557:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed 
to complete successfully
[2017-07-30 19:00:38.398918] W [rpcsvc.c:273:rpcsvc_program_actor] 
0-rpc-service: RPC program version not available (req 100003 4) for 
10.147.4.70:717
[2017-07-30 19:00:38.398965] E 
[rpcsvc.c:557:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed 
to complete successfully
[2017-07-30 19:00:51.116479] W [rpcsvc.c:273:rpcsvc_program_actor] 
0-rpc-service: RPC program version not available (req 100003 4) for 
10.147.4.78:710
[2017-07-30 19:00:51.116515] E 
[rpcsvc.c:557:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed 
to complete successfully
[2017-07-30 19:26:12.270581] I [glusterfsd-mgmt.c:54:mgmt_cbk_spec] 
0-mgmt: Volume file changed
[2017-07-30 19:26:12.827112] I [glusterfsd-mgmt.c:54:mgmt_cbk_spec] 
0-mgmt: Volume file changed
[2017-07-30 19:26:12.925761] I [glusterfsd-mgmt.c:54:mgmt_cbk_spec] 
0-mgmt: Volume file changed
[2017-07-30 19:26:13.321652] I [glusterfsd-mgmt.c:54:mgmt_cbk_spec] 
0-mgmt: Volume file changed
[2017-07-30 19:26:18.440268] I [MSGID: 109086] 
[dht-shared.c:297:dht_parse_decommissioned_bricks] 
0-ectest_vol-tier-dht: decommissioning subvolume ectest_vol-hot-dht
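
The rpcsvc warnings above can be tallied per client with a short script. This is only a sketch: the regular expression is inferred from the warning lines shown in this mail, not from any documented log format, and the function name parse_rpc_version_errors is made up for illustration. Program number 100003 is the ONC RPC number assigned to NFS, so "req 100003 4" suggests those clients asked for NFS version 4. gNFS serves NFSv3, which would account for the warning, though it may well be unrelated to the hang itself.

```python
import re
from collections import Counter

# Pattern inferred from the warning lines in this mail (an assumption,
# not an official format): "(req <program> <version>) for <ip>:<port>".
RPC_WARN = re.compile(
    r"RPC program version not available \(req (\d+) (\d+)\) for "
    r"(\d+\.\d+\.\d+\.\d+):(\d+)"
)

def parse_rpc_version_errors(log_text):
    """Return (program, version, client_ip) for each matching warning."""
    # The mail client wrapped long log lines, so collapse all whitespace
    # (including newlines) before matching.
    flat = " ".join(log_text.split())
    return [(int(p), int(v), ip) for p, v, ip, _port in RPC_WARN.findall(flat)]

sample = """
[2017-07-30 19:00:26.683584] W [rpcsvc.c:273:rpcsvc_program_actor]
0-rpc-service: RPC program version not available (req 100003 4) for
10.147.4.62:874
[2017-07-30 19:00:38.398918] W [rpcsvc.c:273:rpcsvc_program_actor]
0-rpc-service: RPC program version not available (req 100003 4) for
10.147.4.70:717
"""

errors = parse_rpc_version_errors(sample)
by_request = Counter((prog, ver) for prog, ver, _ip in errors)
print(errors)
print(by_request)
```

Grouping by (program, version) rather than by client makes it easy to see whether every rejected request is the same NFSv4 probe or whether something else is being refused as well.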

> 
> [2017-07-30 19:26:18.440498] I [rpc-drc.c:689:rpcsvc_drc_init] 
> 0-rpc-service: DRC is turned OFF
> [2017-07-30 19:26:18.450180] I [glusterfsd-mgmt.c:1620:mgmt_getspec_cbk] 
> 0-glusterfs: No change in volfile, continuing
> [2017-07-30 19:26:18.493551] I [glusterfsd-mgmt.c:1620:mgmt_getspec_cbk] 
> 0-glusterfs: No change in volfile, continuing
> [2017-07-30 19:26:18.545959] I [glusterfsd-mgmt.c:1620:mgmt_getspec_cbk] 
> 0-glusterfs: No change in volfile, continuing
> [2017-07-30 19:42:29.704707] I [glusterfsd-mgmt.c:54:mgmt_cbk_spec] 
> 0-mgmt: Volume file changed
> [2017-07-30 19:42:30.072282] I [glusterfsd-mgmt.c:54:mgmt_cbk_spec] 
> 0-mgmt: Volume file changed
> [2017-07-30 19:42:30.269784] I [glusterfsd-mgmt.c:1620:mgmt_getspec_cbk] 
> 0-glusterfs: No change in volfile, continuing
> [2017-07-30 19:42:30.315577] I [glusterfsd-mgmt.c:1620:mgmt_getspec_cbk] 
> 0-glusterfs: No change in volfile, continuing
> [2017-07-30 19:42:48.473789] I [glusterfsd-mgmt.c:54:mgmt_cbk_spec] 
> 0-mgmt: Volume file changed
> [2017-07-30 19:42:49.035964] I [glusterfsd-mgmt.c:54:mgmt_cbk_spec] 
> 0-mgmt: Volume file changed
> [2017-07-30 19:42:49.128629] I [glusterfsd-mgmt.c:54:mgmt_cbk_spec] 
> 0-mgmt: Volume file changed
> [2017-07-30 19:42:49.510522] I [glusterfsd-mgmt.c:54:mgmt_cbk_spec] 
> 0-mgmt: Volume file changed
> [2017-07-30 19:42:49.598007] I [MSGID: 109086] 
> [dht-shared.c:297:dht_parse_decommissioned_bricks] 
> 0-ectest_vol-tier-dht: decommissioning subvolume ectest_vol-hot-dht
> [2017-07-30 19:42:49.598189] I [rpc-drc.c:689:rpcsvc_drc_init] 
> 0-rpc-service: DRC is turned OFF
> [2017-07-30 19:42:49.598601] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 20
> [2017-07-30 19:42:49.610119] I [glusterfsd-mgmt.c:1620:mgmt_getspec_cbk] 
> 0-glusterfs: No change in volfile, continuing
> [2017-07-30 19:42:49.610143] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 32
> [2017-07-30 19:42:49.649243] I [glusterfsd-mgmt.c:1620:mgmt_getspec_cbk] 
> 0-glusterfs: No change in volfile, continuing
> [2017-07-30 19:42:49.649274] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 18
> [2017-07-30 19:42:49.690013] I [glusterfsd-mgmt.c:1620:mgmt_getspec_cbk] 
> 0-glusterfs: No change in volfile, continuing
> [2017-07-30 19:42:56.539540] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 31
> [2017-07-30 19:42:56.539598] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 16
> [2017-07-30 19:42:56.539616] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 19
> [2017-07-30 19:42:56.539669] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 24
> [2017-07-30 19:42:56.539707] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 17
> [2017-07-30 19:42:56.539770] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 21
> [2017-07-30 19:43:03.358143] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 29
> [2017-07-30 19:43:03.358387] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 27
> [2017-07-30 19:43:03.359097] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 13
> [2017-07-30 19:43:03.359166] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 30
> [2017-07-30 19:43:03.359295] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 10
> [2017-07-30 19:43:03.359370] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 22
> [2017-07-30 19:43:03.359382] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 25
> [2017-07-30 19:43:03.359424] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 28
> [2017-07-30 19:43:03.359373] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 11
> [2017-07-30 19:43:03.359454] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 9
> [2017-07-30 19:43:03.359424] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 26
> [2017-07-30 19:43:03.359391] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 14
> [2017-07-30 19:43:03.359486] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 23
> [2017-07-30 19:43:03.359647] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 7
> [2017-07-30 19:43:03.360095] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 8
> [2017-07-30 19:43:03.360364] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 15
> [2017-07-30 19:43:03.360395] I [MSGID: 101191] 
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread 
> with index 12
> 
> 
> Any idea? Thanks in advance.
> 
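
To see at a glance which epoll workers exited, the MSGID 101191 lines can be reduced to a sorted list of indices. The following is a minimal sketch, assuming the log text is pasted from this mail (so it may carry "> " quote markers and wrapped lines); the function name exited_thread_indices is hypothetical. In the excerpt quoted above the exited indices run from 7 to 32, but what that means for the lost connections is not established here.

```python
import re

# Pattern inferred from the MSGID 101191 lines quoted above (an assumption).
EXITED = re.compile(r"Exited thread with index (\d+)")

def exited_thread_indices(log_text):
    """Return the sorted epoll worker indices reported as exited."""
    # Drop mail quote markers ("> ") and re-join wrapped lines first.
    lines = (re.sub(r"^\s*>\s?", "", line) for line in log_text.splitlines())
    flat = " ".join(" ".join(lines).split())
    return sorted(int(n) for n in EXITED.findall(flat))

sample = """
> [2017-07-30 19:42:49.598601] I [MSGID: 101191]
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread
> with index 20
> [2017-07-30 19:42:49.610143] I [MSGID: 101191]
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread
> with index 32
"""

indices = exited_thread_indices(sample)
print(indices)
```

Comparing the resulting list with the configured event-threads value is one way to check whether the exits line up with a thread-count change or happen on their own.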

-- 
Thanks
     -Xie

