[Gluster-users] KVM lockups on Gluster 4.1.1

Dmitry Melekhov dm at belkam.com
Tue Aug 28 05:54:52 UTC 2018


Hello!


Yesterday we hit something like this on 4.1.2, running on CentOS 7.5.


The volume is replicated: two data bricks plus one arbiter brick.
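(For reference, this is the standard replica-3-with-arbiter layout, i.e. something like the following; the volume name and the two data hosts match the logs below, but the brick paths and the arbiter hostname here are illustrative.)

    gluster volume create pool replica 3 arbiter 1 \
        father:/bricks/pool/brick son:/bricks/pool/brick arb:/bricks/pool/brick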


We rebooted the arbiter, waited for the heal to finish, and then tried to 
live-migrate a VM to another node (we run the VMs on the Gluster nodes 
themselves); a sketch of the steps is below.
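(The commands are illustrative rather than our exact history; the VM name is inferred from the image file name, and the hostnames are the ones that appear in the logs.)

    # after the arbiter came back: wait until every brick reports
    # "Number of entries: 0" before touching the VMs
    gluster volume heal pool info

    # then live-migrate the guest between the two data-brick nodes
    virsh migrate --live balamak qemu+ssh://son/system

The brick log then showed: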


[2018-08-27 09:56:22.085411] I [MSGID: 115029] [server-handshake.c:763:server_setvolume] 0-pool-server: accepted client from CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-client-6-RECON_NO:-0 (version: 4.1.2)
[2018-08-27 09:56:22.107609] I [MSGID: 115036] [server.c:483:server_rpc_notify] 0-pool-server: disconnecting connection from CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-client-6-RECON_NO:-0
[2018-08-27 09:56:22.107747] I [MSGID: 101055] [client_t.c:444:gf_client_unref] 0-pool-server: Shutting down connection CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-client-6-RECON_NO:-0
[2018-08-27 09:58:37.905829] I [MSGID: 115036] [server.c:483:server_rpc_notify] 0-pool-server: disconnecting connection from CTX_ID:c3eb6cfc-2ef9-470a-89d1-a87170d00da5-GRAPH_ID:0-PID:30292-HOST:father-PC_NAME:pool-client-6-RECON_NO:-0
[2018-08-27 09:58:37.905926] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28c831d8bc550000}
[2018-08-27 09:58:37.905959] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2870a7d6bc550000}
[2018-08-27 09:58:37.905979] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2880a7d6bc550000}
[2018-08-27 09:58:37.905997] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28f031d8bc550000}
[2018-08-27 09:58:37.906016] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28b07dd5bc550000}
[2018-08-27 09:58:37.906034] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28e0a7d6bc550000}
[2018-08-27 09:58:37.906056] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28b845d8bc550000}
[2018-08-27 09:58:37.906079] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2858a7d8bc550000}
[2018-08-27 09:58:37.906098] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2868a8d7bc550000}
[2018-08-27 09:58:37.906121] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28f80bd7bc550000}
...

[2018-08-27 09:58:37.907375] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28a8cdd6bc550000}
[2018-08-27 09:58:37.907393] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2880cdd6bc550000}
[2018-08-27 09:58:37.907476] I [socket.c:3837:socket_submit_reply] 0-tcp.pool-server: not connected (priv->connected = -1)
[2018-08-27 09:58:37.907520] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88cb, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
[2018-08-27 09:58:37.910727] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.910814] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88ce, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
[2018-08-27 09:58:37.910861] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.910904] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88cf, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
[2018-08-27 09:58:37.910940] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.910979] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88d1, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
[2018-08-27 09:58:37.911012] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.911050] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88d8, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
[2018-08-27 09:58:37.911083] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.916217] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
[2018-08-27 09:58:37.916520] I [MSGID: 115013] [server-helpers.c:286:do_fd_cleanup] 0-pool-server: fd cleanup on /balamak.img


After this, all I/O on /balamak.img was blocked.


The only solution we found was to reboot all three nodes.
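If it happens again, would a statedump of the brick processes be the right way to capture the stale-lock state before rebooting? As far as we understand, something like:

    # writes the lock/inode state of each brick process to /var/run/gluster/
    gluster volume statedump pool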


Is there an existing bug report in Bugzilla to which we can attach our logs?

Is it possible to turn off these locks?
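Or, for example, would clear-locks have released them without a full reboot? Something like the sketch below; we have not tried this against a live VM image, so this is a question, not a recipe:

    # clear the granted inode locks on the stuck image file
    gluster volume clear-locks pool /balamak.img kind granted inode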

Thank you!





