[Bugs] [Bug 1717824] Fencing: Added the tcmu-runner ALUA feature support but after one of node is rebooted the glfs_file_lock() get stucked

bugzilla at redhat.com bugzilla at redhat.com
Thu Jul 25 10:39:23 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1717824



--- Comment #18 from Xiubo Li <xiubli at redhat.com> ---
(In reply to Susant Kumar Palai from comment #17)
> On the permission denied: 
> 
> I did not see any error related to EPERM but saw EBUSY in the brick logs.
> 
> 
> [2019-07-24 08:15:22.236283] E [MSGID: 101191]
> [event-epoll.c:765:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
> handler
> [2019-07-24 08:15:46.083306] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 29: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.088292] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 31: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.119463] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 33: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.124067] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 35: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.294554] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 37: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.298672] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 39: READV 0
> (7db899f8-bf56-4b
> 
> 
> Is it possible that the lower layer is converting the errnos to EPERM? Can
> you check gfapi logs and tcmu logs for corresponding error messages and
> confirm?

If so, maybe gfapi is doing this. I will send you the gfapi logs. The EPERM
value comes directly from gfapi, and tcmu-runner does nothing with it.

I checked the gfapi log, and it is also full of errors:

[2019-07-24 08:23:41.042339] W [MSGID: 114031]
[client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote
operation failed [Device or resource busy]
[2019-07-24 08:23:41.042381] W [MSGID: 114031]
[client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-0: remote
operation failed [Device or resource busy]
[2019-07-24 08:23:41.042556] W [MSGID: 114031]
[client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote
operation failed [Device or resource busy]
[2019-07-24 08:23:41.042574] W [MSGID: 114031]
[client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-0: remote
operation failed [Device or resource busy]
[2019-07-24 08:23:41.042655] W [MSGID: 114031]
[client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote
operation failed [Device or resource busy]
[2019-07-24 08:23:41.042671] W [MSGID: 114031]
[client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-0: remote
operation failed [Device or resource busy]
[2019-07-24 08:23:41.042709] W [MSGID: 114031]
[client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote
operation failed [Device or resource busy]
[2019-07-24 08:23:41.042722] W [MSGID: 114031]
[client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-0: remote
operation failed [Device or resource busy]
[2019-07-24 08:23:41.042784] W [MSGID: 114031]
[client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote
operation failed [Device or resource busy]

I checked the source code of client4_0_writev_cbk() (client-rpc-fops_v2.c),
which logs the warning above:

 677 out:
 678     if (rsp.op_ret == -1) {
 679         gf_msg(this->name, GF_LOG_WARNING, gf_error_to_errno(rsp.op_errno),
 680                PC_MSG_REMOTE_OP_FAILED, "remote operation failed");
 681     } else if (rsp.op_ret >= 0) {
 682         if (local->attempt_reopen)
 683             client_attempt_reopen(local->fd, this);
 684     }
 685     CLIENT_STACK_UNWIND(writev, frame, rsp.op_ret,
 686                         gf_error_to_errno(rsp.op_errno), &prestat, &poststat,
 687                         xdata);
 688
 689     if (xdata)
 690         dict_unref(xdata);


It seems the returned errno is converted here, by gf_error_to_errno().

Thanks,
BRs


