[Bugs] [Bug 1740494] New: Fencing: Added the tcmu-runner ALUA feature support, but after one of the nodes is rebooted glfs_file_lock() gets stuck
bugzilla at redhat.com
Tue Aug 13 05:59:47 UTC 2019
https://bugzilla.redhat.com/show_bug.cgi?id=1740494
Bug ID: 1740494
Summary: Fencing: Added the tcmu-runner ALUA feature support,
but after one of the nodes is rebooted glfs_file_lock()
gets stuck
Product: GlusterFS
Version: 6
Status: NEW
Component: locks
Assignee: spalai at redhat.com
Reporter: spalai at redhat.com
CC: atumball at redhat.com, bugs at gluster.org,
prasanna.kalever at redhat.com, spalai at redhat.com,
xiubli at redhat.com
Depends On: 1717824
Target Milestone: ---
Classification: Community
+++ This bug was initially created as a clone of Bug #1717824 +++
Description of problem:
In GlusterFS we have added support for the fencing feature. With this we can
now support the ALUA feature in LIO/TCMU.
The fencing doc:
https://review.gluster.org/#/c/glusterfs-specs/+/21925/6/accepted/fencing.md
The fencing test example:
https://review.gluster.org/#/c/glusterfs/+/21496/12/tests/basic/fencing/fence-basic.c
The LIO/tcmu-runner PR supporting ALUA is:
https://github.com/open-iscsi/tcmu-runner/pull/554.
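For readers unfamiliar with the gfapi side of this, here is a minimal,
hypothetical sketch of how a client takes the fencing (mandatory) lock,
modeled on the fence-basic.c test linked above. The host, volume and file
names are placeholders, and the xattr name should be verified against glfs.h.
This glfs_file_lock() call is the one that hangs in this report.

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <glusterfs/api/glfs.h>

/* xattr name assumed from the fencing spec; verify against your glfs.h */
#ifndef GF_ENFORCE_MANDATORY_LOCK
#define GF_ENFORCE_MANDATORY_LOCK "trusted.glusterfs.enforce-mandatory-lock"
#endif

int main(void)
{
    glfs_t *fs = glfs_new("repvol");                  /* placeholder volume */
    if (!fs)
        return 1;
    glfs_set_volfile_server(fs, "tcp", "rhel1", 24007);
    if (glfs_init(fs) < 0)
        return 1;

    glfs_fd_t *fd = glfs_open(fs, "/block0", O_RDWR); /* placeholder file */
    if (!fd)
        return 1;

    /* mark the fd so the locks xlator enforces (and can preempt) locks */
    if (glfs_fsetxattr(fd, GF_ENFORCE_MANDATORY_LOCK, "set", 8, 0) < 0)
        return 1;

    struct flock lock = { 0 };
    lock.l_type   = F_WRLCK;
    lock.l_whence = SEEK_SET;   /* l_start/l_len of 0 covers the whole file */

    /* F_SETLKW blocks until the lock is granted -- this is the call that
     * never returns after one of the HA nodes is rebooted */
    if (glfs_file_lock(fd, F_SETLKW, &lock, GLFS_LK_MANDATORY) < 0)
        fprintf(stderr, "glfs_file_lock: %s\n", strerror(errno));

    glfs_close(fd);
    glfs_fini(fs);
    return 0;
}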
But currently, when testing this with the above tcmu-runner PR by shutting
down one of the HA nodes and starting it again after 2~3 minutes, we can see
on all the HA nodes that glfs_file_lock() gets stuck. The following is from
/var/log/tcmu-runner.log:
====
2019-06-06 13:50:15.755 1316 [DEBUG] tcmu_acquire_dev_lock:388 glfs/block3:
lock call state 2 retries 0. tag 65535 reopen 0
2019-06-06 13:50:15.757 1316 [DEBUG] tcmu_acquire_dev_lock:440 glfs/block3:
lock call done. lock state 1
2019-06-06 13:50:55.845 1316 [DEBUG] tcmu_acquire_dev_lock:388 glfs/block4:
lock call state 2 retries 0. tag 65535 reopen 0
2019-06-06 13:50:55.847 1316 [DEBUG] tcmu_acquire_dev_lock:440 glfs/block4:
lock call done. lock state 1
2019-06-06 13:57:50.102 1315 [DEBUG] tcmu_acquire_dev_lock:388 glfs/block3:
lock call state 2 retries 0. tag 65535 reopen 0
2019-06-06 13:57:50.103 1315 [DEBUG] tcmu_acquire_dev_lock:440 glfs/block3:
lock call done. lock state 1
2019-06-06 13:57:50.121 1315 [DEBUG] tcmu_acquire_dev_lock:388 glfs/block4:
lock call state 2 retries 0. tag 65535 reopen 0
2019-06-06 13:57:50.132 1315 [DEBUG] tcmu_acquire_dev_lock:440 glfs/block4:
lock call done. lock state 1
2019-06-06 14:09:03.654 1328 [DEBUG] tcmu_acquire_dev_lock:388 glfs/block3:
lock call state 2 retries 0. tag 65535 reopen 0
2019-06-06 14:09:03.662 1328 [DEBUG] tcmu_acquire_dev_lock:440 glfs/block3:
lock call done. lock state 1
2019-06-06 14:09:06.700 1328 [DEBUG] tcmu_acquire_dev_lock:388 glfs/block4:
lock call state 2 retries 0. tag 65535 reopen 0
====
The lock operation never returns.
I am using the following glusterfs packages, built by myself:
# rpm -qa|grep glusterfs
glusterfs-extra-xlators-7dev-0.0.el7.x86_64
glusterfs-api-devel-7dev-0.0.el7.x86_64
glusterfs-7dev-0.0.el7.x86_64
glusterfs-server-7dev-0.0.el7.x86_64
glusterfs-cloudsync-plugins-7dev-0.0.el7.x86_64
glusterfs-resource-agents-7dev-0.0.el7.noarch
glusterfs-api-7dev-0.0.el7.x86_64
glusterfs-devel-7dev-0.0.el7.x86_64
glusterfs-regression-tests-7dev-0.0.el7.x86_64
glusterfs-gnfs-7dev-0.0.el7.x86_64
glusterfs-client-xlators-7dev-0.0.el7.x86_64
glusterfs-geo-replication-7dev-0.0.el7.x86_64
glusterfs-debuginfo-7dev-0.0.el7.x86_64
glusterfs-fuse-7dev-0.0.el7.x86_64
glusterfs-events-7dev-0.0.el7.x86_64
glusterfs-libs-7dev-0.0.el7.x86_64
glusterfs-cli-7dev-0.0.el7.x86_64
glusterfs-rdma-7dev-0.0.el7.x86_64
How reproducible:
30%.
Steps to Reproduce:
1. create one replica volume (HA >= 2) with mandatory locking enabled
2. create one gluster-blockd target
3. log in and run fio on the client node
4. shut down one of the HA nodes, wait 2~3 minutes, and then start it again
Actual results:
fio never recovers and the read/write bandwidth stays at 0 KB/s, and we can
see tons of logs like the following in the /var/log/tcmu-runner.log file:
2019-06-06 15:01:06.641 1328 [DEBUG] alua_implicit_transition:561 glfs/block4:
Lock acquisition operation is already in process.
2019-06-06 15:01:06.648 1328 [DEBUG_SCSI_CMD] tcmu_cdb_print_info:353
glfs/block4: 28 0 0 3 1f 80 0 0 8 0
2019-06-06 15:01:06.648 1328 [DEBUG] alua_implicit_transition:561 glfs/block4:
Lock acquisition operation is already in process.
2019-06-06 15:01:06.655 1328 [DEBUG_SCSI_CMD] tcmu_cdb_print_info:353
glfs/block4: 28 0 0 3 1f 80 0 0 8 0
2019-06-06 15:01:06.655 1328 [DEBUG] alua_implicit_transition:561 glfs/block4:
Lock acquisition operation is already in process.
2019-06-06 15:01:06.661 1328 [DEBUG_SCSI_CMD] tcmu_cdb_print_info:353
glfs/block4: 28 0 0 3 1f 80 0 0 8 0
2019-06-06 15:01:06.662 1328 [DEBUG] alua_implicit_transition:561 glfs/block4:
Lock acquisition operation is already in process.
Expected results:
fio should recover, even before the shut-down node is back up.
--- Additional comment from Xiubo Li on 2019-06-06 14:39:50 MVT ---
--- Additional comment from Xiubo Li on 2019-06-06 14:40:16 MVT ---
--- Additional comment from Xiubo Li on 2019-06-06 14:42:10 MVT ---
The bt output from gdb:
[root@rhel1 ~]# gdb -p 1325
(gdb) bt
#0 0x00007fc7761baf47 in pthread_join () from /lib64/libpthread.so.0
#1 0x00007fc7773de468 in event_dispatch_epoll (event_pool=0x559f03d4b560) at
event-epoll.c:847
#2 0x0000559f02419658 in main (argc=21, argv=0x7fff9c6722c8) at
glusterfsd.c:2871
(gdb)
[root@rhel3 ~]# gdb -p 7669
(gdb) bt
#0 0x00007fac80bd9f47 in pthread_join () from /usr/lib64/libpthread.so.0
#1 0x00007fac81dfd468 in event_dispatch_epoll (event_pool=0x55de6f845560) at
event-epoll.c:847
#2 0x000055de6f143658 in main (argc=21, argv=0x7ffcafc3eff8) at
glusterfsd.c:2871
(gdb)
The pl_inode->fop_wind_count is:
(gdb) thread 2
[Switching to thread 2 (Thread 0x7fc742184700 (LWP 1829))]
#0 0x00007fc7761bd965 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
(gdb) frame 2
#2 0x00007fc76379c13b in pl_lk (frame=frame@entry=0x7fc750001128,
this=this@entry=0x7fc75c0128f0, fd=fd@entry=0x7fc73c0977d8,
cmd=cmd@entry=6, flock=flock@entry=0x7fc73c076938,
xdata=xdata@entry=0x7fc73c071828) at posix.c:2637
2637 ret = pl_lock_preempt(pl_inode, reqlock);
(gdb) p pl_inode->fop_wind_count
$1 = -30
(gdb)
For the pstack logs, please see the attachments.
Thanks.
BRs
--- Additional comment from Susant Kumar Palai on 2019-06-10 12:10:33 MVT ---
Just a small update: there are cases where fop_wind_count can go negative. A
basic fix would be to never decrement it once it is already zero. I will post
more on this later, as I am busy with a few other issues at the moment.
Susant
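To make the idea concrete, here is a hypothetical sketch of the kind of guard
described above; it is not the actual fix (that is the patch posted later in
this bug, https://review.gluster.org/23088). Apart from fop_wind_count and
pl_lock_preempt(), which appear in the gdb session above, the struct and
helper names here are made up.

#include <pthread.h>

/* minimal stand-in for the locks xlator's pl_inode_t: only the pieces
 * relevant to the drain counter, with invented names except fop_wind_count */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  drained;          /* waited on by the preempting lock */
    int             fop_wind_count;   /* seen as -30 in the gdb session above */
} pl_inode_t;

static void
fop_unwind_count_dec(pl_inode_t *pl_inode)
{
    pthread_mutex_lock(&pl_inode->mutex);
    {
        /* never drive the counter negative: a stray extra unwind would
         * otherwise leave it below zero forever, and pl_lock_preempt()
         * would keep waiting for it to drain back to 0 */
        if (pl_inode->fop_wind_count > 0)
            pl_inode->fop_wind_count--;

        if (pl_inode->fop_wind_count == 0)
            pthread_cond_broadcast(&pl_inode->drained);
    }
    pthread_mutex_unlock(&pl_inode->mutex);
}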
--- Additional comment from Xiubo Li on 2019-07-17 13:15:51 MVT ---
Hi Susant,
Is there any new update about this ?
Thanks.
--- Additional comment from Susant Kumar Palai on 2019-07-17 13:20:56 MVT ---
(In reply to Xiubo Li from comment #5)
> Hi Susant,
>
> Is there any new update about this ?
>
> Thanks.
Hey Xiubo, most likely will be sending a patch by end of day today.
--- Additional comment from Xiubo Li on 2019-07-17 13:22:28 MVT ---
(In reply to Susant Kumar Palai from comment #6)
> (In reply to Xiubo Li from comment #5)
> > Hi Susant,
> >
> > Is there any new update about this ?
> >
> > Thanks.
>
> Hey Xiubo, most likely will be sending a patch by end of day today.
Sure and take your time Susant please :-)
Thanks very much.
BRs
--- Additional comment from Susant Kumar Palai on 2019-07-17 14:13:59 MVT ---
Moved to POST by mistake. Resetting.
--- Additional comment from Worker Ant on 2019-07-22 14:59:10 MVT ---
REVIEW: https://review.gluster.org/23088 (locks/fencing: Address while lock
preemption) posted (#1) for review on master by Susant Palai
--- Additional comment from Susant Kumar Palai on 2019-07-22 15:02:07 MVT ---
Xiubo, if you could test it out, it would be great. (Make sure you enable
fencing before you create any client)
--- Additional comment from Xiubo Li on 2019-07-22 15:42:38 MVT ---
(In reply to Susant Kumar Palai from comment #10)
> Xiubo, if you could test it out, it would be great. (Make sure you enable
> fencing before you create any client)
@Susant,
Yeah, thanks very much for your work on this. And I will test it late today or
tomorrow.
BRs
Xiubo
--- Additional comment from Xiubo Li on 2019-07-23 07:44:00 MVT ---
@Susant,
Two issues are found:
1. From my test, glfs_file_lock() sometimes takes around 43 seconds; is that
normal? And why?
90456 2019-07-23 10:08:57.444 31411 [INFO] tcmu_glfs_lock:901 glfs/block0:
lxb--------------glfs_file_lock start
...
319959 2019-07-23 10:09:40.183 31411 [INFO] tcmu_glfs_lock:905 glfs/block0:
lxb--------------glfs_file_lock end
2. After the lock is broken, all the FIO callbacks return -1 with -EPERM, not
the -EBUSY we discussed before. Has the return value changed? I am only
checking for -EBUSY and -ENOTCONN, and only then is the lock state on the
local tcmu node changed. Otherwise the local state in the tcmu-runner service
stays LOCKED even though it has actually already lost the lock and should be
UNLOCKED, so all the IOs fail (see the sketch after this comment).
Thanks,
BRs
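To illustrate item 2 above, a hypothetical fragment (not tcmu-runner code; the
names and state enum are invented) of the state handling being described: only
-EBUSY and -ENOTCONN flip the local lock state, so an unexpected -EPERM leaves
the node believing it still holds the lock while the cluster-side lock is
already gone, and every subsequent IO fails.

#include <errno.h>

enum local_lock_state { UNLOCKED, LOCKED };

/* called from the IO completion path with the errno of a failed IO */
static enum local_lock_state
update_lock_state(enum local_lock_state cur, int err)
{
    switch (err) {
    case EBUSY:    /* lock preempted by another node: expected */
    case ENOTCONN: /* brick/connection gone */
        return UNLOCKED;
    default:
        /* e.g. the surprising EPERM: the state is left as-is, which is
         * exactly the LOCKED-but-lock-lost mismatch described above */
        return cur;
    }
}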
--- Additional comment from Susant Kumar Palai on 2019-07-23 10:45:01 MVT ---
(In reply to Xiubo Li from comment #12)
> @Susant,
>
> There 2 issues are found:
>
>
> 1, From my test the glfs_file_lock() sometimes it will takes around 43
> seconds, is it normal ? And why ?
>
> 90456 2019-07-23 10:08:57.444 31411 [INFO] tcmu_glfs_lock:901 glfs/block0:
> lxb--------------glfs_file_lock start
> ...
> 319959 2019-07-23 10:09:40.183 31411 [INFO] tcmu_glfs_lock:905 glfs/block0:
> lxb--------------glfs_file_lock end
I wonder if it is related to draining of fops. Let me do some testing around
this.
>
> 2, After the lock is broke and all the FIOs callback will return -1, the
> -EPERM, not the -EBUSY as we discussed before. Is there any change about the
> return value ? I am only checking the -EBUSY and -ENOTCONN then only after
> that the lock state in local tcmu node will be changed. Or the local state
> in tcmu-runner service will always in LOCKED state, but it actually already
> lost the lock and should be in UNLOCKED state, so all the IOs will fail.
This is interesting. Will get back after some code checking.
>
>
>
> Thanks,
> BRs
--- Additional comment from Xiubo Li on 2019-07-23 10:47:15 MVT ---
(In reply to Susant Kumar Palai from comment #13)
> (In reply to Xiubo Li from comment #12)
> > @Susant,
> >
> > There 2 issues are found:
> >
> >
> > 1, From my test the glfs_file_lock() sometimes it will takes around 43
> > seconds, is it normal ? And why ?
> >
> > 90456 2019-07-23 10:08:57.444 31411 [INFO] tcmu_glfs_lock:901 glfs/block0:
> > lxb--------------glfs_file_lock start
> > ...
> > 319959 2019-07-23 10:09:40.183 31411 [INFO] tcmu_glfs_lock:905 glfs/block0:
> > lxb--------------glfs_file_lock end
>
> I wonder if it is related to draining of fops. Let me do some testing around
> this.
>
Sure.
>
>
> >
> > 2, After the lock is broke and all the FIOs callback will return -1, the
> > -EPERM, not the -EBUSY as we discussed before. Is there any change about the
> > return value ? I am only checking the -EBUSY and -ENOTCONN then only after
> > that the lock state in local tcmu node will be changed. Or the local state
> > in tcmu-runner service will always in LOCKED state, but it actually already
> > lost the lock and should be in UNLOCKED state, so all the IOs will fail.
>
> This is interesting. Will get back after some code checking.
>
Please take your time @Susant.
Thanks,
BRs
--- Additional comment from Susant Kumar Palai on 2019-07-23 12:44:22 MVT ---
(In reply to Xiubo Li from comment #12)
> @Susant,
>
> There 2 issues are found:
>
>
> 1, From my test the glfs_file_lock() sometimes it will takes around 43
> seconds, is it normal ? And why ?
>
> 90456 2019-07-23 10:08:57.444 31411 [INFO] tcmu_glfs_lock:901 glfs/block0:
> lxb--------------glfs_file_lock start
> ...
> 319959 2019-07-23 10:09:40.183 31411 [INFO] tcmu_glfs_lock:905 glfs/block0:
> lxb--------------glfs_file_lock end
Checked the time taken for file_lock and it completes immediately for me.
ret = glfs_fsetxattr(fd1, GF_ENFORCE_MANDATORY_LOCK, "set", 8, 0);
if (ret < 0) {
    LOG_ERR("glfs_fsetxattr", errno);
    ret = -1;
    goto out;
}

time(&before);

/* take a write mandatory lock */
ret = glfs_file_lock(fd1, F_SETLKW, &lock, GLFS_LK_MANDATORY);
if (ret) {
    LOG_ERR("glfs_file_lock", errno);
    goto out;
}

time(&after);
diff = (unsigned long)after - before;
fprintf(fp, "time %lu %lu %lu\n", diff, before, after);
time 0 1563867824 1563867824
Can you attach the brick log here when you run the test next time?
>
> 2, After the lock is broke and all the FIOs callback will return -1, the
> -EPERM, not the -EBUSY as we discussed before. Is there any change about the
> return value ? I am only checking the -EBUSY and -ENOTCONN then only after
> that the lock state in local tcmu node will be changed. Or the local state
> in tcmu-runner service will always in LOCKED state, but it actually already
> lost the lock and should be in UNLOCKED state, so all the IOs will fail.
Please attach the brick log after enabling trace logging
(gluster v set <vol> brick-log-level TRACE).
>
>
>
> Thanks,
> BRs
--- Additional comment from Xiubo Li on 2019-07-23 12:48:58 MVT ---
(In reply to Susant Kumar Palai from comment #15)
> (In reply to Xiubo Li from comment #12)
> > @Susant,
> >
> > There 2 issues are found:
> >
> >
> > 1, From my test the glfs_file_lock() sometimes it will takes around 43
> > seconds, is it normal ? And why ?
> >
> > 90456 2019-07-23 10:08:57.444 31411 [INFO] tcmu_glfs_lock:901 glfs/block0:
> > lxb--------------glfs_file_lock start
> > ...
> > 319959 2019-07-23 10:09:40.183 31411 [INFO] tcmu_glfs_lock:905 glfs/block0:
> > lxb--------------glfs_file_lock end
>
> Checked the time taken for file_lock and it completes immediately for me.
>
> ret = glfs_fsetxattr(fd1, GF_ENFORCE_MANDATORY_LOCK, "set", 8, 0);
> if (ret < 0) {
> LOG_ERR("glfs_fsetxattr", errno);
> ret = -1;
> goto out;
> }
>
> time(&before);
> /* take a write mandatory lock */
> ret = glfs_file_lock(fd1, F_SETLKW, &lock, GLFS_LK_MANDATORY);
> if (ret) {
> LOG_ERR("glfs_file_lock", errno);
> goto out;
> }
> time(&after);
> diff = (unsigned long )after - before;
> fprintf(fp, "time %lu %lu %lu\n", diff, before, after);
>
> time 0 1563867824 1563867824
>
> Can you attach the brick log here when you run the test next time?
>
> >
> > 2, After the lock is broke and all the FIOs callback will return -1, the
> > -EPERM, not the -EBUSY as we discussed before. Is there any change about the
> > return value ? I am only checking the -EBUSY and -ENOTCONN then only after
> > that the lock state in local tcmu node will be changed. Or the local state
> > in tcmu-runner service will always in LOCKED state, but it actually already
> > lost the lock and should be in UNLOCKED state, so all the IOs will fail.
>
> Please attach the brick log after enabling trace logging. <gluster v set
> <vol> brick-log-level TRACE>
> >
> >
> >
Sure, I will do that after I finish the work currently in hand, possibly late
today or tomorrow morning.
Thanks
BRs
--- Additional comment from Susant Kumar Palai on 2019-07-25 15:05:47 MVT ---
On the permission-denied errors:
I did not see any error related to EPERM but saw EBUSY in the brick logs.
[2019-07-24 08:15:22.236283] E [MSGID: 101191]
[event-epoll.c:765:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
handler
[2019-07-24 08:15:46.083306] E [MSGID: 115068]
[server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 29: READV 0
(7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
error-xlator: repvol3-locks [Resource temporarily unavailable]
[2019-07-24 08:15:46.088292] E [MSGID: 115068]
[server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 31: READV 0
(7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
error-xlator: repvol3-locks [Resource temporarily unavailable]
[2019-07-24 08:15:46.119463] E [MSGID: 115068]
[server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 33: READV 0
(7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
error-xlator: repvol3-locks [Resource temporarily unavailable]
[2019-07-24 08:15:46.124067] E [MSGID: 115068]
[server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 35: READV 0
(7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
error-xlator: repvol3-locks [Resource temporarily unavailable]
[2019-07-24 08:15:46.294554] E [MSGID: 115068]
[server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 37: READV 0
(7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
error-xlator: repvol3-locks [Resource temporarily unavailable]
[2019-07-24 08:15:46.298672] E [MSGID: 115068]
[server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 39: READV 0
(7db899f8-bf56-4b
Is it possible that the lower layer is converting the errnos to EPERM? Can you
check gfapi logs and tcmu logs for corresponding error messages and confirm?
--- Additional comment from Xiubo Li on 2019-07-25 15:39:23 MVT ---
(In reply to Susant Kumar Palai from comment #17)
> On the permission denied:
>
> I did not see any error related to EPERM but saw EBUSY in the brick logs.
>
>
> [2019-07-24 08:15:22.236283] E [MSGID: 101191]
> [event-epoll.c:765:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
> handler
> [2019-07-24 08:15:46.083306] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 29: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.088292] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 31: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.119463] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 33: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.124067] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 35: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.294554] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 37: READV 0
> (7db899f8-bf56-4b89-a4c6-90235e8c720a), client:
> CTX_ID:024a059c-7be1-4a19-ba27-8624c6e9c
> c9c-GRAPH_ID:0-PID:9399-HOST:rhel3-PC_NAME:repvol3-client-2-RECON_NO:-0,
> error-xlator: repvol3-locks [Resource temporarily unavailable]
> [2019-07-24 08:15:46.298672] E [MSGID: 115068]
> [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-repvol3-server: 39: READV 0
> (7db899f8-bf56-4b
>
>
> Is it possible that the lower layer is converting the errnos to EPERM? Can
> you check gfapi logs and tcmu logs for corresponding error messages and
> confirm?
If so, maybe gfapi is doing this. I will send you the gfapi logs; the EPERM
value comes from gfapi directly and tcmu-runner does nothing with it.
I checked the gfapi log, and it is also full of:
[2019-07-24 08:23:41.042339] W [MSGID: 114031]
[client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote
operation failed [Device or resource busy]
[2019-07-24 08:23:41.042381] W [MSGID: 114031]
[client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-0: remote
operation failed [Device or resource busy]
[2019-07-24 08:23:41.042556] W [MSGID: 114031]
[client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote
operation failed [Device or resource busy]
[2019-07-24 08:23:41.042574] W [MSGID: 114031]
[client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-0: remote
operation failed [Device or resource busy]
[2019-07-24 08:23:41.042655] W [MSGID: 114031]
[client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote
operation failed [Device or resource busy]
[2019-07-24 08:23:41.042671] W [MSGID: 114031]
[client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-0: remote
operation failed [Device or resource busy]
[2019-07-24 08:23:41.042709] W [MSGID: 114031]
[client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote
operation failed [Device or resource busy]
[2019-07-24 08:23:41.042722] W [MSGID: 114031]
[client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-0: remote
operation failed [Device or resource busy]
[2019-07-24 08:23:41.042784] W [MSGID: 114031]
[client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-repvol3-client-1: remote
operation failed [Device or resource busy]
Checked the gfapi source code:
677 out:
678     if (rsp.op_ret == -1) {
679         gf_msg(this->name, GF_LOG_WARNING, gf_error_to_errno(rsp.op_errno),
680                PC_MSG_REMOTE_OP_FAILED, "remote operation failed");
681     } else if (rsp.op_ret >= 0) {
682         if (local->attempt_reopen)
683             client_attempt_reopen(local->fd, this);
684     }
685     CLIENT_STACK_UNWIND(writev, frame, rsp.op_ret,
686                         gf_error_to_errno(rsp.op_errno), &prestat, &poststat,
687                         xdata);
688
689     if (xdata)
690         dict_unref(xdata);
It seems the return value is converted.
Thanks,
BRs
--- Additional comment from Xiubo Li on 2019-07-25 15:42:32 MVT ---
(In reply to Xiubo Li from comment #18)
> (In reply to Susant Kumar Palai from comment #17)
[...]
>
> Checked the gfapi source code:
>
> 677 out:
> 678 if (rsp.op_ret == -1) {
It seems rsp.op_ret is what gets returned here to the callback:
static void glfs_async_cbk(glfs_fd_t *fd, ssize_t ret, void *data)
not rsp.op_errno.
> 679 gf_msg(this->name, GF_LOG_WARNING,
> gf_error_to_errno(rsp.op_errno),
>
> 680 PC_MSG_REMOTE_OP_FAILED, "remote operation failed");
> 681 } else if (rsp.op_ret >= 0) {
> 682 if (local->attempt_reopen)
> 683 client_attempt_reopen(local->fd, this);
> 684 }
> 685 CLIENT_STACK_UNWIND(writev, frame, rsp.op_ret,
> 686 gf_error_to_errno(rsp.op_errno), &prestat,
> &poststat,
> 687 xdata);
> 688
> 689 if (xdata)
> 690 dict_unref(xdata);
>
>
> It seems the return valume is coverted.
>
> Thanks,
> BRs
--- Additional comment from Xiubo Li on 2019-08-01 06:04:52 MVT ---
Checking errno directly when ret == -1 works for me now.
But I can get both -EAGAIN and -EBUSY, while only -EBUSY is expected.
So the question is: why is there always an -EAGAIN every time before
acquiring the lock?
Thanks
BRs
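As a concrete reading of the above, a minimal hypothetical sketch using the
callback signature quoted in the earlier comment: when ret == -1 the real
error has to be read from errno (gfapi converts rsp.op_errno via
gf_error_to_errno), and only -EBUSY/-ENOTCONN should be treated as the lock
being lost; the -EAGAIN seen before every acquisition is the open question.

#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <glusterfs/api/glfs.h>

/* async IO completion callback, signature as quoted above */
static void glfs_async_cbk(glfs_fd_t *fd, ssize_t ret, void *data)
{
    (void)fd;
    (void)data;

    if (ret >= 0)
        return;                 /* IO succeeded */

    switch (errno) {            /* ret is just -1; errno carries the reason */
    case EBUSY:                 /* preempted by the other HA node: expected */
    case ENOTCONN:              /* brick unreachable */
        /* mark the local lock state UNLOCKED so ALUA can re-acquire */
        break;
    case EAGAIN:                /* shows up before every acquisition -- why? */
        break;
    default:
        fprintf(stderr, "unexpected errno %d\n", errno);
    }
}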
--- Additional comment from Worker Ant on 2019-08-02 19:27:15 MVT ---
REVIEW: https://review.gluster.org/23088 (locks/fencing: Address hang while
lock preemption) merged (#4) on master by Amar Tumballi
--- Additional comment from Xiubo Li on 2019-08-04 18:03:30 MVT ---
@Susant,
Since the fencing patch has gone into release 6, this follow-up fix should be
backported too, right?
Thanks.
BRs
--- Additional comment from Susant Kumar Palai on 2019-08-12 14:29:58 MVT ---
(In reply to Xiubo Li from comment #22)
> @Susant,
>
> Since the Fencing patch has been into the release 6, so this fixing followed
> should be backported, right ?
>
> Thanks.
> BRs
Will backport to release 6 and 7.
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1717824
[Bug 1717824] Fencing: Added the tcmu-runner ALUA feature support, but after
one of the nodes is rebooted glfs_file_lock() gets stuck