[Bugs] [Bug 1390914] Glusterfs create a flock lock by anonymous fd, but can't release it forever.

bugzilla at redhat.com
Thu Nov 3 03:10:10 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1390914

xiaopwu <xiaoping.wu at nokia.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xiaoping.wu at nokia.com



--- Comment #2 from xiaopwu <xiaoping.wu at nokia.com> ---
Analysis of the attachments:
1. The logs below are copied from
sn-1mnt-bricks-services-brick.1066.dump.1477882904. The granted flock was not
released on sn-1, although it was released on sn-0.

[xlator.features.locks.services-locks.inode]
path=/lightcm/locks/nodes.all
mandatory=0
posixlk-count=2
posixlk.posixlk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 17537,
owner=18a27aa2d1d2944f, client=0x19e6b60, connection-id=(null), granted at
2016-10-31 02:58:07
posixlk.posixlk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 17559,
owner=18764b24e15b0520, client=0x19e6b60, connection-id=(null), blocked at
2016-10-31 02:58:07
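
To spot such leftover locks across many dumps, the posixlk lines can be checked
mechanically. The following is only a minimal sketch: parse_posixlk_line and
struct posixlk_entry are hypothetical names, not part of GlusterFS, and the
field layout is assumed from the two lines quoted above. The mail wrapped
those lines, so each entry has to be joined back into one line first.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields pulled out of one posixlk line of a brick statedump.
 * Hypothetical helper type, layout assumed from the lines quoted above. */
struct posixlk_entry {
    int       index;      /* N in posixlk.posixlk[N] */
    char      state[16];  /* ACTIVE or BLOCKED */
    char      type[16];   /* e.g. WRITE */
    int       whence;
    long long start;
    long long len;
    int       pid;
    char      owner[32];  /* lock owner, hex string */
};

/* Parse a single, un-wrapped posixlk line.
 * Returns 0 on success, -1 if the line does not match the assumed layout. */
int parse_posixlk_line(const char *line, struct posixlk_entry *out)
{
    int n;

    assert(line != NULL && out != NULL);
    memset(out, 0, sizeof(*out));
    n = sscanf(line,
               "posixlk.posixlk[%d](%15[A-Z])=type=%15[A-Z], whence=%d, "
               "start=%lld, len=%lld, pid = %d, owner=%31[0-9a-f]",
               &out->index, out->state, out->type, &out->whence,
               &out->start, &out->len, &out->pid, out->owner);
    return n == 8 ? 0 : -1;
}
```

An ACTIVE entry whose pid no longer exists on the client, with a BLOCKED entry
queued behind it, matches the situation shown in this dump.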


2. This flock comes from the lmn VM. The logs below are copied from
lmn_mnt-services.log.

// The application opened the "nodes.all" file. The OPEN request was sent to
sn-0, but not to sn-1, because 0-services-client-1 was not yet UP.

[2016-10-31 02:57:50.816359] T [rpc-clnt.c:1381:rpc_clnt_record]
0-services-client-0: Auth Info: pid: 17536, uid: 0, gid: 0, owner:
0000000000000000
[2016-10-31 02:57:50.816371] T [rpc-clnt.c:1238:rpc_clnt_record_build_header]
0-rpc-clnt: Request fraglen 92, payload: 24, rpc hdr: 68
[2016-10-31 02:57:50.816390] T [rpc-clnt.c:1573:rpc_clnt_submit] 0-rpc-clnt:
submitted request (XID: 0xc7b7aa Program: GlusterFS 3.3, ProgVers: 330, Proc:
11) to rpc-transport (services-client-0)

// The application flocked the file. The FLOCK request was sent to sn-0, and
sn-0 granted the lock.

[2016-10-31 02:57:50.817424] T [rpc-clnt.c:1573:rpc_clnt_submit] 0-rpc-clnt:
submitted request (XID: 0xc7b7ab Program: GlusterFS 3.3, ProgVers: 330, Proc:
26) to rpc-transport (services-client-0)
[2016-10-31 02:57:51.277349] T [rpc-clnt.c:660:rpc_clnt_reply_init]
0-services-client-0: received rpc message (RPC XID: 0xc7b7ab Program: GlusterFS
3.3, ProgVers: 330, Proc: 26) from rpc-transport (services-client-0)

// The AFR xlator sent the FLOCK fop to sn-1 even though 0-services-client-1 was
still not UP. Because AFR had not opened "nodes.all" on sn-1, the client xlator
sent the fop to the server with an anonymous fd (GF_ANON_FD_NO, -2), and the
server granted the lock as well. This lock was never released.

[2016-10-31 02:57:51.277397] T [rpc-clnt.c:1573:rpc_clnt_submit] 0-rpc-clnt:
submitted request (XID: 0xbe5479 Program: GlusterFS 3.3, ProgVers: 330, Proc:
26) to rpc-transport (services-client-1)
[2016-10-31 02:57:51.324258] T [rpc-clnt.c:660:rpc_clnt_reply_init]
0-services-client-1: received rpc message (RPC XID: 0xbe5479 Program: GlusterFS
3.3, ProgVers: 330, Proc: 26) from rpc-transport (services-client-1)

// The application released the flock. The RELEASE fop was sent to sn-0 and the
lock was released on sn-0, but the RELEASE fop was not sent to sn-1.

[2016-10-31 02:57:51.404366] T [rpc-clnt.c:1573:rpc_clnt_submit] 0-rpc-clnt:
submitted request (XID: 0xc7b7ec Program: GlusterFS 3.3, ProgVers: 330, Proc:
41) to rpc-transport (services-client-0)
[2016-10-31 02:57:51.406273] T [rpc-clnt.c:660:rpc_clnt_reply_init]
0-services-client-0: received rpc message (RPC XID: 0xc7b7ec Program: GlusterFS
3.3, ProgVers: 330, Proc: 41) from rpc-transport (services-client-0)
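
Seen from the application, the three steps traced above (open, flock, release)
could look like the sketch below. This is an assumption for illustration only:
the application source is not part of this report, lock_then_close is a
hypothetical name, and it is not confirmed that the application uses flock(2)
rather than fcntl() locks. Here the lock is dropped implicitly by close().

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Open `path`, take an exclusive flock on it, then close the file.
 * Returns 0 on success, -1 if any step fails.
 * Hypothetical helper, not taken from the affected application. */
int lock_then_close(const char *path)
{
    int fd;

    assert(path != NULL);

    /* Step 1: open the file (the OPEN request in the trace). */
    fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* Step 2: take the lock (the FLOCK request in the trace). */
    if (flock(fd, LOCK_EX) != 0) {
        close(fd);
        return -1;
    }

    /* ... work done while holding the lock would go here ... */

    /* Step 3: closing the only descriptor drops the flock; this is the
     * point where the client sends RELEASE in the trace. */
    return close(fd);
}
```

Running this against a path on the mount while one brick is reconnecting is the
window in which the lock on that brick would be expected to stay behind.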

// Note when 0-services-client-0 came UP: before "nodes.all" was opened.

[2016-10-31 02:57:49.802345] I [client-handshake.c:1052:client_post_handshake]
0-services-client-0: 22 fds open - Delaying child_up until they are re-opened
[2016-10-31 02:57:49.983331] I
[client-handshake.c:674:client_child_up_reopen_done] 0-services-client-0: last
fd open'd/lock-self-heal'd - notifying CHILD-UP

// Note when 0-services-client-1 came UP: after the flock was released.

[2016-10-31 02:57:51.244367] I [client-handshake.c:1052:client_post_handshake]
0-services-client-1: 21 fds open - Delaying child_up until they are re-opened
[2016-10-31 02:57:51.731500] I
[client-handshake.c:674:client_child_up_reopen_done] 0-services-client-1: last
fd open'd/lock-self-heal'd - notifying CHILD-UP
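
The ordering of events in section 2 can be replayed with a small toy model. It
is a sketch of the reasoning only, not GlusterFS code: struct brick, the
model_* functions and replay_timeline are hypothetical names, and the
"connected" flag is an assumption drawn from the client_post_handshake
timestamp of 0-services-client-1 (handshake finished, CHILD-UP still pending).

```c
#include <assert.h>
#include <stdbool.h>

#define NBRICKS 2

/* Client-side view of one brick (toy model, not GlusterFS code). */
struct brick {
    bool connected;  /* handshake done, requests can reach the server */
    bool child_up;   /* CHILD-UP has been notified */
    bool fd_opened;  /* the file was opened on this brick */
    bool lock_held;  /* lock state on the server side */
};

/* OPEN goes only to bricks that are already UP. */
void model_open(struct brick b[], int n)
{
    for (int i = 0; i < n; i++)
        if (b[i].child_up)
            b[i].fd_opened = true;
}

/* The lock request goes to every reachable brick. Where no fd was opened,
 * an anonymous fd is used and the server still grants the lock. */
void model_lk(struct brick b[], int n)
{
    for (int i = 0; i < n; i++)
        if (b[i].connected)
            b[i].lock_held = true;
}

/* RELEASE goes only to bricks on which a real fd was opened. */
void model_release(struct brick b[], int n)
{
    for (int i = 0; i < n; i++)
        if (b[i].fd_opened) {
            b[i].fd_opened = false;
            b[i].lock_held = false;
        }
}

/* Replay the order of events from the logs and return how many bricks
 * still hold the lock afterwards. */
int replay_timeline(void)
{
    struct brick b[NBRICKS] = {
        { true,  true,  false, false },  /* sn-0: UP before the open */
        { false, false, false, false },  /* sn-1: still reconnecting */
    };
    int leaked = 0;

    model_open(b, NBRICKS);     /* OPEN reaches sn-0 only */
    b[1].connected = true;      /* sn-1 handshake done, CHILD-UP pending */
    model_lk(b, NBRICKS);       /* granted on sn-0, and on sn-1 via anon fd */
    model_release(b, NBRICKS);  /* RELEASE reaches sn-0 only */
    b[1].child_up = true;       /* sn-1 CHILD-UP arrives after the release */

    for (int i = 0; i < NBRICKS; i++)
        leaked += b[i].lock_held ? 1 : 0;
    assert(leaked >= 0 && leaked <= NBRICKS);
    return leaked;
}
```

In this model replay_timeline() leaves one brick (sn-1) holding the lock, which
is the ACTIVE entry seen in the sn-1 statedump in section 1.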

3. Code

// The FLOCK fop is sent to the server with an anonymous fd if the file being
locked has not been opened.

client3_3_lk (call_frame_t *frame, xlator_t *this, void *data)
{
}

// The RELEASE fop is not sent to the server if the file has not been opened
there.

client3_3_release (call_frame_t *frame, xlator_t *this,
                   void *data)
{
}

4. Is there any way to fix this issue?


