[Gluster-users] glusterd-locks.c:572:glusterd_mgmt_v3_lock
Paolo Margara
paolo.margara at polito.it
Thu Jul 20 09:04:44 UTC 2017
Hi list,
recently I've noticed a strange behaviour of my gluster storage: sometimes,
while executing a simple command like "gluster volume status
vm-images-repo", I get the response "Another transaction is in progress
for vm-images-repo. Please try again after sometime.". The situation does
not resolve itself by simply waiting; I have to restart glusterd on the
node that holds (and does not release) the lock. This happens randomly,
every few days. Before and after the issue appears, everything works as
expected.
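When it happens, the recovery is basically what I described above: find
out which UUID is holding the stuck lock and restart glusterd on that
node. A rough sketch of what I run (log path and service name are the
CentOS 7 defaults on my nodes):

  # see which UUID is currently holding the volume lock (any node)
  grep "held by" /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | tail -2

  # then, on the node that owns that UUID (see the mapping further down),
  # restart the management daemon
  systemctl restart glusterd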
I'm using gluster 3.8.12 on CentOS 7.3; the only relevant information
that I found in the log file (etc-glusterfs-glusterd.vol.log) of my
three nodes is the following:
* node1, at the moment the issue begins:
[2017-07-19 15:07:43.130203] W
[glusterd-locks.c:572:glusterd_mgmt_v3_lock]
(-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f)
[0x7f373f25f00f]
-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25)
[0x7f373f250a25]
-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f)
[0x7f373f2f548f] ) 0-management: Lock for vm-images-repo held by
2c6f154f-efe3-4479-addc-b2021aa9d5df
[2017-07-19 15:07:43.128242] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume vm-images-repo
[2017-07-19 15:07:43.130244] E [MSGID: 106119]
[glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to
acquire lock for vm-images-repo
[2017-07-19 15:07:43.130320] E [MSGID: 106376]
[glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1
[2017-07-19 15:07:43.130665] E [MSGID: 106116]
[glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking
failed on virtnode-0-1-gluster. Please check log file for details.
[2017-07-19 15:07:43.131293] E [MSGID: 106116]
[glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking
failed on virtnode-0-2-gluster. Please check log file for details.
[2017-07-19 15:07:43.131360] E [MSGID: 106151]
[glusterd-syncop.c:1884:gd_sync_task_begin] 0-management: Locking Peers
Failed.
[2017-07-19 15:07:43.132005] E [MSGID: 106116]
[glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Unlocking
failed on virtnode-0-2-gluster. Please check log file for details.
[2017-07-19 15:07:43.132182] E [MSGID: 106116]
[glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Unlocking
failed on virtnode-0-1-gluster. Please check log file for details.
* node2, at the moment the issue begins:
[2017-07-19 15:07:43.131975] W
[glusterd-locks.c:572:glusterd_mgmt_v3_lock]
(-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f)
[0x7f17b5b9e00f]
-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25)
[0x7f17b5b8fa25]
-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f)
[0x7f17b5c3448f] ) 0-management: Lock for vm-images-repo held by
d9047ecd-26b5-467b-8e91-50f76a0c4d16
[2017-07-19 15:07:43.132019] E [MSGID: 106119]
[glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to
acquire lock for vm-images-repo
[2017-07-19 15:07:43.133568] W
[glusterd-locks.c:686:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f)
[0x7f17b5b9e00f]
-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2b712)
[0x7f17b5b8f712]
-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd082a)
[0x7f17b5c3482a] ) 0-management: Lock owner mismatch. Lock for vol
vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16
[2017-07-19 15:07:43.133597] E [MSGID: 106118]
[glusterd-op-sm.c:3845:glusterd_op_ac_unlock] 0-management: Unable to
release lock for vm-images-repo
The message "E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm]
0-management: handler returned: -1" repeated 3 times between [2017-07-19
15:07:42.976193] and [2017-07-19 15:07:43.133646]
* node3, at the moment the issue begins:
[2017-07-19 15:07:42.976593] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume vm-images-repo
[2017-07-19 15:07:43.129941] W
[glusterd-locks.c:572:glusterd_mgmt_v3_lock]
(-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f)
[0x7f6133f5b00f]
-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25)
[0x7f6133f4ca25]
-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f)
[0x7f6133ff148f] ) 0-management: Lock for vm-images-repo held by
d9047ecd-26b5-467b-8e91-50f76a0c4d16
[2017-07-19 15:07:43.129981] E [MSGID: 106119]
[glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to
acquire lock for vm-images-repo
[2017-07-19 15:07:43.130034] E [MSGID: 106376]
[glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1
[2017-07-19 15:07:43.130131] E [MSGID: 106275]
[glusterd-rpc-ops.c:876:glusterd_mgmt_v3_lock_peers_cbk_fn]
0-management: Received mgmt_v3 lock RJT from uuid:
2c6f154f-efe3-4479-addc-b2021aa9d5df
[2017-07-19 15:07:43.130710] W
[glusterd-locks.c:686:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f)
[0x7f6133f5b00f]
-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2b712)
[0x7f6133f4c712]
-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd082a)
[0x7f6133ff182a] ) 0-management: Lock owner mismatch. Lock for vol
vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16
[2017-07-19 15:07:43.130733] E [MSGID: 106118]
[glusterd-op-sm.c:3845:glusterd_op_ac_unlock] 0-management: Unable to
release lock for vm-images-repo
[2017-07-19 15:07:43.130771] E [MSGID: 106376]
[glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1
The really strange thing is that in this case the UUID reported as the
lock holder, d9047ecd-26b5-467b-8e91-50f76a0c4d16, is the UUID of node3
itself!
The nodename-to-UUID mapping is:
* (node1) virtnode-0-0-gluster: 2c6f154f-efe3-4479-addc-b2021aa9d5df
* (node2) virtnode-0-1-gluster: e93ebee7-5d95-4100-a9df-4a3e60134b73
* (node3) virtnode-0-2-gluster: d9047ecd-26b5-467b-8e91-50f76a0c4d16
In this case, restarting glusterd on node3 usually solves the issue.
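For completeness, the mapping above is simply taken from glusterd's own
data on each node; something along these lines (standard commands,
nothing exotic):

  # local UUID of the node you are on
  grep ^UUID /var/lib/glusterd/glusterd.info
  # or, equivalently
  gluster system:: uuid get

  # UUIDs of all the pool members as seen from one node
  gluster pool list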
What could be the root cause of this behaviour? How can I fix it once
and for all?
If needed, I can provide the full log files.
Greetings,
Paolo Margara