[Gluster-users] glusterd-locks.c:572:glusterd_mgmt_v3_lock
Atin Mukherjee
amukherj at redhat.com
Wed Jul 26 13:28:43 UTC 2017
Technically, if only one node is issuing all these status commands, you
shouldn't get into this situation. Can you please share the latest
cmd_history & glusterd log files from all the nodes?
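
For reference, with the default settings those files live under
/var/log/glusterfs/ on each node. A rough sketch of pulling them all down in
one go, assuming passwordless SSH from a workstation to the three peers (the
node names are taken from this thread, the local file naming is arbitrary):

    import subprocess

    NODES = ["virtnode-0-0-gluster", "virtnode-0-1-gluster", "virtnode-0-2-gluster"]
    LOGS = ["/var/log/glusterfs/cmd_history.log",
            "/var/log/glusterfs/etc-glusterfs-glusterd.vol.log"]

    for node in NODES:
        for log in LOGS:
            # prefix the local copy with the node name so the three
            # sets of logs do not overwrite each other
            dest = "%s-%s" % (node, log.rsplit("/", 1)[-1])
            subprocess.check_call(["scp", "%s:%s" % (node, log), dest])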
On Wed, Jul 26, 2017 at 1:41 PM, Paolo Margara <paolo.margara at polito.it>
wrote:
> Hi Atin,
>
> Initially I disabled the gluster status check on all nodes except one on
> my Nagios instance, as you recommended, but the issue happened again.
>
> So I've disabled it on every node, but the error still occurs; currently
> only oVirt is monitoring gluster.
>
> I cannot modify this behaviour from the oVirt GUI; is there anything I
> could do on the gluster side to solve this issue? Considering that 3.8 is
> near EOL, upgrading to 3.10 could also be an option.
>
>
> Greetings,
>
> Paolo
>
> Il 20/07/2017 15:37, Paolo Margara ha scritto:
>
> OK, on my Nagios instance I've disabled the gluster status check on all
> nodes except one; I'll check if this is enough.
>
> Thanks,
>
> Paolo
>
> Il 20/07/2017 13:50, Atin Mukherjee ha scritto:
>
> So from the cmd_history.log files across all the nodes it's evident that
> multiple commands are run simultaneously on the same volume, which can
> result in transaction collisions, and you can end up with one command
> succeeding and the others failing. If you are running the volume status
> command for monitoring, it is suggested to run it from only one node.
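>
> For example, the check can be wrapped so that only one designated peer ever
> issues the status command, and retries briefly when glusterd reports that
> another transaction is in progress. A rough sketch, not a tested
> implementation; the designated node, retry count and delay are placeholders:
>
>     import socket
>     import subprocess
>     import time
>
>     DESIGNATED_NODE = "virtnode-0-0-gluster"   # only this peer runs the check
>     VOLUME = "vm-images-repo"
>
>     def check_volume_status():
>         # every other peer skips the check, so no two nodes ever start
>         # a transaction on the same volume at the same time
>         if socket.gethostname() != DESIGNATED_NODE:
>             return 0
>         for _ in range(3):
>             result = subprocess.run(["gluster", "volume", "status", VOLUME],
>                                     stdout=subprocess.PIPE,
>                                     stderr=subprocess.STDOUT,
>                                     universal_newlines=True)
>             if result.returncode == 0:
>                 return 0                       # OK
>             if "Another transaction is in progress" in result.stdout:
>                 time.sleep(5)                  # back off instead of piling up
>                 continue
>             break
>         return 2                               # CRITICAL for the monitoring system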
>
> On Thu, Jul 20, 2017 at 3:54 PM, Paolo Margara <paolo.margara at polito.it>
> wrote:
>
>> Attached are the requested logs for all three nodes.
>>
>> thanks,
>>
>> Paolo
>>
>> Il 20/07/2017 11:38, Atin Mukherjee ha scritto:
>>
>> Please share the cmd_history.log file from all the storage nodes.
>>
>> On Thu, Jul 20, 2017 at 2:34 PM, Paolo Margara <paolo.margara at polito.it>
>> wrote:
>>
>>> Hi list,
>>>
>>> recently I've noticed a strange behaviour of my gluster storage: sometimes,
>>> while executing a simple command like "gluster volume status
>>> vm-images-repo", I get the response "Another transaction is in progress for
>>> vm-images-repo. Please try again after sometime.". The situation is not
>>> resolved simply by waiting; I have to restart glusterd on the node that
>>> holds (and does not release) the lock. This occurs randomly after some
>>> days. In the meantime, both before and after the issue appears, everything
>>> works as expected.
>>>
>>> I'm using gluster 3.8.12 on CentOS 7.3. The only relevant information
>>> that I found in the log file (etc-glusterfs-glusterd.vol.log) on my
>>> three nodes is the following:
>>>
>>> * node1, at the moment the issue begins:
>>>
>>> [2017-07-19 15:07:43.130203] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock]
>>> (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f)
>>> [0x7f373f25f00f] -->/usr/lib64/glusterfs/3.8.12
>>> /xlator/mgmt/glusterd.so(+0x2ba25) [0x7f373f250a25]
>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f)
>>> [0x7f373f2f548f] ) 0-management: Lock for vm-images-repo held by
>>> 2c6f154f-efe3-4479-addc-b2021aa9d5df
>>> [2017-07-19 15:07:43.128242] I [MSGID: 106499]
>>> [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
>>> Received status volume req for volume vm-images-repo
>>> [2017-07-19 15:07:43.130244] E [MSGID: 106119]
>>> [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to
>>> acquire lock for vm-images-repo
>>> [2017-07-19 15:07:43.130320] E [MSGID: 106376]
>>> [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned:
>>> -1
>>> [2017-07-19 15:07:43.130665] E [MSGID: 106116]
>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking
>>> failed on virtnode-0-1-gluster. Please check log file for details.
>>> [2017-07-19 15:07:43.131293] E [MSGID: 106116]
>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking
>>> failed on virtnode-0-2-gluster. Please check log file for details.
>>> [2017-07-19 15:07:43.131360] E [MSGID: 106151]
>>> [glusterd-syncop.c:1884:gd_sync_task_begin] 0-management: Locking Peers
>>> Failed.
>>> [2017-07-19 15:07:43.132005] E [MSGID: 106116]
>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Unlocking
>>> failed on virtnode-0-2-gluster. Please check log file for details.
>>> [2017-07-19 15:07:43.132182] E [MSGID: 106116]
>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Unlocking
>>> failed on virtnode-0-1-gluster. Please check log file for details.
>>>
>>> * node2, at the moment the issue begins:
>>>
>>> [2017-07-19 15:07:43.131975] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock]
>>> (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f)
>>> [0x7f17b5b9e00f] -->/usr/lib64/glusterfs/3.8.12
>>> /xlator/mgmt/glusterd.so(+0x2ba25) [0x7f17b5b8fa25]
>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f)
>>> [0x7f17b5c3448f] ) 0-management: Lock for vm-images-repo held by
>>> d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>> [2017-07-19 15:07:43.132019] E [MSGID: 106119]
>>> [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to
>>> acquire lock for vm-images-repo
>>> [2017-07-19 15:07:43.133568] W [glusterd-locks.c:686:glusterd_mgmt_v3_unlock]
>>> (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f)
>>> [0x7f17b5b9e00f] -->/usr/lib64/glusterfs/3.8.12
>>> /xlator/mgmt/glusterd.so(+0x2b712) [0x7f17b5b8f712]
>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd082a)
>>> [0x7f17b5c3482a] ) 0-management: Lock owner mismatch. Lock for vol
>>> vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>> [2017-07-19 15:07:43.133597] E [MSGID: 106118]
>>> [glusterd-op-sm.c:3845:glusterd_op_ac_unlock] 0-management: Unable to
>>> release lock for vm-images-repo
>>> The message "E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm]
>>> 0-management: handler returned: -1" repeated 3 times between [2017-07-19
>>> 15:07:42.976193] and [2017-07-19 15:07:43.133646]
>>>
>>> * node3, at the moment the issue begins:
>>>
>>> [2017-07-19 15:07:42.976593] I [MSGID: 106499]
>>> [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
>>> Received status volume req for volume vm-images-repo
>>> [2017-07-19 15:07:43.129941] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock]
>>> (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f)
>>> [0x7f6133f5b00f] -->/usr/lib64/glusterfs/3.8.12
>>> /xlator/mgmt/glusterd.so(+0x2ba25) [0x7f6133f4ca25]
>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f)
>>> [0x7f6133ff148f] ) 0-management: Lock for vm-images-repo held by
>>> d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>> [2017-07-19 15:07:43.129981] E [MSGID: 106119]
>>> [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to
>>> acquire lock for vm-images-repo
>>> [2017-07-19 15:07:43.130034] E [MSGID: 106376]
>>> [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned:
>>> -1
>>> [2017-07-19 15:07:43.130131] E [MSGID: 106275]
>>> [glusterd-rpc-ops.c:876:glusterd_mgmt_v3_lock_peers_cbk_fn]
>>> 0-management: Received mgmt_v3 lock RJT from uuid:
>>> 2c6f154f-efe3-4479-addc-b2021aa9d5df
>>> [2017-07-19 15:07:43.130710] W [glusterd-locks.c:686:glusterd_mgmt_v3_unlock]
>>> (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f)
>>> [0x7f6133f5b00f] -->/usr/lib64/glusterfs/3.8.12
>>> /xlator/mgmt/glusterd.so(+0x2b712) [0x7f6133f4c712]
>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd082a)
>>> [0x7f6133ff182a] ) 0-management: Lock owner mismatch. Lock for vol
>>> vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>> [2017-07-19 15:07:43.130733] E [MSGID: 106118]
>>> [glusterd-op-sm.c:3845:glusterd_op_ac_unlock] 0-management: Unable to
>>> release lock for vm-images-repo
>>> [2017-07-19 15:07:43.130771] E [MSGID: 106376]
>>> [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned:
>>> -1
>>>
>>> The really strange thing is that the UUID reported as holding the lock,
>>> d9047ecd-26b5-467b-8e91-50f76a0c4d16, is the UUID of node3 itself!
>>>
>>> The nodename-to-UUID mapping is:
>>>
>>> * (node1) virtnode-0-0-gluster: 2c6f154f-efe3-4479-addc-b2021aa9d5df
>>>
>>> * (node2) virtnode-0-1-gluster: e93ebee7-5d95-4100-a9df-4a3e60134b73
>>>
>>> * (node3) virtnode-0-2-gluster: d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>>
>>> In this case restarting glusterd on node3 usually solves the issue.
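>>>
>>> For the record, this is roughly how I map the UUID from the "held by" log
>>> line to a node before restarting glusterd. A rough sketch, run by hand with
>>> the UUID copied from the log; the script name is arbitrary:
>>>
>>>     import subprocess
>>>     import sys
>>>
>>>     def node_for_uuid(uuid):
>>>         # 'gluster pool list' prints one "UUID  Hostname  State" row per peer
>>>         out = subprocess.check_output(["gluster", "pool", "list"],
>>>                                       universal_newlines=True)
>>>         for line in out.splitlines()[1:]:       # skip the header row
>>>             fields = line.split()
>>>             if fields and fields[0] == uuid:
>>>                 return fields[1]                # 'localhost' on the holder itself
>>>         return None
>>>
>>>     if __name__ == "__main__":
>>>         # e.g. find_lock_holder.py d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>>         print(node_for_uuid(sys.argv[1]))
>>>
>>> Restarting glusterd on the node it prints is the workaround mentioned above.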
>>>
>>> What could be the root cause of this behavior? How can I fix this once
>>> and for all?
>>>
>>> If needed I could provide the full log file.
>>>
>>>
>>> Greetings,
>>>
>>> Paolo Margara
>>>
>>>
>>
>