[Gluster-users] glusterd-locks.c:572:glusterd_mgmt_v3_lock

Atin Mukherjee amukherj at redhat.com
Sat Jul 29 01:55:07 UTC 2017


On Thu, 27 Jul 2017 at 16:48, Paolo Margara <paolo.margara at polito.it> wrote:

> Hi Atin,
>
> attached you can find all the requested logs.
>
> Considering that I'm using gluster as the storage system for oVirt, I've
> also checked those logs and I've seen that almost every command on all
> three nodes is executed by the supervdsm daemon, not only by the SPM
> node. Could this be the root cause of the problem?
>

Indeed.
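
A quick way to double-check that on the gluster side is to look at how often
each node logs these commands; just a sketch, assuming the default
cmd_history.log location used by the 3.8 packages:

    # Run on each storage node: count the status commands logged so far and
    # show what the most recent ones look like.
    grep -c "volume status" /var/log/glusterfs/cmd_history.log
    grep "volume status" /var/log/glusterfs/cmd_history.log | tail -n 5

If all three nodes show a steady stream of them, that matches supervdsm
polling glusterd from every host.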


>
> Greetings,
>
>     Paolo
>
> PS: could you suggest a better method than attachments for sharing log
> files?
>
> On 26/07/2017 15:28, Atin Mukherjee wrote:
>
> Technically if only one node is pumping all these status commands, you
> shouldn't get into this situation. Can you please help me with the latest
> cmd_history & glusterd log files from all the nodes?
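>
> Just to illustrate the collision, here is a rough sketch (node names taken
> from your logs, ssh access between the nodes assumed): firing the same
> status command from all nodes at once lets only one of them take the
> per-volume lock, and the others can fail.
>
>     # Fire the same status command from all three nodes at (nearly) the
>     # same time; one call normally wins the mgmt_v3 lock, the others may
>     # report "Another transaction is in progress".
>     for h in virtnode-0-0-gluster virtnode-0-1-gluster virtnode-0-2-gluster; do
>         ssh "$h" 'gluster volume status vm-images-repo' &
>     done
>     wait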
>
> On Wed, Jul 26, 2017 at 1:41 PM, Paolo Margara <paolo.margara at polito.it>
> wrote:
>
>> Hi Atin,
>>
>> I initially disabled the gluster status check on all nodes except one in
>> my nagios instance, as you recommended, but the issue happened again.
>>
>> So I then disabled it on every node, but the error still occurs; at the
>> moment only oVirt is monitoring gluster.
>>
>> I cannot modify this behaviour in the oVirt GUI. Is there anything I could
>> do from the gluster perspective to solve this issue? Considering that 3.8
>> is near EOL, upgrading to 3.10 could also be an option.
>>
>>
>> Greetings,
>>
>>     Paolo
>>
>> On 20/07/2017 15:37, Paolo Margara wrote:
>>
>> OK, in my nagios instance I've disabled the gluster status check on all
>> nodes except one, I'll check if this is enough.
>>
>> Thanks,
>>
>>     Paolo
>>
>> On 20/07/2017 13:50, Atin Mukherjee wrote:
>>
>> So from the cmd_history.logs across all the nodes it's evident that
>> multiple commands on the same volume are run simultaneously, which can
>> result in transaction collisions: you can end up with one command
>> succeeding and the others failing. Ideally, if you are running the volume
>> status command for monitoring, it should be run from only one node.
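>>
>> As a rough sketch of what I mean (the designated node name is only an
>> example, not something gluster requires), a monitoring wrapper can make
>> sure that only one node ever issues the command:
>>
>>     #!/bin/bash
>>     # Hypothetical nagios wrapper: only the designated node actually
>>     # queries glusterd, every other node reports OK and exits.
>>     MONITOR_NODE="virtnode-0-0-gluster"   # example value, adjust to your naming
>>     if [ "$(hostname -s)" != "$MONITOR_NODE" ]; then
>>         echo "OK - volume status is checked from $MONITOR_NODE"
>>         exit 0
>>     fi
>>     exec gluster volume status vm-images-repo
>>
>> Pointing the check at such a wrapper on every node keeps the check
>> definition identical everywhere while still serializing the actual query.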
>>
>> On Thu, Jul 20, 2017 at 3:54 PM, Paolo Margara <paolo.margara at polito.it>
>> wrote:
>>
>>> Attached are the requested logs for all three nodes.
>>>
>>> thanks,
>>>
>>>     Paolo
>>>
>>> On 20/07/2017 11:38, Atin Mukherjee wrote:
>>>
>>> Please share the cmd_history.log file from all the storage nodes.
>>>
>>> On Thu, Jul 20, 2017 at 2:34 PM, Paolo Margara <paolo.margara at polito.it>
>>> wrote:
>>>
>>>> Hi list,
>>>>
>>>> recently I've noticed a strange behaviour of my gluster storage:
>>>> sometimes, while executing a simple command like "gluster volume status
>>>> vm-images-repo", I get "Another transaction is in progress for
>>>> vm-images-repo. Please try again after sometime." as a response. The
>>>> situation is not resolved simply by waiting; I have to restart glusterd
>>>> on the node that holds (and does not release) the lock. This occurs
>>>> randomly, every few days. In the meanwhile, before and after the issue
>>>> appears, everything works as expected.
>>>>
>>>> I'm using gluster 3.8.12 on CentOS 7.3; the only relevant information
>>>> that I found in the log file (etc-glusterfs-glusterd.vol.log) of my three
>>>> nodes is the following:
>>>>
>>>> * node1, at the moment the issue begins:
>>>>
>>>> [2017-07-19 15:07:43.130203] W
>>>> [glusterd-locks.c:572:glusterd_mgmt_v3_lock]
>>>> (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f)
>>>> [0x7f373f25f00f]
>>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25)
>>>> [0x7f373f250a25]
>>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f)
>>>> [0x7f373f2f548f] ) 0-management: Lock for vm-images-repo held by
>>>> 2c6f154f-efe3-4479-addc-b2021aa9d5df
>>>> [2017-07-19 15:07:43.128242] I [MSGID: 106499]
>>>> [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
>>>> Received status volume req for volume vm-images-repo
>>>> [2017-07-19 15:07:43.130244] E [MSGID: 106119]
>>>> [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to acquire
>>>> lock for vm-images-repo
>>>> [2017-07-19 15:07:43.130320] E [MSGID: 106376]
>>>> [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1
>>>> [2017-07-19 15:07:43.130665] E [MSGID: 106116]
>>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking
>>>> failed on virtnode-0-1-gluster. Please check log file for details.
>>>> [2017-07-19 15:07:43.131293] E [MSGID: 106116]
>>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking
>>>> failed on virtnode-0-2-gluster. Please check log file for details.
>>>> [2017-07-19 15:07:43.131360] E [MSGID: 106151]
>>>> [glusterd-syncop.c:1884:gd_sync_task_begin] 0-management: Locking Peers
>>>> Failed.
>>>> [2017-07-19 15:07:43.132005] E [MSGID: 106116]
>>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Unlocking
>>>> failed on virtnode-0-2-gluster. Please check log file for details.
>>>> [2017-07-19 15:07:43.132182] E [MSGID: 106116]
>>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Unlocking
>>>> failed on virtnode-0-1-gluster. Please check log file for details.
>>>>
>>>> * node2, at the moment the issue begins:
>>>>
>>>> [2017-07-19 15:07:43.131975] W
>>>> [glusterd-locks.c:572:glusterd_mgmt_v3_lock]
>>>> (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f)
>>>> [0x7f17b5b9e00f]
>>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25)
>>>> [0x7f17b5b8fa25]
>>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f)
>>>> [0x7f17b5c3448f] ) 0-management: Lock for vm-images-repo held by
>>>> d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>>> [2017-07-19 15:07:43.132019] E [MSGID: 106119]
>>>> [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to acquire
>>>> lock for vm-images-repo
>>>> [2017-07-19 15:07:43.133568] W
>>>> [glusterd-locks.c:686:glusterd_mgmt_v3_unlock]
>>>> (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f)
>>>> [0x7f17b5b9e00f]
>>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2b712)
>>>> [0x7f17b5b8f712]
>>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd082a)
>>>> [0x7f17b5c3482a] ) 0-management: Lock owner mismatch. Lock for vol
>>>> vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>>> [2017-07-19 15:07:43.133597] E [MSGID: 106118]
>>>> [glusterd-op-sm.c:3845:glusterd_op_ac_unlock] 0-management: Unable to
>>>> release lock for vm-images-repo
>>>> The message "E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm]
>>>> 0-management: handler returned: -1" repeated 3 times between [2017-07-19
>>>> 15:07:42.976193] and [2017-07-19 15:07:43.133646]
>>>>
>>>> * node3, at the moment the issue begins:
>>>>
>>>> [2017-07-19 15:07:42.976593] I [MSGID: 106499]
>>>> [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
>>>> Received status volume req for volume vm-images-repo
>>>> [2017-07-19 15:07:43.129941] W
>>>> [glusterd-locks.c:572:glusterd_mgmt_v3_lock]
>>>> (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f)
>>>> [0x7f6133f5b00f]
>>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25)
>>>> [0x7f6133f4ca25]
>>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f)
>>>> [0x7f6133ff148f] ) 0-management: Lock for vm-images-repo held by
>>>> d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>>> [2017-07-19 15:07:43.129981] E [MSGID: 106119]
>>>> [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to acquire
>>>> lock for vm-images-repo
>>>> [2017-07-19 15:07:43.130034] E [MSGID: 106376]
>>>> [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1
>>>> [2017-07-19 15:07:43.130131] E [MSGID: 106275]
>>>> [glusterd-rpc-ops.c:876:glusterd_mgmt_v3_lock_peers_cbk_fn] 0-management:
>>>> Received mgmt_v3 lock RJT from uuid: 2c6f154f-efe3-4479-addc-b2021aa9d5df
>>>> [2017-07-19 15:07:43.130710] W
>>>> [glusterd-locks.c:686:glusterd_mgmt_v3_unlock]
>>>> (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f)
>>>> [0x7f6133f5b00f]
>>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2b712)
>>>> [0x7f6133f4c712]
>>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd082a)
>>>> [0x7f6133ff182a] ) 0-management: Lock owner mismatch. Lock for vol
>>>> vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>>> [2017-07-19 15:07:43.130733] E [MSGID: 106118]
>>>> [glusterd-op-sm.c:3845:glusterd_op_ac_unlock] 0-management: Unable to
>>>> release lock for vm-images-repo
>>>> [2017-07-19 15:07:43.130771] E [MSGID: 106376]
>>>> [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1
>>>>
>>>> The really strange thing is that in this case the uuid reported as
>>>> holding the lock, d9047ecd-26b5-467b-8e91-50f76a0c4d16, is the uuid of
>>>> node3 itself!
>>>>
>>>> The nodename-to-uuid mapping is:
>>>>
>>>> * (node1) virtnode-0-0-gluster: 2c6f154f-efe3-4479-addc-b2021aa9d5df
>>>>
>>>> * (node2) virtnode-0-1-gluster: e93ebee7-5d95-4100-a9df-4a3e60134b73
>>>>
>>>> * (node3) virtnode-0-2-gluster: d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>>>
>>>> In this case restarting glusterd on node3 usually solves the issue.
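>>>>
>>>> For reference, the workaround amounts to the following sketch (the grep
>>>> pattern and the glusterd.info path are assumptions based on a default
>>>> package install):
>>>>
>>>>     # 1. Find which uuid is reported as holding the stale lock.
>>>>     grep "held by" /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | tail -n 1
>>>>
>>>>     # 2. Map that uuid to a node: peers are listed by `gluster peer status`,
>>>>     #    the local node's uuid is in glusterd's own info file.
>>>>     gluster peer status
>>>>     cat /var/lib/glusterd/glusterd.info
>>>>
>>>>     # 3. Restart glusterd on the node holding the lock (CentOS 7 / systemd).
>>>>     systemctl restart glusterd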
>>>>
>>>> What could be the root cause of this behavior? How can I fix this once
>>>> and for all?
>>>>
>>>> If needed I could provide the full log file.
>>>>
>>>>
>>>> Greetings,
>>>>
>>>>     Paolo Margara
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>
>>>
>>
--
- Atin (atinm)

