On Thu, 27 Jul 2017 at 16:48, Paolo Margara <paolo.margara@polito.it> wrote:

Hi Atin,

please find attached all the requested logs.

Considering that I'm using gluster as the storage system for oVirt, I've
also checked the oVirt logs, and I've seen that almost every command on
all three nodes is executed by the supervdsm daemon, not only by the SPM
node. Could this be the root cause of the problem?

Indeed.
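You can see the collision with nothing more than two simultaneous status
calls (a rough sketch only; node1/node2 are placeholders for any two of
your storage nodes):

    # run the same volume transaction from two peers at once; one of the
    # two will typically fail with "Another transaction is in progress"
    ssh node1 'gluster volume status vm-images-repo' &
    ssh node2 'gluster volume status vm-images-repo' &
    wait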
Greetings,

Paolo

PS: could you suggest a better method than attachments for sharing log
files?
On 26/07/2017 15:28, Atin Mukherjee wrote:

Technically, if only one node is pumping all these status commands, you
shouldn't get into this situation. Can you please help me with the latest
cmd_history & glusterd log files from all the nodes?
<div class="gmail_extra"><br>
<div class="gmail_quote">On Wed, Jul 26, 2017 at 1:41 PM, Paolo
Margara <span><<a href="mailto:paolo.margara@polito.it" target="_blank">paolo.margara@polito.it</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<p>Hi Atin,</p>
<p>I've initially disabled gluster status check on all
nodes except on one on my nagios instance as you
recommended but this issue happens again.</p>
<p>So I've disabled it on each nodes but the error happens
again, currently only oVirt is monitoring gluster.</p>
<p>I cannot modify this behaviour in the oVirt GUI, there
is anything that could I do from the gluster prospective
to solve this issue? Considering that 3.8 is near EOL
also upgrading to 3.10 could be an option.</p>
<p><br>
</p>
<p>Greetings,</p>
<p> Paolo<br>
</p>
On 20/07/2017 15:37, Paolo Margara wrote:

OK, on my nagios instance I've disabled the gluster status check on all
nodes except one; I'll check whether this is enough.

Thanks,

Paolo
On 20/07/2017 13:50, Atin Mukherjee wrote:

So from the cmd_history.log files across all the nodes it's evident that
multiple commands on the same volume are run simultaneously, which can
result in transaction collisions, and you can end up with one command
succeeding and the others failing. If you are running the volume status
command for monitoring, ideally it should be run from only one node.
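For instance, a check wrapper along these lines (just a sketch; the
designated hostname is a placeholder you would choose yourself) ensures
that only one node ever starts the status transaction:

    # assumption: virtnode-0-0-gluster is the one node chosen to run checks
    DESIGNATED="virtnode-0-0-gluster"
    if [ "$(hostname -s)" = "$DESIGNATED" ]; then
        gluster volume status vm-images-repo
    else
        echo "OK: gluster status is checked from $DESIGNATED only"
    fi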
<div class="gmail_extra"><br>
<div class="gmail_quote">On Thu, Jul 20, 2017 at
3:54 PM, Paolo Margara <span><<a href="mailto:paolo.margara@polito.it" target="_blank">paolo.margara@polito.it</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<p>In attachment the requested logs for
all the three nodes.</p>
<p>thanks,</p>
<p> Paolo<br>
</p>
On 20/07/2017 11:38, Atin Mukherjee wrote:

Please share the cmd_history.log file from all the storage nodes.
<div class="gmail_extra"><br>
<div class="gmail_quote">On Thu,
Jul 20, 2017 at 2:34 PM, Paolo
Margara <span><<a href="mailto:paolo.margara@polito.it" target="_blank">paolo.margara@polito.it</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<p>Hi list,</p>
<p>recently I've noted a
strange behaviour of my
gluster storage, sometimes
while executing a simple
command like "gluster
volume status
vm-images-repo" as a
response I got "Another
transaction is in progress
for vm-images-repo. Please
try again after
sometime.". This situation
does not get solved simply
waiting for but I've to
restart glusterd on the
node that hold (and does
not release) the lock,
this situation occur
randomly after some days.
In the meanwhile, prior
and after the issue
appear, everything is
working as expected.</p>
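Concretely, the failing interaction looks like this (the error line is
the verbatim response; the prompt is just illustrative):

    $ gluster volume status vm-images-repo
    Another transaction is in progress for vm-images-repo. Please try again after sometime.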
I'm using gluster 3.8.12 on CentOS 7.3. The only relevant information
that I found in the log file (etc-glusterfs-glusterd.vol.log) of my three
nodes is the following:
* node1, at the moment the issue begins:

[2017-07-19 15:07:43.130203] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f373f25f00f] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25) [0x7f373f250a25] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f) [0x7f373f2f548f] ) 0-management: Lock for vm-images-repo held by 2c6f154f-efe3-4479-addc-b2021aa9d5df
[2017-07-19 15:07:43.128242] I [MSGID: 106499] [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vm-images-repo
[2017-07-19 15:07:43.130244] E [MSGID: 106119] [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vm-images-repo
[2017-07-19 15:07:43.130320] E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1
[2017-07-19 15:07:43.130665] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on virtnode-0-1-gluster. Please check log file for details.
[2017-07-19 15:07:43.131293] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on virtnode-0-2-gluster. Please check log file for details.
[2017-07-19 15:07:43.131360] E [MSGID: 106151] [glusterd-syncop.c:1884:gd_sync_task_begin] 0-management: Locking Peers Failed.
[2017-07-19 15:07:43.132005] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on virtnode-0-2-gluster. Please check log file for details.
[2017-07-19 15:07:43.132182] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on virtnode-0-1-gluster. Please check log file for details.
* node2, at the moment the issue begins:

[2017-07-19 15:07:43.131975] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f17b5b9e00f] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25) [0x7f17b5b8fa25] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f) [0x7f17b5c3448f] ) 0-management: Lock for vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16
[2017-07-19 15:07:43.132019] E [MSGID: 106119] [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vm-images-repo
[2017-07-19 15:07:43.133568] W [glusterd-locks.c:686:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f17b5b9e00f] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2b712) [0x7f17b5b8f712] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd082a) [0x7f17b5c3482a] ) 0-management: Lock owner mismatch. Lock for vol vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16
[2017-07-19 15:07:43.133597] E [MSGID: 106118] [glusterd-op-sm.c:3845:glusterd_op_ac_unlock] 0-management: Unable to release lock for vm-images-repo
The message "E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1" repeated 3 times between [2017-07-19 15:07:42.976193] and [2017-07-19 15:07:43.133646]
* node3, at the moment the issue begins:

[2017-07-19 15:07:42.976593] I [MSGID: 106499] [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vm-images-repo
[2017-07-19 15:07:43.129941] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f6133f5b00f] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25) [0x7f6133f4ca25] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f) [0x7f6133ff148f] ) 0-management: Lock for vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16
[2017-07-19 15:07:43.129981] E [MSGID: 106119] [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vm-images-repo
[2017-07-19 15:07:43.130034] E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1
[2017-07-19 15:07:43.130131] E [MSGID: 106275] [glusterd-rpc-ops.c:876:glusterd_mgmt_v3_lock_peers_cbk_fn] 0-management: Received mgmt_v3 lock RJT from uuid: 2c6f154f-efe3-4479-addc-b2021aa9d5df
[2017-07-19 15:07:43.130710] W [glusterd-locks.c:686:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f6133f5b00f] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2b712) [0x7f6133f4c712] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd082a) [0x7f6133ff182a] ) 0-management: Lock owner mismatch. Lock for vol vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16
[2017-07-19 15:07:43.130733] E [MSGID: 106118] [glusterd-op-sm.c:3845:glusterd_op_ac_unlock] 0-management: Unable to release lock for vm-images-repo
[2017-07-19 15:07:43.130771] E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1
The really strange thing is that in this case the UUID of node3 is
d9047ecd-26b5-467b-8e91-50f76a0c4d16 itself, i.e. node3 reports the lock
as held by node3!

The nodename-to-UUID mapping is:

* (node1) virtnode-0-0-gluster: 2c6f154f-efe3-4479-addc-b2021aa9d5df
* (node2) virtnode-0-1-gluster: e93ebee7-5d95-4100-a9df-4a3e60134b73
* (node3) virtnode-0-2-gluster: d9047ecd-26b5-467b-8e91-50f76a0c4d16

In this case, restarting glusterd on node3 usually solves the issue.
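For reference, this is how I work out which node holds the lock and clear
it (a rough sketch; paths as on my stock CentOS 7 rpm install):

    grep '^UUID=' /var/lib/glusterd/glusterd.info   # UUID of the local node
    gluster peer status                             # UUIDs of the other peers
    # then, on the node whose UUID appears in the "held by <uuid>" log line:
    systemctl restart glusterd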
What could be the root cause of this behaviour? How can I fix it once and
for all?

If needed I can provide the full log files.

Greetings,

Paolo Margara
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
</div></blockquote></div></div><div dir="ltr">-- <br></div><div class="gmail_signature" data-smartmail="gmail_signature">- Atin (atinm)</div>