<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
<p>OK, on my Nagios instance I've disabled the gluster status check on
      all nodes except one; I'll check whether this is enough.</p>
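<p>Something like the following wrapper could enforce this (a sketch only: the designated node name is node1's hostname from this thread, but the plugin layout and variable names are assumptions, not the actual Nagios check):</p>

```shell
#!/bin/sh
# Sketch: run the gluster status check only on one designated node,
# so only a single glusterd transaction is started per check interval.
# MONITOR_NODE and VOLUME are taken from this thread; adjust as needed.
MONITOR_NODE="virtnode-0-0-gluster"
VOLUME="vm-images-repo"

is_monitor_node() {
    # Compare the short hostname against the designated monitor node.
    [ "$(hostname -s)" = "$1" ]
}

if ! is_monitor_node "$MONITOR_NODE"; then
    echo "OK: status check delegated to $MONITOR_NODE, skipping here"
elif ! command -v gluster >/dev/null 2>&1; then
    echo "UNKNOWN: gluster CLI not found"
else
    gluster volume status "$VOLUME"
fi
```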
    <p>Thanks,</p>
    <p>    Paolo<br>
    </p>
    <br>
<div class="moz-cite-prefix">On 20/07/2017 13:50, Atin Mukherjee
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAGNCGH29Z-363jsVJj4dg8bRaQRxscFWwjNikXgO+CLTkQmu_Q@mail.gmail.com">
      <div dir="ltr">So from the cmd_history.log files across all the
        nodes it's evident that multiple commands on the same volume are
        being run simultaneously, which can result in transaction
        collisions, leaving one command succeeding and the others
        failing. If you are running the volume status command for
        monitoring, it is recommended to run it from only one node.<br>
      </div>
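<p>To illustrate the collision: glusterd takes a single per-volume management lock for each transaction, so two simultaneous callers behave roughly like two threads contending for one non-blocking lock. A simplified model (not glusterd's actual code; node names are from this thread):</p>

```python
import threading
import time

# One management lock per volume, as glusterd holds for each transaction
# (simplified model; not glusterd's actual implementation).
volume_lock = threading.Lock()
results = {}
started = threading.Event()

def long_transaction(node):
    # First caller's transaction holds the per-volume lock for a while.
    with volume_lock:
        started.set()
        time.sleep(0.2)
        results[node] = "Success"

def status_check(node):
    # Second caller arrives while the first transaction is in flight.
    started.wait()
    if volume_lock.acquire(blocking=False):
        volume_lock.release()
        results[node] = "Success"
    else:
        # What the CLI reports as "Another transaction is in progress".
        results[node] = "Another transaction is in progress"

t1 = threading.Thread(target=long_transaction, args=("node1",))
t2 = threading.Thread(target=status_check, args=("node3",))
t1.start(); t2.start()
t1.join(); t2.join()
print(results)
```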
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On Thu, Jul 20, 2017 at 3:54 PM, Paolo
          Margara <span dir="ltr">&lt;<a
              href="mailto:paolo.margara@polito.it" target="_blank"
              moz-do-not-send="true">paolo.margara@polito.it</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div text="#000000" bgcolor="#FFFFFF">
              <p>Attached are the requested logs for all three nodes.</p>
              <p>Thanks,</p>
              <p>    Paolo<br>
              </p>
              <div>
                <div class="h5"> <br>
                  <div class="m_6615590896069534251moz-cite-prefix">On
                    20/07/2017 11:38, Atin Mukherjee wrote:<br>
                  </div>
                  <blockquote type="cite">
                    <div dir="ltr">Please share the cmd_history.log file
                      from all the storage nodes.<br>
                    </div>
                    <div class="gmail_extra"><br>
                      <div class="gmail_quote">On Thu, Jul 20, 2017 at
                        2:34 PM, Paolo Margara <span dir="ltr">&lt;<a
                            href="mailto:paolo.margara@polito.it"
                            target="_blank" moz-do-not-send="true">paolo.margara@polito.it</a>&gt;</span>
                        wrote:<br>
                        <blockquote class="gmail_quote" style="margin:0
                          0 0 .8ex;border-left:1px #ccc
                          solid;padding-left:1ex">
                          <div text="#000000" bgcolor="#FFFFFF">
                            <p>Hi list,</p>
                            <p>recently I've noticed a strange
                              behaviour of my gluster storage:
                              sometimes, while executing a simple
                              command like "gluster volume status
                              vm-images-repo", I get the response
                              "Another transaction is in progress for
                              vm-images-repo. Please try again after
                              sometime.". The situation is not resolved
                              simply by waiting; I have to restart
                              glusterd on the node that holds (and does
                              not release) the lock. This occurs
                              randomly every few days. In the meantime,
                              both before and after the issue appears,
                              everything works as expected.</p>
                            <p>I'm using gluster 3.8.12 on CentOS 7.3.
                              The only relevant information that I found
                              in the log file
                              (etc-glusterfs-glusterd.vol.log) of my
                              three nodes is the following:</p>
                            <p>* node1, when the issue begins:</p>
                            <p>[2017-07-19 15:07:43.130203] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (--&gt;/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f373f25f00f] --&gt;/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25) [0x7f373f250a25] --&gt;/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f) [0x7f373f2f548f] ) 0-management: Lock for vm-images-repo held by 2c6f154f-efe3-4479-addc-b2021aa9d5df<br>
                              [2017-07-19 15:07:43.128242] I [MSGID: 106499] [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vm-images-repo<br>
                              [2017-07-19 15:07:43.130244] E [MSGID: 106119] [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vm-images-repo<br>
                              [2017-07-19 15:07:43.130320] E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1<br>
                              [2017-07-19 15:07:43.130665] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on virtnode-0-1-gluster. Please check log file for details.<br>
                              [2017-07-19 15:07:43.131293] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on virtnode-0-2-gluster. Please check log file for details.<br>
                              [2017-07-19 15:07:43.131360] E [MSGID: 106151] [glusterd-syncop.c:1884:gd_sync_task_begin] 0-management: Locking Peers Failed.<br>
                              [2017-07-19 15:07:43.132005] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on virtnode-0-2-gluster. Please check log file for details.<br>
                              [2017-07-19 15:07:43.132182] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on virtnode-0-1-gluster. Please check log file for details.</p>
                            <p>* node2, when the issue begins:</p>
                            <p>[2017-07-19 15:07:43.131975] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (--&gt;/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f17b5b9e00f] --&gt;/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25) [0x7f17b5b8fa25] --&gt;/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f) [0x7f17b5c3448f] ) 0-management: Lock for vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16<br>
                              [2017-07-19 15:07:43.132019] E [MSGID: 106119] [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vm-images-repo<br>
                              [2017-07-19 15:07:43.133568] W [glusterd-locks.c:686:glusterd_mgmt_v3_unlock] (--&gt;/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f17b5b9e00f] --&gt;/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2b712) [0x7f17b5b8f712] --&gt;/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd082a) [0x7f17b5c3482a] ) 0-management: Lock owner mismatch. Lock for vol vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16<br>
                              [2017-07-19 15:07:43.133597] E [MSGID: 106118] [glusterd-op-sm.c:3845:glusterd_op_ac_unlock] 0-management: Unable to release lock for vm-images-repo<br>
                              The message "E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1" repeated 3 times between [2017-07-19 15:07:42.976193] and [2017-07-19 15:07:43.133646]<br>
                            </p>
                            <p>* node3, when the issue begins:</p>
                            <p>[2017-07-19 15:07:42.976593] I [MSGID: 106499] [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vm-images-repo<br>
                              [2017-07-19 15:07:43.129941] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (--&gt;/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f6133f5b00f] --&gt;/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25) [0x7f6133f4ca25] --&gt;/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f) [0x7f6133ff148f] ) 0-management: Lock for vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16<br>
                              [2017-07-19 15:07:43.129981] E [MSGID: 106119] [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vm-images-repo<br>
                              [2017-07-19 15:07:43.130034] E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1<br>
                              [2017-07-19 15:07:43.130131] E [MSGID: 106275] [glusterd-rpc-ops.c:876:glusterd_mgmt_v3_lock_peers_cbk_fn] 0-management: Received mgmt_v3 lock RJT from uuid: 2c6f154f-efe3-4479-addc-b2021aa9d5df<br>
                              [2017-07-19 15:07:43.130710] W [glusterd-locks.c:686:glusterd_mgmt_v3_unlock] (--&gt;/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f6133f5b00f] --&gt;/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2b712) [0x7f6133f4c712] --&gt;/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd082a) [0x7f6133ff182a] ) 0-management: Lock owner mismatch. Lock for vol vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16<br>
                              [2017-07-19 15:07:43.130733] E [MSGID: 106118] [glusterd-op-sm.c:3845:glusterd_op_ac_unlock] 0-management: Unable to release lock for vm-images-repo<br>
                              [2017-07-19 15:07:43.130771] E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1</p>
                            <p>The really strange thing is that in this
                              case the UUID of node3 is
                              d9047ecd-26b5-467b-8e91-50f76a0c4d16!</p>
                            <p>The node-name to UUID mapping is:</p>
                            <p>* (node1) virtnode-0-0-gluster:
                              2c6f154f-efe3-4479-addc-b2021aa9d5df</p>
                            <p>* (node2) virtnode-0-1-gluster:
                              e93ebee7-5d95-4100-a9df-4a3e60134b73</p>
                            <p>* (node3) virtnode-0-2-gluster:
                              d9047ecd-26b5-467b-8e91-50f76a0c4d16<br>
                              <br>
                            </p>
                            <p>In this case restarting glusterd on node3
                              usually solves the issue.</p>
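<p>As a stop-gap until the root cause is found, the lock holder can be read straight out of the log line above. A sketch (the sed parsing is an assumption based on the log format quoted in this thread, and the pool-list mapping step should be verified on your version):</p>

```shell
#!/bin/sh
# Sketch: extract the UUID of the node holding the stale mgmt_v3 lock
# from a glusterd log line, so glusterd can be restarted on that node.
# Parsing is an assumption based on the log format quoted above.

extract_lock_holder() {
    # Pull the 36-char UUID after "held by" from a
    # "Lock for <vol> held by <uuid>" line.
    sed -n 's/.*held by \([0-9a-f-]\{36\}\).*/\1/p'
}

# Sample line taken from node3's log in this thread (backtrace trimmed):
line='[2017-07-19 15:07:43.129941] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] 0-management: Lock for vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16'
echo "$line" | extract_lock_holder
# prints d9047ecd-26b5-467b-8e91-50f76a0c4d16

# To map the UUID back to a hostname, then restart glusterd there:
#   gluster pool list | grep d9047ecd-26b5-467b-8e91-50f76a0c4d16
#   systemctl restart glusterd   # run on the matching node
```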
                            <p>What could be the root cause of this
                              behavior? How can I fix it once and for
                              all?</p>
                            <p>If needed I can provide the full log
                              file.<br>
                            </p>
                            <p><br>
                            </p>
                            <p>Greetings,</p>
                            <p>    Paolo Margara<br>
                            </p>
                          </div>
                          <br>
                          _______________________________________________<br>
                          Gluster-users mailing list<br>
                          <a href="mailto:Gluster-users@gluster.org"
                            target="_blank" moz-do-not-send="true">Gluster-users@gluster.org</a><br>
                          <a
                            href="http://lists.gluster.org/mailman/listinfo/gluster-users"
                            rel="noreferrer" target="_blank"
                            moz-do-not-send="true">http://lists.gluster.org/mailman/listinfo/gluster-users</a><br>
                        </blockquote>
                      </div>
                    </div>
                  </blockquote>
                </div>
              </div>
            </div>
          </blockquote>
        </div>
      </div>
    </blockquote>
  </body>
</html>