<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>OK, on my Nagios instance I've disabled the gluster status check on
all nodes except one; I'll check whether this is enough.</p>
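<p>For reference, the idea is simply to let only one designated node issue
"gluster volume status"; below is a rough sketch of such a host-gated check
(the hostname, volume name and script are placeholders, not my actual
Nagios plugin):</p>
<pre>#!/usr/bin/env python
# Sketch only: run the gluster status check on a single designated node and
# report OK everywhere else, so that only one peer polls the volume.
import socket
import subprocess
import sys

MONITOR_NODE = "virtnode-0-0-gluster"   # placeholder: the one node allowed to poll
VOLUME = "vm-images-repo"

def main():
    # Any node other than the designated one reports OK without touching glusterd.
    if socket.gethostname().split(".")[0] != MONITOR_NODE:
        print("OK - gluster status is polled from %s only" % MONITOR_NODE)
        return 0
    try:
        out = subprocess.check_output(
            ["gluster", "volume", "status", VOLUME],
            stderr=subprocess.STDOUT, universal_newlines=True)
    except subprocess.CalledProcessError as err:
        print("CRITICAL - gluster volume status failed: %s" % err.output.strip())
        return 2
    print("OK - %s" % out.splitlines()[0].strip())
    return 0

if __name__ == "__main__":
    sys.exit(main())
</pre>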
<p>Thanks,</p>
<p> Paolo<br>
</p>
<br>
<div class="moz-cite-prefix">On 20/07/2017 13:50, Atin Mukherjee
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAGNCGH29Z-363jsVJj4dg8bRaQRxscFWwjNikXgO+CLTkQmu_Q@mail.gmail.com">
<div dir="ltr">So from the cmd_history.log files across all the nodes
it's evident that multiple commands on the same volume are run
simultaneously, which can result in transaction collisions, and
you can end up with one command succeeding and the others failing.
If you are running the volume status command for monitoring,
it is suggested to run it from only one node.<br>
</div>
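<p>To make the collision concrete: when two peers (or two monitoring
checks) issue a status command on the same volume at nearly the same
moment, one of them can be rejected with the "Another transaction is in
progress" error. A rough illustration of that race, to be tried only
against a test volume (volume name and timing are examples, not a
guaranteed reproducer):</p>
<pre># Illustration only: fire two "gluster volume status" calls at the same
# volume concurrently; depending on timing, one may be rejected with
# "Another transaction is in progress ...".
import subprocess
import threading

VOLUME = "vm-images-repo"

def status(tag):
    proc = subprocess.Popen(
        ["gluster", "volume", "status", VOLUME],
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
        universal_newlines=True)
    out, _ = proc.communicate()
    first = out.splitlines()[0].strip() if out.strip() else ""
    print("[%s] exit code %d: %s" % (tag, proc.returncode, first))

threads = [threading.Thread(target=status, args=("call-%d" % i,)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
</pre>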
<div class="gmail_extra"><br>
<div class="gmail_quote">On Thu, Jul 20, 2017 at 3:54 PM, Paolo
Margara <span dir="ltr"><<a
href="mailto:paolo.margara@polito.it" target="_blank"
moz-do-not-send="true">paolo.margara@polito.it</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<p>Attached are the requested logs for all three
nodes.</p>
<p>thanks,</p>
<p> Paolo<br>
</p>
<div>
<div class="h5"> <br>
<div class="m_6615590896069534251moz-cite-prefix">On
20/07/2017 11:38, Atin Mukherjee wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Please share the cmd_history.log file
from all the storage nodes.<br>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Thu, Jul 20, 2017 at
2:34 PM, Paolo Margara <span dir="ltr"><<a
href="mailto:paolo.margara@polito.it"
target="_blank" moz-do-not-send="true">paolo.margara@polito.it</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0
0 0 .8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<p>Hi list,</p>
<p>recently I've noticed a strange behaviour
of my gluster storage: sometimes, while
executing a simple command like "gluster
volume status vm-images-repo", I get the
response "Another transaction is in
progress for vm-images-repo. Please try
again after sometime.". This situation is
not resolved simply by waiting; I have to
restart glusterd on the node that holds
(and does not release) the lock. The issue
occurs randomly after some days. In the
meantime, both before and after the issue
appears, everything works as
expected.</p>
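<p>As a side note, one way to see which peer UUID is sitting on the stuck
lock is to scan glusterd's log for the "held by" warnings; a rough sketch
(the log path is the usual one on my CentOS nodes, the volume name is just
this one):</p>
<pre># Sketch: print the last UUID reported as holding the lock on a volume,
# based on the "Lock for ... held by ..." warnings shown below.
import re

LOG = "/var/log/glusterfs/etc-glusterfs-glusterd.vol.log"
VOLUME = "vm-images-repo"

pattern = re.compile(r"Lock for (?:vol )?%s held by ([0-9a-f-]+)" % re.escape(VOLUME))

holder = None
with open(LOG) as logfile:
    for line in logfile:
        match = pattern.search(line)
        if match:
            holder = match.group(1)

print("last reported lock holder for %s: %s" % (VOLUME, holder))
</pre>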
<p>I'm using gluster 3.8.12 on CentOS 7.3.
The only relevant information that I found
in the log file
(etc-glusterfs-glusterd.vol.log) of
my three nodes is the following:</p>
<p>* node1, at the moment the issue begins:</p>
<p>[2017-07-19 15:07:43.130203] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f373f25f00f] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25) [0x7f373f250a25] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f) [0x7f373f2f548f] ) 0-management: Lock for vm-images-repo held by 2c6f154f-efe3-4479-addc-b2021aa9d5df<br>
[2017-07-19 15:07:43.128242] I [MSGID: 106499] [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vm-images-repo<br>
[2017-07-19 15:07:43.130244] E [MSGID: 106119] [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vm-images-repo<br>
[2017-07-19 15:07:43.130320] E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1<br>
[2017-07-19 15:07:43.130665] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on virtnode-0-1-gluster. Please check log file for details.<br>
[2017-07-19 15:07:43.131293] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on virtnode-0-2-gluster. Please check log file for details.<br>
[2017-07-19 15:07:43.131360] E [MSGID: 106151] [glusterd-syncop.c:1884:gd_sync_task_begin] 0-management: Locking Peers Failed.<br>
[2017-07-19 15:07:43.132005] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on virtnode-0-2-gluster. Please check log file for details.<br>
[2017-07-19 15:07:43.132182] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on virtnode-0-1-gluster. Please check log file for details.</p>
<p>* node2, at the moment the issue begins:</p>
<p>[2017-07-19 15:07:43.131975] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f17b5b9e00f] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25) [0x7f17b5b8fa25] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f) [0x7f17b5c3448f] ) 0-management: Lock for vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16<br>
[2017-07-19 15:07:43.132019] E [MSGID: 106119] [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vm-images-repo<br>
[2017-07-19 15:07:43.133568] W [glusterd-locks.c:686:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f17b5b9e00f] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2b712) [0x7f17b5b8f712] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd082a) [0x7f17b5c3482a] ) 0-management: Lock owner mismatch. Lock for vol vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16<br>
[2017-07-19 15:07:43.133597] E [MSGID: 106118] [glusterd-op-sm.c:3845:glusterd_op_ac_unlock] 0-management: Unable to release lock for vm-images-repo<br>
The message "E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1" repeated 3 times between [2017-07-19 15:07:42.976193] and [2017-07-19 15:07:43.133646]<br>
</p>
<p>* node3, at the moment the issue begins:</p>
<p>[2017-07-19 15:07:42.976593] I [MSGID: 106499] [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vm-images-repo<br>
[2017-07-19 15:07:43.129941] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f6133f5b00f] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25) [0x7f6133f4ca25] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f) [0x7f6133ff148f] ) 0-management: Lock for vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16<br>
[2017-07-19 15:07:43.129981] E [MSGID: 106119] [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vm-images-repo<br>
[2017-07-19 15:07:43.130034] E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1<br>
[2017-07-19 15:07:43.130131] E [MSGID: 106275] [glusterd-rpc-ops.c:876:glusterd_mgmt_v3_lock_peers_cbk_fn] 0-management: Received mgmt_v3 lock RJT from uuid: 2c6f154f-efe3-4479-addc-b2021aa9d5df<br>
[2017-07-19 15:07:43.130710] W [glusterd-locks.c:686:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f6133f5b00f] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2b712) [0x7f6133f4c712] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd082a) [0x7f6133ff182a] ) 0-management: Lock owner mismatch. Lock for vol vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16<br>
[2017-07-19 15:07:43.130733] E [MSGID: 106118] [glusterd-op-sm.c:3845:glusterd_op_ac_unlock] 0-management: Unable to release lock for vm-images-repo<br>
[2017-07-19 15:07:43.130771] E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1</p>
<p>The really strange thing is that in this
case the UUID of node3 is
d9047ecd-26b5-467b-8e91-50f76a0c4d16!</p>
<p>The node-name to UUID mapping is:</p>
<p>* (node1) virtnode-0-0-gluster: 2c6f154f-efe3-4479-addc-b2021aa9d5df</p>
<p>* (node2) virtnode-0-1-gluster: e93ebee7-5d95-4100-a9df-4a3e60134b73</p>
<p>* (node3) virtnode-0-2-gluster: d9047ecd-26b5-467b-8e91-50f76a0c4d16<br>
<br>
</p>
<p>In this case restarting glusterd on node3
usually solves the issue.</p>
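<p>For completeness, the UUIDs in those messages can be mapped back to a
hostname with "gluster pool list" (the local node shows up as localhost
there) or by reading /var/lib/glusterd/glusterd.info on each node; a rough
sketch of that lookup, using node3's UUID as the example:</p>
<pre># Sketch: resolve a glusterd UUID from the log messages to a peer hostname
# via "gluster pool list"; the local node appears as "localhost".
import subprocess
import sys

def uuid_to_host(uuid):
    out = subprocess.check_output(["gluster", "pool", "list"],
                                  universal_newlines=True)
    for line in out.splitlines()[1:]:          # skip the header row
        fields = line.split()
        if len(fields) >= 2 and fields[0] == uuid:
            return fields[1]
    return None

if __name__ == "__main__":
    uuid = sys.argv[1] if len(sys.argv) > 1 else "d9047ecd-26b5-467b-8e91-50f76a0c4d16"
    host = uuid_to_host(uuid)
    print("%s maps to %s" % (uuid, host or "no known peer"))
    # Once the holder is known, the workaround described above is restarting
    # glusterd (e.g. "systemctl restart glusterd") on that node.
</pre>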
<p>What could be the root cause of this
behavior? How can I fix this once
and for all?</p>
<p>If needed I could provide the full log
file.<br>
</p>
<p><br>
</p>
<p>Greetings,</p>
<p> Paolo Margara<br>
</p>
</div>
<br>
_______________________________________________<br>
Gluster-users mailing list<br>
<a href="mailto:Gluster-users@gluster.org"
target="_blank" moz-do-not-send="true">Gluster-users@gluster.org</a><br>
<a
href="http://lists.gluster.org/mailman/listinfo/gluster-users"
rel="noreferrer" target="_blank"
moz-do-not-send="true">http://lists.gluster.org/mailman/listinfo/gluster-users</a><br>
</blockquote>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</body>
</html>