<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">On 03/20/2017 06:31 PM, Bernhard Dübi
      wrote:<br>
    </div>
    <blockquote
cite="mid:CACxnGeQ4zsFZhBD-HASq=uLKFqHGF+HRyyi1CCPfYTmyxqZ0DA@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div>
          <div>
            <div>
              <div>Hi Ravi,<br>
                <br>
              </div>
              Thank you very much for looking into this.<br>
            </div>
            The gluster volumes are used by CommVault Simpana to store
            backup data. Nothing/Nobody should access the underlying
            infrastructure.<br>
            <br>
          </div>
          While looking at the xattrs of the files, I noticed that the
          only difference was the bit-rot.version. So I assume that
          something went wrong in the synchronization of the bit-rot
          data, and that differing bit-rot.versions are treated like a
          split-brain situation, with access denied because there is no
          guarantee of correctness. This is just a wild guess.<br>
        </div>
      </div>
    </blockquote>
    Hi Bernhard,<br>
    <br>
    The bit-rot version can differ between bricks of a replica when I/O
    succeeded on only one brick while the other brick was down. AFR
    self-heal will later heal the contents, but it does not modify the
    bitrot xattrs. So that is not a problem.<br>
    <br>
    <blockquote
cite="mid:CACxnGeQ4zsFZhBD-HASq=uLKFqHGF+HRyyi1CCPfYTmyxqZ0DA@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div><br>
        </div>
        Over the weekend I identified hundreds of files with
        input/output errors. I compared the sha256sum of each file on
        both bricks; they were always the same. I then deleted the
        affected files from gluster and recreated them. This should
        have fixed the issue. Verification is still running.<br>
        <div><br>
        </div>
        <div>If you're interested in the root cause, I can send you more
          log files and the xattrs of some files.<br>
        </div>
      </div>
    </blockquote>
    <br>
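    A minimal Python sketch of the brick-to-brick checksum comparison
    described above, for anyone who wants to repeat it. It is not part
    of gluster: the function names are hypothetical, and the paths
    passed in are assumed to be the copies of one file under each
    brick's backend directory. It only reads the files.<br>
<pre>
```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size blocks so large backup containers fit in memory.
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def copies_match(paths):
    """Hash every copy; return (all_identical, {path: digest})."""
    digests = {path: sha256_of(path) for path in paths}
    return len(set(digests.values())) == 1, digests
```
</pre>
    Matching digests only show that the copies are byte-identical. They
    say nothing about the bitrot xattrs on either brick.<br>
    <br>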
    If you did not access the underlying bricks directly, as you said,
    then it could possibly be a bitrot bug. If you don't mind, please
    raise a BZ under the bitrot component and the appropriate gluster
    version, with all client and brick logs attached.<br>
    Also, if you do have some kind of reproducer, that would help a lot.<br>
    -Ravi<br>
    <br>
    <blockquote
cite="mid:CACxnGeQ4zsFZhBD-HASq=uLKFqHGF+HRyyi1CCPfYTmyxqZ0DA@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div><br>
          <br>
        </div>
        <div>Best Regards<br>
        </div>
        <div>Bernhard<br>
        </div>
        <div><br>
          <div>
            <div>
              <div>
                <div>
                  <div>
                    <div>
                      <div>
                        <div class="gmail_extra"><br>
                          <div class="gmail_quote">2017-03-20 12:57
                            GMT+01:00 Ravishankar N <span dir="ltr">&lt;<a
                                moz-do-not-send="true"
                                href="mailto:ravishankar@redhat.com"
                                target="_blank">ravishankar@redhat.com</a>&gt;</span>:<br>
                            <blockquote class="gmail_quote"
                              style="margin:0 0 0 .8ex;border-left:1px
                              #ccc solid;padding-left:1ex">
                              <div bgcolor="#FFFFFF" text="#000000">
                                <div
                                  class="m_1080219594149766984moz-cite-prefix">SFILE_CONTAINER_080
                                  is the one which seems to be in
                                  split-brain. SFILE_CONTAINER_046, for
                                  which you have provided the getfattr
                                  output, hard links etc., doesn't seem
                                  to be in split-brain. We do see that
                                  the fops on SFILE_CONTAINER_046 are
                                  failing on the client translator
                                  itself due to EIO:<br>
                                  <tt><br>
                                    [2017-03-17 19:49:56.088867] E
                                    [MSGID: 114031]
                                    [client-rpc-fops.c:444:<wbr>client3_3_open_cbk]
                                    0-Server_Legal_01-client-0: remote
                                    operation failed. Path:
                                    /Server_Legal/CV_MAGNETIC/V_<wbr>944453/CHUNK_9291168/SFILE_<wbr>CONTAINER_046
                                    (bfdfe21a-1af3-474b-a6a4-<wbr>bc0e17edb529)
                                    [Input/output error]</tt><tt><br>
                                  </tt><tt><br>
                                  </tt><tt>[2017-03-17 19:49:56.089012]
                                    E [MSGID: 114031]
                                    [client-rpc-fops.c:444:<wbr>client3_3_open_cbk]
                                    0-Server_Legal_01-client-1: remote
                                    operation failed. Path:
                                    /Server_Legal/CV_MAGNETIC/V_<wbr>944453/CHUNK_9291168/SFILE_<wbr>CONTAINER_046
                                    (bfdfe21a-1af3-474b-a6a4-<wbr>bc0e17edb529)
                                    [Input/output error]</tt><br>
                                  <br>
                                  which is why the sha256sum on the
                                  mount gave EIO. And that is because
                                  the file seems to be corrupt on both
                                  bricks, since the
                                  'trusted.bit-rot.bad-file' xattr is
                                  set.<br>
                                  <br>
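                                  A rough Python sketch for spotting
                                  this in saved getfattr output. It
                                  assumes the text is a getfattr dump
                                  in which each file starts with a
                                  "# file:" line followed by name=value
                                  lines; the function names are
                                  hypothetical and it only checks
                                  whether the xattr is present, not its
                                  value.<br>
<pre>
```python
BAD_FILE_XATTR = "trusted.bit-rot.bad-file"


def parse_getfattr_dump(text):
    """Map each '# file:' entry in a getfattr dump to its xattrs."""
    files = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("# file:"):
            # A new file section begins; later lines belong to it.
            current = line[len("# file:"):].strip()
            files[current] = {}
        elif current is not None and "=" in line:
            name, _, value = line.partition("=")
            files[current][name] = value
    return files


def files_marked_bad(text):
    """Return the sorted paths whose dump contains the bad-file xattr."""
    parsed = parse_getfattr_dump(text)
    return sorted(p for p, attrs in parsed.items() if BAD_FILE_XATTR in attrs)
```
</pre>
                                  <br>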
                                  Did you write to the files directly on
                                  the backend? What is interesting is
                                  that the sha256sum is the same on both
                                  bricks even though both copies are
                                  marked as bad by bitrot.<br>
                                  <br>
                                  -Ravi<br>
                                  <br>
                                  <br>
                                  On 03/18/2017 03:20 AM, Bernhard Dübi
                                  wrote:<br>
                                </div>
                                <blockquote type="cite">
                                  <div dir="ltr">
                                    <div>
                                      <div>Hi,<br>
                                        <br>
                                      </div>
                                      I have a situation:<br>
                                      <br>
                                    </div>
                                    The volume logfile reports a
                                    possible split-brain, but when I try
                                    to heal the file, the heal fails
                                    because the file is not in
                                    split-brain. Any ideas?<br>
                                    <br>
                                    <br>
                                     
                                    <p class="MsoNormal"><br>
                                    </p>
                                    <p class="MsoNormal"><br>
                                    </p>
                                    <p class="MsoNormal">Regards</p>
                                    <p class="MsoNormal">Bernhard<br>
                                    </p>
                                  </div>
                                  <br>
                                  <br>
                                  <pre>______________________________<wbr>_________________
Gluster-users mailing list
<a moz-do-not-send="true" class="m_1080219594149766984moz-txt-link-abbreviated" href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a>
<a moz-do-not-send="true" class="m_1080219594149766984moz-txt-link-freetext" href="http://lists.gluster.org/mailman/listinfo/gluster-users" target="_blank">http://lists.gluster.org/<wbr>mailman/listinfo/gluster-users</a></pre>
    </blockquote>
    <p>

    </p>
  </div>

</blockquote></div>
</div></div></div></div></div></div></div></div></div></div>



</blockquote><p>
</p></body></html>