<div dir="ltr">Hello, we are using glusterfs 3.10.3.<div><br></div><div>We currently have a gluster heal volume full running, the crawl is still running.<br><div><br></div><div><div>Starting time of crawl: Tue Nov 14 15:58:35 2017</div><div><br></div><div>Crawl is in progress</div><div>Type of crawl: FULL</div><div>No. of entries healed: 0</div><div>No. of entries in split-brain: 0</div><div>No. of heal failed entries: 0</div></div></div><div><br></div><div>getfattr from both files:</div><div><br></div><div><div># getfattr -d -m . -e hex /mnt/AIDATA/data//ishmaelb/experiments/omie/omieali/cifar10/donsker_grad_reg_ali_dcgan_stat_dcgan_ac_True/omieali_cifar10_zdim_100_enc_dcgan_dec_dcgan_stat_dcgan_posterior_propagated_enc_beta1.0_dec_beta_1.0_info_metric_donsker_varadhan_info_lam_0.334726025306_222219-23_10_17/data/data_gen_iter_86000.pkl</div><div>getfattr: Removing leading &#39;/&#39; from absolute path names</div><div># file: mnt/AIDATA/data//ishmaelb/experiments/omie/omieali/cifar10/donsker_grad_reg_ali_dcgan_stat_dcgan_ac_True/omieali_cifar10_zdim_100_enc_dcgan_dec_dcgan_stat_dcgan_posterior_propagated_enc_beta1.0_dec_beta_1.0_info_metric_donsker_varadhan_info_lam_0.334726025306_222219-23_10_17/data/data_gen_iter_86000.pkl</div><div>security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000</div><div>trusted.afr.data01-client-0=0x000000000000000100000000</div><div>trusted.gfid=0x7e8513f4d4e24e66b0ba2dbe4c803c54</div></div><div><br></div><div><div># getfattr -d -m . -e hex /mnt/AIDATA/data/home/allac/experiments/171023_105655_mini_imagenet_projection_size_mixing_depth_num_filters_filter_size_block_depth_Explore\ architecture\ capacity/Explore\ architecture\ capacity\(projection_size\=32\;mixing_depth\=0\;num_filters\=64\;filter_size\=3\;block_depth\=3\)/model.ckpt-70001.data-00000-of-00001.tempstate1629411508065733704</div><div>getfattr: Removing leading &#39;/&#39; from absolute path names</div><div># file: mnt/AIDATA/data/home/allac/experiments/171023_105655_mini_imagenet_projection_size_mixing_depth_num_filters_filter_size_block_depth_Explore architecture capacity/Explore architecture capacity(projection_size=32;mixing_depth=0;num_filters=64;filter_size=3;block_depth=3)/model.ckpt-70001.data-00000-of-00001.tempstate1629411508065733704</div><div>security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000</div><div>trusted.afr.data01-client-0=0x000000000000000000000000</div><div>trusted.bit-rot.version=0x02000000000000005979d278000af1e7</div><div>trusted.gfid=0x9612ecd2106d42f295ebfef495c1d8ab</div></div><div><br></div><div><div><br></div><div># gluster volume heal data01<br></div><div><div>Launching heal operation to perform index self heal on volume data01 has been successful </div><div>Use heal info commands to check status</div></div><div># cat /var/log/glusterfs/glustershd.log<br></div><div><div>[2017-11-12 08:39:01.907287] I [glusterfsd-mgmt.c:1789:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing</div><div>[2017-11-15 08:18:02.084766] I [MSGID: 100011] [glusterfsd.c:1414:reincarnate] 0-glusterfsd: Fetching the volume file from server...</div><div>[2017-11-15 08:18:02.085718] I [glusterfsd-mgmt.c:1789:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing</div><div>[2017-11-15 19:13:42.005307] W [MSGID: 114031] [client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-data01-client-0: remote operation failed. 
Path: &lt;gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54&gt; (7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54) [No such file or directory]</div><div>The message &quot;W [MSGID: 114031] [client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-data01-client-0: remote operation failed. Path: &lt;gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54&gt; (7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54) [No such file or directory]&quot; repeated 5 times between [2017-11-15 19:13:42.005307] and [2017-11-15 19:13:42.166579]</div><div>[2017-11-15 19:23:43.041956] W [MSGID: 114031] [client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-data01-client-0: remote operation failed. Path: &lt;gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54&gt; (7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54) [No such file or directory]</div><div>The message &quot;W [MSGID: 114031] [client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-data01-client-0: remote operation failed. Path: &lt;gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54&gt; (7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54) [No such file or directory]&quot; repeated 5 times between [2017-11-15 19:23:43.041956] and [2017-11-15 19:23:43.235831]</div><div>[2017-11-15 19:30:22.726808] W [MSGID: 114031] [client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-data01-client-0: remote operation failed. Path: &lt;gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54&gt; (7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54) [No such file or directory]</div><div>The message &quot;W [MSGID: 114031] [client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-data01-client-0: remote operation failed. Path: &lt;gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54&gt; (7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54) [No such file or directory]&quot; repeated 4 times between [2017-11-15 19:30:22.726808] and [2017-11-15 19:30:22.827631]</div><div>[2017-11-16 15:04:34.102010] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-data01-replicate-0: performing metadata selfheal on 9612ecd2-106d-42f2-95eb-fef495c1d8ab</div><div>[2017-11-16 15:04:34.186781] I [MSGID: 108026] [afr-self-heal-common.c:1255:afr_log_selfheal] 0-data01-replicate-0: Completed metadata selfheal on 9612ecd2-106d-42f2-95eb-fef495c1d8ab. sources=[1]  sinks=0 </div><div>[2017-11-16 15:04:38.776070] I [MSGID: 108026] [afr-self-heal-common.c:1255:afr_log_selfheal] 0-data01-replicate-0: Completed data selfheal on 7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54. sources=[1]  sinks=0 </div><div>[2017-11-16 15:04:38.811744] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-data01-replicate-0: performing metadata selfheal on 7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54</div><div>[2017-11-16 15:04:38.867474] I [MSGID: 108026] [afr-self-heal-common.c:1255:afr_log_selfheal] 0-data01-replicate-0: Completed metadata selfheal on 7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54. sources=[1]  sinks=0 </div></div></div><div><br></div><div><br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Nov 16, 2017 at 7:14 AM, Ravishankar N <span dir="ltr">&lt;<a href="mailto:ravishankar@redhat.com" target="_blank">ravishankar@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
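For what it is worth, as far as I understand the AFR changelog format, the trusted.afr.data01-client-0 value is three 32-bit counters: pending data, metadata and entry operations recorded against client-0 (Brick1). Splitting the first file's value into those fields (just a sketch; the data/metadata/entry ordering is my reading of the format, not something confirmed in this thread):

# echo 000000000000000100000000 | fold -w8
00000000
00000001
00000000

If that reading is correct, one pending metadata operation is recorded against Brick1 for the first file, while the second file's counters are all zero.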
  
    
  
  <div text="#000000" bgcolor="#FFFFFF"><div><div class="h5">
    <p><br>
    </p>
    <br>
    <div class="m_-3149634344898588617moz-cite-prefix">On 11/16/2017 04:12 PM, Nithya
      Balachandran wrote:<br>
    </div>
    <blockquote type="cite">
      <div dir="ltr"><br>
        <div class="gmail_extra"><br>
          <div class="gmail_quote">On 15 November 2017 at 19:57,
            Frederic Harmignies <span dir="ltr">&lt;<a href="mailto:frederic.harmignies@elementai.com" target="_blank">frederic.harmignies@<wbr>elementai.com</a>&gt;</span>
            wrote:<br>
            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div dir="ltr">Hello, we have 2x files that are missing
                from one of the bricks. No idea how to fix this.
                <div><br>
                </div>
                <div>Details:</div>
                <div><br>
                </div>
                <div>
                  <div># gluster volume info</div>
                  <div> </div>
                  <div>Volume Name: data01</div>
                  <div>Type: Replicate</div>
                  <div>Volume ID: 39b4479c-31f0-4696-9435-5454e4<wbr>f8d310</div>
                  <div>Status: Started</div>
                  <div>Snapshot Count: 0</div>
                  <div>Number of Bricks: 1 x 2 = 2</div>
                  <div>Transport-type: tcp</div>
                  <div>Bricks:</div>
                  <div>Brick1: 192.168.186.11:/mnt/AIDATA/dat<wbr>a</div>
                  <div>Brick2: 192.168.186.12:/mnt/AIDATA/dat<wbr>a</div>
                  <div>Options Reconfigured:</div>
                  <div>performance.cache-refresh-time<wbr>out: 30</div>
                  <div>client.event-threads: 16</div>
                  <div>server.event-threads: 32</div>
                  <div>performance.readdir-ahead: off</div>
                  <div>performance.io-thread-count: 32</div>
                  <div>performance.cache-size: 32GB</div>
                  <div>transport.address-family: inet</div>
                  <div>nfs.disable: on</div>
                  <div>features.trash: off</div>
                  <div>features.trash-max-filesize: 500MB</div>
                </div>
                <div><br>
                </div>
                <div>
                  <div># gluster volume heal data01 info</div>
                  <div>Brick 192.168.186.11:/mnt/AIDATA/dat<wbr>a</div>
                  <div>Status: Connected</div>
                  <div>Number of entries: 0</div>
                  <div><br>
                  </div>
                  <div>Brick 192.168.186.12:/mnt/AIDATA/dat<wbr>a</div>
                  <div>&lt;gfid:7e8513f4-d4e2-4e66-b0ba-<wbr>2dbe4c803c54&gt; </div>
                  <div>&lt;gfid:9612ecd2-106d-42f2-95eb-<wbr>fef495c1d8ab&gt; </div>
                  <div>Status: Connected</div>
                  <div>Number of entries: 2</div>
                  <div><br>
                  </div>
                  <div># gluster volume heal data01 info split-brain</div>
                  <div>Brick 192.168.186.11:/mnt/AIDATA/dat<wbr>a</div>
                  <div>Status: Connected</div>
                  <div>Number of entries in split-brain: 0</div>
                  <div><br>
                  </div>
                  <div>Brick 192.168.186.12:/mnt/AIDATA/dat<wbr>a</div>
                  <div>Status: Connected</div>
                  <div>Number of entries in split-brain: 0</div>
                </div>
                <div><br>
                </div>
                <div>
                  <div><br>
                  </div>
                </div>
                <div>Both files is missing from the folder on Brick1,
                  the gfid files are also missing in the .gluster folder
                  on that same Brick1.</div>
                <div>Brick2 has both the files and the gfid file in
                  .gluster</div>
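The gfid entry for a file lives at <brick>/.glusterfs/<first two hex digits of the gfid>/<next two digits>/<full gfid>, so the check on each brick looks roughly like this (a sketch using the two gfids reported by heal info):

# ls -l /mnt/AIDATA/data/.glusterfs/7e/85/7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54
# ls -l /mnt/AIDATA/data/.glusterfs/96/12/9612ecd2-106d-42f2-95eb-fef495c1d8ab

On Brick1 these return "No such file or directory", while on Brick2 they show the gfid hard links, matching the description above.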

We already tried:

# gluster volume heal data01 full

and running a stat and an ls -l on both files from a mounted client to try and trigger a heal (commands for launching and monitoring the full crawl are sketched below).
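For reference, launching and monitoring the full crawl looks roughly like this (the statistics sub-command is, as far as I know, what prints the "Starting time of crawl" summary):

# gluster volume heal data01 full
# gluster volume heal data01 statistics
# gluster volume heal data01 info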

Would a rebalance fix this? Any guidance would be greatly appreciated!

A rebalance would not help here as this is a replicate volume. Ravi, any idea what could be going wrong here?

No, an explicit lookup should have healed the file on the missing brick, unless the lookup did not hit AFR and was served from the caching translators.
Frederic, what version of gluster are you running? Can you launch 'gluster volume heal data01' and check the glustershd logs for possible warnings? Use the DEBUG client-log-level if you have to. Also, instead of stat, try a getfattr on the file from the mount.
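Something along these lines should do it (a sketch; /path/to/client/mount is a placeholder for wherever data01 is FUSE-mounted, not a path taken from this thread):

# gluster volume set data01 diagnostics.client-log-level DEBUG
# getfattr -d -m . -e hex /path/to/client/mount/<file reported by heal info>
# gluster volume reset data01 diagnostics.client-log-level

The reset at the end puts the client log level back to its default once the logs have been collected.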

-Ravi
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div><br>
            </div>
            <div>Regards,</div>
            <div>Nithya</div>
            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div dir="ltr">
                <div><br>
                </div>
                <div>Thank you in advance!</div>
                <span class="m_-3149634344898588617HOEnZb"><font color="#888888">
                    <div><br>
                    </div>
                    <div>
                      <div>-- <br>
                        <div class="m_-3149634344898588617m_-3765909141665328534gmail_signature">
                          <div dir="ltr">
                            <div dir="ltr" style="color:rgb(136,136,136);font-size:12.8px"><b style="font-size:12.8px"><img src="https://drive.google.com/uc?id=0B-sFtqOxQE9UeW0wTXAwTVc5Mkk&amp;export=download" width="96" height="12"><br>
                              </b></div>
                            <div dir="ltr" style="color:rgb(136,136,136);font-size:12.8px"><b style="font-size:12.8px">Frederic
                                Harmignies</b>
                              <div style="font-size:12.8px"><i style="font-size:12.8px">High
                                  Performance Computer Administrator</i></div>
                              <div style="font-size:12.8px"><br>
                              </div>
                              <div style="font-size:12.8px"><a href="http://www.elementai.com/" style="color:rgb(17,85,204)" target="_blank"><font color="#888888">www.elementai.com</font></a></div>
                            </div>
                          </div>
                        </div>
                      </div>
                    </div>
                  </font></span></div>

_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

-- 
Frederic Harmignies
High Performance Computer Administrator
www.elementai.com