Now I would recommend that you recheck the brick layout, so that a single server failure won't "kill" a subvolume.<div id="yMail_cursorElementTracker_1627917301763"><br></div><div id="yMail_cursorElementTracker_1627917301909">For example, if your volume (replica 3 or replica 3 arbiter 1) has a brick layout like:</div><div id="yMail_cursorElementTracker_1627917352178">serverA:/brick/brick1</div><div id="yMail_cursorElementTracker_1627917364158">serverB:/brick/brick1</div><div id="yMail_cursorElementTracker_1627917373644">serverC:/brick/brick1</div><div id="yMail_cursorElementTracker_1627917383594">serverD:/brick/brick2</div><div id="yMail_cursorElementTracker_1627917392940">serverA:/brick/brick2</div><div id="yMail_cursorElementTracker_1627917405435">serverB:/brick/brick2</div><div id="yMail_cursorElementTracker_1627917423479"><br></div><div id="yMail_cursorElementTracker_1627917432227">then the first subvolume consists of the 'brick1' bricks (the first 3 entries of the output), while the 'brick2' bricks form your second subvolume.</div><div id="yMail_cursorElementTracker_1627917507127"><br></div><div id="yMail_cursorElementTracker_1627917507385">Each file (or shard, when sharding is enabled) is located in a single subvolume. If 2 of the 3 bricks of a subvolume are on the same host, a failure of that host will make some of the files (or shards) unavailable until quorum (2 out of 3 bricks) is restored.</div><div id="yMail_cursorElementTracker_1627917602725"><br></div><div id="yMail_cursorElementTracker_1627917602996">Best Regards,</div><div id="yMail_cursorElementTracker_1627917615949">Strahil Nikolov</div><div id="yMail_cursorElementTracker_1627917424016"><br><div id="yMail_cursorElementTracker_1627917299808"> <br> <blockquote style="margin: 0 0 20px 0;"> <div style="font-family:Roboto, sans-serif; color:#6D00F6;"> <div>On Mon, Aug 2, 2021 at 18:11, Valerio Luccio</div><div><valerio.luccio@nyu.edu> wrote:</div> </div> <div style="padding: 10px 0 0 20px; margin: 10px 0 0 0; border-left: 1px solid #6D00F6;"> <div id="yiv1321384791"><div>
    <p>Thanks Strahil,</p>
    <p>it worked like a charm.<br clear="none">
    </p>
    <p>On 7/31/21 5:21 PM, Strahil Nikolov wrote:
      </p><blockquote type="cite">
        <pre class="yiv1321384791moz-quote-pre">You most probably already have an /etc/fstab entry, so just recreate the LV, recreate the FS (mkfs.xfs -i size=512 /path/to/lv) and mount it.


For both the source brick and the new brick, just use 'hydra4:/gluster1/data'.

Don't forget to test on non-prod first ;)


Best Regards,
Strahil Nikolov





On Saturday, 31 July 2021, 19:50:31 GMT+3, Valerio Luccio <a rel="nofollow noopener noreferrer" shape="rect" class="yiv1321384791moz-txt-link-rfc2396E" ymailto="mailto:valerio.luccio@nyu.edu" target="_blank" href="mailto:valerio.luccio@nyu.edu"><valerio.luccio@nyu.edu></a> wrote:






Thanks Strahil,

I have a couple of questions. 


My /gluster1 is mounted, but there is no "data" folder; I suppose that will be created when I reset the brick. I see two versions of the reset-brick operation:

</pre>
        <blockquote type="cite">
          <pre class="yiv1321384791moz-quote-pre">gluster volume reset-brick <VOLNAME> <SOURCE-BRICK> start
gluster volume reset-brick <VOLNAME> <SOURCE-BRICK> <NEW-BRICK> commit

</pre>
        </blockquote>
        <pre class="yiv1321384791moz-quote-pre">I assume I should use the first version. When I specify the <SOURCE-BRICK>, do I just put the number or do I have to include the "Brick" prefix?

As for the brick layout, you hit the nail on the head. My VMs went down and it was very painful to get everything back up to speed. I thought I had implemented the correct layout; I guess I was wrong. Which layout do you suggest?

Thanks so much,


On 7/31/21 10:54 AM, Strahil Nikolov wrote:


</pre>
        <pre class="yiv1321384791moz-quote-pre">Yep, you have to bring back hydra4:/gluster1/data 



You can mount /gluster1/data again (don't forget SELinux) and then use gluster's reset-brick to rebuild that brick.




Most probably you have an entry that was saved on hydra4:/gluster1/data and the arbiter but was not pushed to the surviving brick (based on the entry in the arbiter)




Usually, I use method 2 from <a rel="nofollow noopener noreferrer" shape="rect" class="yiv1321384791moz-txt-link-freetext" target="_blank" href="https://docs.gluster.org/en/latest/Troubleshooting/gfid-to-path/">https://docs.gluster.org/en/latest/Troubleshooting/gfid-to-path/</a> to identify the file, and then you can check the status of that file on the bricks (and, if necessary, restore it from backup). You should do that only after you have brought hydra4:/gluster1/data back into the volume.




You should also reconsider the brick layout, as in some cases a single hydra failure (whole server) will kill a subvolume and all VMs' disks on that subvolume will be unavailable.
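
A rough way to check a layout for this problem is sketched below in Python. It is only an illustration under assumptions: the bricks are given in the order the volume lists them, every 3 consecutive bricks form one subvolume (replica 3 or replica 3 arbiter 1), and the serverA..serverD names and both helper functions are hypothetical, not part of gluster.

```python
from collections import Counter


def subvolumes(bricks, replica=3):
    """Group bricks into subvolumes in the order they are listed."""
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]


def risky_subvolumes(bricks, replica=3):
    """Return (subvolume index, host) pairs where one host holds
    more than one brick of the same subvolume."""
    risky = []
    for index, group in enumerate(subvolumes(bricks, replica)):
        # "host:/path" -> "host"
        hosts = Counter(brick.split(":", 1)[0] for brick in group)
        for host, count in hosts.items():
            if count > 1:
                risky.append((index, host))
    return risky


# Hypothetical layout: every subvolume spans 3 different hosts.
good = [
    "serverA:/brick/brick1", "serverB:/brick/brick1", "serverC:/brick/brick1",
    "serverD:/brick/brick2", "serverA:/brick/brick2", "serverB:/brick/brick2",
]
# Hypothetical layout: serverA holds 2 bricks of the first subvolume,
# serverC holds 2 bricks of the second one.
bad = [
    "serverA:/brick/brick1", "serverA:/brick/brick2", "serverB:/brick/brick1",
    "serverB:/brick/brick2", "serverC:/brick/brick1", "serverC:/brick/brick2",
]

print(risky_subvolumes(good))  # []
print(risky_subvolumes(bad))   # [(0, 'serverA'), (1, 'serverC')]
```

An empty result means no single host failure can take 2 of the 3 bricks of any subvolume; each reported pair is a subvolume that would lose quorum if that host went down.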







Best Regards,

Strahil Nikolov


</pre>
        <blockquote type="cite">
          <pre class="yiv1321384791moz-quote-pre">  
  
On Sat, Jul 31, 2021 at 2:18, Valerio Luccio

<a rel="nofollow noopener noreferrer" shape="rect" class="yiv1321384791moz-txt-link-rfc2396E" ymailto="mailto:valerio.luccio@nyu.edu" target="_blank" href="mailto:valerio.luccio@nyu.edu"><valerio.luccio@nyu.edu></a> wrote:


  
  
  
Strahil,

did some more digging into the heal info output.

Found also the following:

</pre>
          <blockquote type="cite">
            <pre class="yiv1321384791moz-quote-pre">  [...]
Brick hydra4:/gluster1/data
Status: Transport endpoint is not connected
Number of entries: -

Brick hydra3:/arbiter/2
<gfid:2aa223b0-77f5-441e-bc76-34c8d459eeaa> 
[...]

</pre>
          </blockquote>
          <pre class="yiv1321384791moz-quote-pre">So, to augment what I wrote before, the errors appear in "Brick hydra3:/gluster3/data" and "Brick hydra3:/arbiter/2", plus that "Transport endpoint is not connected" for "Brick hydra4:/gluster1/data".

I need to add that hydra4:/gluster1 is the RAID that had a major hardware failure.




  
-- 

As a result of Coronavirus-related precautions, NYU and the Center for Brain Imaging operations will be managed remotely until further notice. 
All telephone calls and e-mail correspondence are being monitored remotely during our normal business hours of 9am-5pm, Monday through Friday. 
  
For MRI scanner-related emergency, please contact: Keith Sanzenbach at  <a rel="nofollow noopener noreferrer" shape="rect" class="yiv1321384791moz-txt-link-abbreviated" ymailto="mailto:keith.sanzenbach@nyu.edu" target="_blank" href="mailto:keith.sanzenbach@nyu.edu">keith.sanzenbach@nyu.edu</a>  and/or Pablo Velasco at  <a rel="nofollow noopener noreferrer" shape="rect" class="yiv1321384791moz-txt-link-abbreviated" ymailto="mailto:pablo.velasco@nyu.edu" target="_blank" href="mailto:pablo.velasco@nyu.edu">pablo.velasco@nyu.edu</a> 
For computer/hardware/software emergency, please contact: Valerio Luccio at  <a rel="nofollow noopener noreferrer" shape="rect" class="yiv1321384791moz-txt-link-abbreviated" ymailto="mailto:valerio.luccio@nyu.edu" target="_blank" href="mailto:valerio.luccio@nyu.edu">valerio.luccio@nyu.edu</a> 
For TMS/EEG-related emergency, please contact: Chrysa Papadaniil at  <a rel="nofollow noopener noreferrer" shape="rect" class="yiv1321384791moz-txt-link-abbreviated" ymailto="mailto:chrysa@nyu.edu" target="_blank" href="mailto:chrysa@nyu.edu">chrysa@nyu.edu</a> 
For CBI-related administrative emergency, please contact: Jennifer Mangan at  <a rel="nofollow noopener noreferrer" shape="rect" class="yiv1321384791moz-txt-link-abbreviated" ymailto="mailto:jennifer.mangan@nyu.edu" target="_blank" href="mailto:jennifer.mangan@nyu.edu">jennifer.mangan@nyu.edu</a> 

  

Valerio Luccio     (212) 998-8736 
Center for Brain Imaging     4 Washington Place, Room 158 
New York University     New York, NY 10003 

  

</pre>
          <blockquote type="cite">
            <pre class="yiv1321384791moz-quote-pre">"In an open world, who needs windows or gates ?"
</pre>
          </blockquote>
          <pre class="yiv1321384791moz-quote-pre">  
  
  
  
  
</pre>
        </blockquote>
        <pre class="yiv1321384791moz-quote-pre">



</pre>
      </blockquote>
    
    <div class="yiv1321384791yqt9824108229" id="yiv1321384791yqtfd07585"><p><br clear="none">
    </p>
  </div></div></div> </div> </blockquote></div></div>