<div>I would start by reading the three blog posts from Ravi:</div><div><a id="linkextractor__1662185803164" data-yahoo-extracted-link="true" href="https://ravispeaks.wordpress.com/2019/04/05/glusterfs-afr-the-complete-guide/">https://ravispeaks.wordpress.com/2019/04/05/glusterfs-afr-the-complete-guide/</a></div><div><a id="linkextractor__1662185805433" data-yahoo-extracted-link="true" href="https://ravispeaks.wordpress.com/2019/04/15/gluster-afr-the-complete-guide-part-2/">https://ravispeaks.wordpress.com/2019/04/15/gluster-afr-the-complete-guide-part-2/</a></div><div><a id="linkextractor__1662185822140" data-yahoo-extracted-link="true" href="https://ravispeaks.wordpress.com/2019/05/14/gluster-afr-the-complete-guide-part-3/">https://ravispeaks.wordpress.com/2019/05/14/gluster-afr-the-complete-guide-part-3/</a></div><div><br></div><div><br></div>All pending heals are tracked as hard links created in .glusterfs/indices/xattrop<em>.</em><div><i>Check the attributes of the gfids there (one by one) for differences across the bricks; if they are the same, you can delete them from .glusterfs/indices/xattrop (the root entry must stay!). If not, the attributes can hint at what happened and which copy is the good one.</i></div><div><i><br></i></div><div><i>Best Regards,</i></div><div><i>Strahil Nikolov </i></div><div><div> <blockquote style="margin: 0 0 20px 0;"> <div style="font-family:Roboto, sans-serif; color:#6D00F6;"> <div>On Wed, Aug 31, 2022 at 15:30, Ilias Chasapakis forumZFD</div><div>&lt;chasapakis@forumZFD.de&gt; wrote:</div> </div> <div style="padding: 10px 0 0 20px; margin: 10px 0 0 0; border-left: 1px solid #6D00F6;"> <div id="yiv8110016555"><div>
<p>Hi all,</p>
<p>so we went further and deleted the entries (data and gfid). The
split brain is now gone, but when we triggered a heal again
(simple and full), many entries remained stuck in healing (no
split-brain items). They have been there for days or weeks and
still appear.</p>
<p>We would like to heal single files, but as they are not in split
brain, I guess this is not possible, right? The "source-brick"
technique works only in that case, I think?<br clear="none">
</p>
<p>A concrete example of one of the files that are stuck in the
healing queue: I checked the attributes with getfattr and saw that
one of the nodes has neither the data file nor the gfid file; the
entry is missing completely. How could I trigger a replication from the "good copy"
to the gluster node that does not have the file? Is it possible
for entries *not* in split brain? Listing the affected directory
on the mount side (ls) did not seem to trigger a
heal.<br clear="none">
</p>
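<p>When comparing getfattr output between bricks, it can help to decode the AFR changelog attributes rather than eyeballing hex strings. The sketch below assumes the usual layout of a trusted.afr.* value (12 bytes holding three big-endian 32-bit counters for pending data, metadata and entry operations); the sample value in the usage note is made up for illustration and does not come from this volume.</p>

```python
import struct


def decode_afr_changelog(hex_value):
    """Split a trusted.afr.* value (as printed by `getfattr -e hex`) into counters."""
    # Accept the value with or without the leading "0x" that getfattr prints.
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    # Three unsigned 32-bit integers in network (big-endian) byte order.
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def has_pending(hex_value):
    """True if any of the three counters is non-zero."""
    return any(decode_afr_changelog(hex_value).values())
```

<p>For the hypothetical value 0x000000020000000100000000 this reports two pending data operations and one pending metadata operation. A value of all zeroes means that brick holds no pending changelog against the brick named in the attribute.</p>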
<p>Also, the shd logs have some entries that are ambiguous to me. The sink
value is empty; shouldn't it be a number indicating that it is healing?</p>
<p>
</p><blockquote type="cite">[2022-08-28 17:22:11.098604 +0000] I
[MSGID: 108026] [afr-self-heal-common.c:1742:afr_log_selfheal]
0-vol-replicate-0: Completed metadata selfheal on
94503c97-7731-4aa1-8a14-2c6ea5a84a15. sources=1 [2] sinks=
<br clear="none">
[2022-08-28 17:22:16.227091 +0000] I [MSGID: 108026]
[afr-self-heal-common.c:1742:afr_log_selfheal]
0-gv-ho-replicate-0: Completed metadata selfheal on
94503c97-7731-4aa1-8a14-2c6ea5a84a15. sources=1 [2] sinks=
</blockquote>
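<p>To see at a glance how many of these "Completed ... selfheal" messages carry an empty sink list, the shd log lines can be parsed with a small script. This is only a sketch built around the two lines quoted above; the function name is hypothetical, and it extracts the brick indices as printed without interpreting the square brackets.</p>

```python
import re

# Matches the tail of an afr_log_selfheal message as quoted above.
SELFHEAL_RE = re.compile(
    r"Completed (\w+) selfheal on ([0-9a-f-]{36})\. sources=(.*?) sinks=(.*)$"
)


def parse_selfheal(line):
    """Return heal type, gfid and the brick indices listed as sources and sinks."""
    # Logs pasted from mail are often wrapped, so collapse all whitespace first.
    flat = " ".join(line.split())
    match = SELFHEAL_RE.search(flat)
    if match is None:
        return None
    kind, gfid, sources, sinks = match.groups()
    return {
        "kind": kind,
        "gfid": gfid,
        "sources": [int(n) for n in re.findall(r"\d+", sources)],
        "sinks": [int(n) for n in re.findall(r"\d+", sinks)],
    }
```

<p>Both quoted lines come back with sources 1 and 2 and an empty sinks list. Filtering the whole log for entries whose sinks list is empty would show whether the same gfids keep being "completed" without any brick being written to.</p>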
<p>I tried to follow the guide here:</p>
<p><a rel="nofollow noopener noreferrer" shape="rect" target="_blank" href="https://docs.gluster.org/en/main/Troubleshooting/troubleshooting-afr/#ii-self-heal-is-stuck-not-getting-completed" class="yiv8110016555moz-txt-link-freetext">https://docs.gluster.org/en/main/Troubleshooting/troubleshooting-afr/#ii-self-heal-is-stuck-not-getting-completed</a></p>
<p>but I find it difficult to apply.</p>
<p>Do you have any suggestions on how to "unblock" these stuck
entries, and what would be a methodical approach to troubleshooting this
situation?</p>
<p>Finally, I would like to ask whether updating the gluster nodes
(we have pending updates now) would be too risky without
first fixing the unhealed entries. Our hope is that an update
could eventually fix the problems.</p>
<p>Best regards.<br clear="none">
Ilias<br clear="none">
</p>
<p><br clear="none">
</p>
<div class="yiv8110016555moz-cite-prefix">On 18.08.22 at 23:38, Strahil
Nikolov wrote:<br clear="none">
</div>
<blockquote type="cite">
</blockquote></div><div>
If you refer to
/&lt;path_to_brick&gt;/.glusterfs/&lt;gfid_first_2_characters&gt;/&lt;gfid_second_2_characters&gt;/gfid
- it's a hard link to the file on the brick.
<div>Directories in .glusterfs are just symbolic links.</div>
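<div>As a small illustration of that layout, the path of a gfid entry can be derived from the gfid string alone. This is a sketch only: the function names are hypothetical, and the brick path in the usage note is a placeholder.</div>

```python
import os


def gfid_backend_path(brick_root, gfid):
    """Build brick/.glusterfs/XX/YY/gfid from the first two character pairs."""
    gfid = gfid.lower()
    return os.path.join(brick_root, ".glusterfs", gfid[0:2], gfid[2:4], gfid)


def describe_gfid_entry(brick_root, gfid):
    """Report whether the entry is a symlink (directory) or a hard link (file)."""
    path = gfid_backend_path(brick_root, gfid)
    if os.path.islink(path):
        return "symlink"
    if os.path.isfile(path):
        # A healthy file entry normally shares its inode with the data file,
        # so its link count is at least 2.
        return "hard link, nlink=%d" % os.stat(path).st_nlink
    return "missing"
```

<div>For a brick at /bricks/b1 (placeholder) and the gfid 94503c97-7731-4aa1-8a14-2c6ea5a84a15 from the logs, this yields /bricks/b1/.glusterfs/94/50/94503c97-7731-4aa1-8a14-2c6ea5a84a15.</div>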
<div><br clear="none">
</div>
<div>Can you clarify what you are planning to delete?</div>
<div><br clear="none">
</div>
<div>Best Regards,</div>
<div>Strahil Nikolov </div>
<div> <br clear="none">
<blockquote style="margin:0 0 20px 0;">
<div style="font-family:Roboto, sans-serif;color:#6D00F6;">
<div>On Wed, Aug 17, 2022 at 14:35, Ilias Chasapakis
forumZFD</div>
<div><a rel="nofollow noopener noreferrer" shape="rect" ymailto="mailto:chasapakis@forumZFD.de" target="_blank" href="mailto:chasapakis@forumZFD.de" class="yiv8110016555moz-txt-link-rfc2396E">&lt;chasapakis@forumZFD.de&gt;</a> wrote:</div>
</div>
<div style="padding:10px 0 0 20px;margin:10px 0 0 0;border-left:1px solid #6D00F6;">
<div id="yiv8110016555">
<div>
<p>Hi Thomas,</p>
<p>Thanks again for your replies and patience :)</p>
<p>We also have offline backups of the files.<br clear="none">
</p>
<p>So, just to verify I understood this correctly,
deletion of a .glusterfs-gfid file doesn't inherently
include the risk of the loss of the complete brick,
right?</p>
<p>I saw you already applied this for your purposes so
it worked for you... But just as a confirmation. Of
course it is fully understood that the operational
risk is on our side.<br clear="none">
</p>
<p>It is just an "information-wise" question :)<br clear="none">
</p>
<p>Best regards<br clear="none">
Ilias<br clear="none">
</p>
<div class="yiv8110016555moz-cite-prefix">On 17.08.22 at
12:47, Thomas Bätzler wrote:<br clear="none">
</div>
<blockquote type="cite">
<div class="yiv8110016555WordSection1">
<p class="yiv8110016555MsoNormal"><span lang="EN-US">Hello
Ilias,<br clear="none">
<br clear="none">
Please note that you can and should back up all
of the file(s) involved in the split-brain by
accessing them over the brick root instead of
the gluster mount. That is also the reason why
you’re not in danger of a failure cascade wiping
out your data.</span></p>
<p class="yiv8110016555MsoNormal"><span lang="EN-US">
</span></p>
<p class="yiv8110016555MsoNormal"><span lang="EN-US">Be
careful when replacing bricks, though. You want
that heal to go in the right direction </span><span lang="EN-US" style="font-family:UI sans-serif;">😉</span><span lang="EN-US"></span></p>
<p class="yiv8110016555MsoNormal"><span lang="EN-US">
</span></p>
<div>
<p class="yiv8110016555MsoNormal">Kind regards,</p>
<p class="yiv8110016555MsoNormal">p.p. Thomas
Bätzler</p>
<p class="yiv8110016555MsoNormal">-- </p>
<p class="yiv8110016555MsoNormal">BRINGE
Informationstechnik GmbH</p>
<p class="yiv8110016555MsoNormal">Zur Seeplatte 12</p>
<p class="yiv8110016555MsoNormal">D-76228
Karlsruhe</p>
<p class="yiv8110016555MsoNormal">Germany</p>
<p class="yiv8110016555MsoNormal"> </p>
<p class="yiv8110016555MsoNormal">Phone: +49 721
94246-0</p>
<p class="yiv8110016555MsoNormal">Phone: +49 171
5438457</p>
<p class="yiv8110016555MsoNormal">Fax: +49 721
94246-66</p>
<p class="yiv8110016555MsoNormal">Web: <a rel="nofollow noopener noreferrer" shape="rect" target="_blank" href="http://www.bringe.de/"><span style="color:#0563C1;">http://www.bringe.de/</span></a></p>
<p class="yiv8110016555MsoNormal"> </p>
<p class="yiv8110016555MsoNormal">Managing Director:
Dipl.-Ing. (FH) Martin Bringe</p>
<p class="yiv8110016555MsoNormal">VAT ID:
DE812936645, HRB 108943 Mannheim</p>
</div>
<p class="yiv8110016555MsoNormal"> </p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm;">
<p class="yiv8110016555MsoNormal"><b>From:</b>
Gluster-users <a rel="nofollow noopener noreferrer" shape="rect" ymailto="mailto:gluster-users-bounces@gluster.org" target="_blank" href="mailto:gluster-users-bounces@gluster.org" class="yiv8110016555moz-txt-link-rfc2396E">&lt;gluster-users-bounces@gluster.org&gt;</a>
<b>On behalf of </b>Ilias Chasapakis
forumZFD<br clear="none">
<b>Sent:</b> Wednesday, 17 August 2022
11:18<br clear="none">
<b>To:</b> <a rel="nofollow noopener noreferrer" shape="rect" ymailto="mailto:gluster-users@gluster.org" target="_blank" href="mailto:gluster-users@gluster.org" class="yiv8110016555moz-txt-link-abbreviated yiv8110016555moz-txt-link-freetext">gluster-users@gluster.org</a><br clear="none">
<b>Subject:</b> Re: [Gluster-users] Directory
in split brain does not heal - Gfs 9.2</p>
</div>
</div>
<p class="yiv8110016555MsoNormal"> </p>
<p>Thanks for the suggestions. My question is whether the
risk is limited to losing the
file/dir, or whether it extends to creating inconsistencies that
span the bricks and "break everything".<br clear="none">
Of course we have to take action anyway so that this
does not spread (we now already have a second
entry that developed an "unhealable" directory
split-brain), so it is just a question of
evaluation before acting.</p>
<div>
<p class="yiv8110016555MsoNormal">On 12.08.22 at
18:12, Thomas Bätzler wrote:</p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt;">
<div>
<p class="yiv8110016555MsoNormal">On 12.08.2022
at 17:12, Ilias Chasapakis forumZFD wrote:</p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt;">
<p>Dear fellow gluster users,</p>
<p>we are facing a problem with our replica 3
setup. Glusterfs version is 9.2.</p>
<p>We have a problem with a directory that is in
split-brain and we cannot manage to heal with:</p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt;">
<p>gluster volume heal gfsVol split-brain
latest-mtime /folder</p>
</blockquote>
<p>The command throws the following error:
"failed:Transport endpoint is not connected."
</p>
<p>So the split-brain directory entry remains,
and so the whole healing process does not
complete and other entries get stuck.</p>
<p>I saw there is a Python script available at <a rel="nofollow noopener noreferrer" shape="rect" target="_blank" href="https://github.com/joejulian/glusterfs-splitbrain" class="yiv8110016555moz-txt-link-freetext yiv8110016555moz-txt-link-freetext">https://github.com/joejulian/glusterfs-splitbrain</a>.
Would that be a good solution to try? To be
honest we are a bit concerned with deleting
the gfid and the files from the brick manually
as it seems it can create inconsistencies and
break things... I can of course give you more
information about our setup and situation, but
if you already have some tip, that would be
fantastic.</p>
</blockquote>
<p>You could at least verify what's going on: Go
to your brick roots and list /folder from each.
You have 3n bricks with n replica sets. Find the
replica set where you can spot a difference.
It's most likely a file or directory that's
missing or different. If it's a file, run ls
-ain on the file on each brick in the replica
set; it reports the file's inode number. Then run find
.glusterfs -inum with that inode number from the brick root. You'll
likely see that you have different gfid-files.</p>
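<p>The inode lookup described above can also be scripted so that it runs the same way on every brick. The following is a rough sketch, not a tested tool: the function name is hypothetical, it has to run on the brick host with read access to the brick, and it walks the whole .glusterfs tree, which can take a while on large bricks.</p>

```python
import os


def find_gfid_links(brick_root, rel_path):
    """List entries under .glusterfs that share an inode with the given file."""
    target = os.stat(os.path.join(brick_root, rel_path.lstrip("/")))
    hits = []
    # os.walk does not follow symlinks by default, so directory symlinks
    # inside .glusterfs are not descended into.
    for dirpath, _dirnames, filenames in os.walk(os.path.join(brick_root, ".glusterfs")):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if st.st_ino == target.st_ino and st.st_dev == target.st_dev:
                hits.append(os.path.relpath(full, brick_root))
    return sorted(hits)
```

<p>Running it with the same relative path on each brick of the replica set and comparing the returned gfid file names shows whether the bricks disagree about the gfid.</p>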
<p>To fix the problem, you have to help gluster
along by cleaning up the mess. This is
completely "do it at your own risk, it worked
for me, ymmv": make a copy with cp (not mv!) of the file
you want to keep. On each brick in the
replica-set, delete the gfid-file and the
datafile. Try a heal on the volume and verify
that you can access the path in question using
the glusterfs mount. Copy back your salvaged
file using the glusterfs mount.</p>
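<p>For the first step (saving a copy before anything is deleted), a sketch like the one below copies whatever each brick still holds into a separate directory so the versions can be compared afterwards. The function name and the naming scheme for the saved copies are assumptions for illustration; it only copies and never deletes.</p>

```python
import os
import shutil


def backup_brick_copies(brick_roots, rel_path, dest_dir):
    """Copy the file from every brick that has it into dest_dir; return saved paths."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for index, root in enumerate(brick_roots):
        src = os.path.join(root, rel_path.lstrip("/"))
        if not os.path.isfile(src):
            continue  # this brick has no copy of the file
        # Prefix with the brick index so copies from different bricks do not collide.
        dst = os.path.join(dest_dir, "brick%d_%s" % (index, os.path.basename(rel_path)))
        shutil.copy2(src, dst)  # copy2 also keeps timestamps where possible
        saved.append(dst)
    return saved
```

<p>The destination should be outside the bricks and outside the gluster mount, so the saved copies are not affected by the later cleanup or heal.</p>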
<p>We had this happening quite often on a heavily
loaded glusterfs shared filesystem that held a
mail-spool. There would be parallel accesses
trying to mv files and sometimes we'd end up
with mismatched data on the bricks of the
replica set. I've reported this on github, but
apparently it wasn't seen as a serious problem.
We've moved on to ceph FS now. That sure has
bugs, too, but hopefully not as aggravating.</p>
<pre>Kind regards,</pre>
<pre>p.p. Thomas Bätzler</pre>
<p class="yiv8110016555MsoNormal"><br clear="none">
<br clear="none">
</p>
<pre>________</pre>
<pre> </pre>
<pre> </pre>
<pre> </pre>
<pre>Community Meeting Calendar:</pre>
<pre> </pre>
<pre>Schedule -</pre>
<pre>Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC</pre>
<pre>Bridge: <a rel="nofollow noopener noreferrer" shape="rect" target="_blank" href="https://meet.google.com/cpu-eiue-hvk" class="yiv8110016555moz-txt-link-freetext yiv8110016555moz-txt-link-freetext">https://meet.google.com/cpu-eiue-hvk</a></pre>
<pre>Gluster-users mailing list</pre>
<pre><a rel="nofollow noopener noreferrer" shape="rect" ymailto="mailto:Gluster-users@gluster.org" target="_blank" href="mailto:Gluster-users@gluster.org" class="yiv8110016555moz-txt-link-freetext yiv8110016555moz-txt-link-freetext">Gluster-users@gluster.org</a></pre>
<pre><a rel="nofollow noopener noreferrer" shape="rect" target="_blank" href="https://lists.gluster.org/mailman/listinfo/gluster-users" class="yiv8110016555moz-txt-link-freetext yiv8110016555moz-txt-link-freetext">https://lists.gluster.org/mailman/listinfo/gluster-users</a></pre>
</blockquote>
<pre>-- </pre>
<pre><span style="font-family:sans-serif;"></span>forumZFD</pre>
<pre>Entschieden für Frieden | Committed to Peace</pre>
<pre> </pre>
<pre>Ilias Chasapakis</pre>
<pre>Referent IT | IT Consultant</pre>
<pre> </pre>
<pre>Forum Ziviler Friedensdienst e.V. | Forum Civil Peace Service</pre>
<pre>Am Kölner Brett 8 | 50825 Köln | Germany</pre>
<pre> </pre>
<pre>Tel 0221 91273243 | Fax 0221 91273299 | <a rel="nofollow noopener noreferrer" shape="rect" target="_blank" href="http://www.forumZFD.de">http://www.forumZFD.de</a></pre>
<pre> </pre>
<pre>Vorstand nach § 26 BGB, einzelvertretungsberechtigt | Executive Board:</pre>
<pre>Oliver Knabe (Vorsitz | Chair), Jens von Bargen, Alexander Mauz</pre>
<pre>VR 17651 Amtsgericht Köln</pre>
<pre> </pre>
<pre>Spenden | Donations: IBAN DE37 3702 0500 0008 2401 01 BIC BFSWDE33XXX</pre>
</div>
</blockquote>
</div><div id="yiv8110016555yqtfd64951" class="yiv8110016555yqt9575786435">
</div></div><div id="yiv8110016555yqtfd44419" class="yiv8110016555yqt9575786435">
</div></div><div id="yiv8110016555yqtfd15573" class="yiv8110016555yqt9575786435">
</div></blockquote><div id="yiv8110016555yqtfd40299" class="yiv8110016555yqt9575786435">
</div></div><div id="yiv8110016555yqtfd26510" class="yiv8110016555yqt9575786435">
</div></div></div> </div> </blockquote></div></div>