<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div class="moz-cite-prefix">Strahil Nikolov:<br>
</div>
<blockquote type="cite"
cite="mid:2082346144.1499058.1691692399062@mail.yahoo.com">
I’ve never had such a situation and I don’t recall anyone sharing
something similar.</blockquote>
<br>
<br>
That's strange; it is really easy to reproduce. This is from a fresh
test environment:<br>
<br>
Summary:<br>
- There is one snapshot present.<br>
- On one node, glusterd is stopped.<br>
- While it is stopped, that snapshot is deleted.<br>
- The node is brought up again.<br>
- On that node there is now an orphaned snapshot.<br>
<br>
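The same sequence, condensed to plain shell (the full transcript with output follows below):<br>
<pre>
# on node 3
systemctl stop glusterd.service

# on node 1 (answer the y/n prompts with y)
gluster snapshot deactivate snaps_GMT-2023.08.15-13.05.28
gluster snapshot delete snaps_GMT-2023.08.15-13.05.28

# on node 3
systemctl start glusterd.service
gluster snapshot list     # still lists snaps_GMT-2023.08.15-13.05.28
</pre>
<br>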
<br>
detailed version:<br>
# on node 1:<br>
root@gl1:~# cat /etc/debian_version<br>
11.7<br>
<br>
root@gl1:~# gluster --version<br>
glusterfs 10.4<br>
<br>
root@gl1:~# gluster volume info<br>
Volume Name: glvol_samba<br>
Type: Replicate<br>
Volume ID: 91cb059e-10e4-4439-92ea-001065652749<br>
Status: Started<br>
Snapshot Count: 1<br>
Number of Bricks: 1 x 3 = 3<br>
Transport-type: tcp<br>
Bricks:<br>
Brick1: gl1:/data/glusterfs/glvol_samba/brick0/brick<br>
Brick2: gl2:/data/glusterfs/glvol_samba/brick0/brick<br>
Brick3: gl3:/data/glusterfs/glvol_samba/brick0/brick<br>
Options Reconfigured:<br>
cluster.granular-entry-heal: on<br>
storage.fips-mode-rchecksum: on<br>
transport.address-family: inet<br>
nfs.disable: on<br>
performance.client-io-threads: off<br>
features.barrier: disable<br>
<br>
root@gl1:~# gluster snapshot list<br>
snaps_GMT-2023.08.15-13.05.28<br>
<br>
<br>
<br>
# on node 3:<br>
root@gl3:~# systemctl stop glusterd.service<br>
<br>
<br>
<br>
# on node 1:<br>
root@gl1:~# gluster snapshot deactivate
snaps_GMT-2023.08.15-13.05.28<br>
Deactivating snap will make its data inaccessible. Do you want to
continue? (y/n) y<br>
Snapshot deactivate: snaps_GMT-2023.08.15-13.05.28: Snap deactivated
successfully<br>
<br>
root@gl1:~# gluster snapshot delete snaps_GMT-2023.08.15-13.05.28<br>
Deleting snap will erase all the information about the snap. Do you
still want to continue? (y/n) y<br>
snapshot delete: snaps_GMT-2023.08.15-13.05.28: snap removed
successfully<br>
<br>
root@gl1:~# gluster snapshot list<br>
No snapshots present<br>
<br>
<br>
<br>
# on node 3:<br>
root@gl3:~# systemctl start glusterd.service<br>
<br>
root@gl3:~# gluster snapshot list<br>
snaps_GMT-2023.08.15-13.05.28<br>
<br>
root@gl3:~# gluster snapshot deactivate
snaps_GMT-2023.08.15-13.05.28<br>
Deactivating snap will make its data inaccessible. Do you want to
continue? (y/n) y<br>
snapshot deactivate: failed: Pre Validation failed on gl1.ad.arc.de.
Snapshot (snaps_GMT-2023.08.15-13.05.28) does not exist.<br>
Pre Validation failed on gl2. Snapshot
(snaps_GMT-2023.08.15-13.05.28) does not exist.<br>
Snapshot command failed<br>
<br>
root@gl3:~# lvs -a<br>
<pre>
  LV                                 VG        Attr       LSize  Pool      Origin    Data%  Meta%  Move Log Cpy%Sync Convert
  669cbc14fa7542acafb2995666284583_0 vg_brick0 Vwi-aotz-- 15,00g tp_brick0 lv_brick0 0,08
  lv_brick0                          vg_brick0 Vwi-aotz-- 15,00g tp_brick0           0,08
  [lvol0_pmspare]                    vg_brick0 ewi------- 20,00m
  tp_brick0                          vg_brick0 twi-aotz-- 18,00g                     0,12   10,57
  [tp_brick0_tdata]                  vg_brick0 Twi-ao---- 18,00g
  [tp_brick0_tmeta]                  vg_brick0 ewi-ao---- 20,00m
</pre>
<br>
<br>
<br>
<br>
Would it be dangerous to just delete the following items on node 3 while
glusterd is down:<br>
- the orphaned directories in /var/lib/glusterd/snaps/<br>
- the orphaned LV, here 669cbc14fa7542acafb2995666284583_0<br>
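Concretely, I mean something like this (untested sketch; the snapshot and LV names are taken from the output above):<br>
<pre>
# on node 3, untested sketch
systemctl stop glusterd.service

# if the snapshot brick is still mounted under /run/gluster/snaps/, unmount it first
# umount /run/gluster/snaps/*/brick*        # only if such a mount exists

# remove the orphaned snapshot metadata
rm -rf /var/lib/glusterd/snaps/snaps_GMT-2023.08.15-13.05.28

# remove the orphaned thin snapshot LV (name from the lvs output above)
lvremove vg_brick0/669cbc14fa7542acafb2995666284583_0

systemctl start glusterd.service
</pre>
<br>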
<br>
Or is there a self-heal command?<br>
<br>
Regards<br>
Sebastian<br>
<br>
<div class="moz-cite-prefix">Am 10.08.2023 um 20:33 schrieb Strahil
Nikolov:<br>
</div>
<blockquote type="cite"
cite="mid:2082346144.1499058.1691692399062@mail.yahoo.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<!--[if gte mso 9]><xml><o:OfficeDocumentSettings><o:AllowPNG/><o:PixelsPerInch>96</o:PixelsPerInch></o:OfficeDocumentSettings></xml><![endif]-->
I’ve never had such a situation and I don’t recall anyone sharing
something similar.
<div><span style="-webkit-text-size-adjust: auto;"><br>
</span></div>
<div><span style="-webkit-text-size-adjust: auto;">Most probably
it’s easier to remove the node from the TSP and re-add it.</span></div>
<div>
<div>Of course, test the case in VMs just to validate that it’s
possible to add a node to a cluster with snapshots.</div>
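<div>Roughly something like this (untested sketch with placeholder names, just to illustrate the detach/re-add idea):</div>
<div>
<pre>
# untested sketch with placeholder names (VOLNAME, BADNODE, /path/to/brick) -- adapt to the actual setup
# run the gluster commands from a healthy node

# 1. drop the bad node's brick from the replica set (replica 3 -> 2)
gluster volume remove-brick VOLNAME replica 2 BADNODE:/path/to/brick force

# 2. remove the node from the trusted storage pool
gluster peer detach BADNODE

# 3. on the bad node: clean /var/lib/glusterd and the old brick directory (xattrs!),
#    restart glusterd, then re-add it from a healthy node
gluster peer probe BADNODE
gluster volume add-brick VOLNAME replica 3 BADNODE:/path/to/brick

# 4. trigger a full heal
gluster volume heal VOLNAME full
</pre>
</div>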
<div><br>
</div>
<div>I have a vague feeling that you will need to delete all
snapshots.</div>
<div>
<div><br>
</div>
<div>Best Regards,</div>
<div>Strahil Nikolov </div>
<div><br>
<p class="yahoo-quoted-begin"
style="font-size: 15px; color: #715FFA; padding-top: 15px; margin-top: 0">On
Thursday, August 10, 2023, 4:36 AM, Sebastian Neustein
<a class="moz-txt-link-rfc2396E" href="mailto:sebastian.neustein@arc-aachen.de"><sebastian.neustein@arc-aachen.de></a> wrote:</p>
<blockquote class="iosymail">
<div id="yiv5705540302">
<div>
<div
class="yiv5705540302c-message_kit__blocks yiv5705540302c-message_kit__blocks--rich_text">
<div
class="yiv5705540302c-message__message_blocks yiv5705540302c-message__message_blocks--rich_text">
<div class="yiv5705540302p-block_kit_renderer">
<div
class="yiv5705540302p-block_kit_renderer__block_wrapper yiv5705540302p-block_kit_renderer__block_wrapper--first">
<div class="yiv5705540302p-rich_text_block">
<div
class="yiv5705540302p-rich_text_section">Hi<br>
<br>
After an outage of one node and bringing
it back up, the node has some orphaned
snapshots, which have already been
deleted on the other nodes.<br>
<br>
<span class="yiv5705540302c-mrkdwn__br"></span>How
can I delete these orphaned snapshots?
Trying the normal way produces these
errors:<br>
<code class="yiv5705540302c-mrkdwn__code">[2023-08-08
19:34:03.667109 +0000] E [MSGID: 106115]
[glusterd-mgmt.c:118:gd_mgmt_v3_collate_errors] 0-management: Pre
Validation failed on B742. Please check
log file for details.</code><br>
<code class="yiv5705540302c-mrkdwn__code">[2023-08-08
19:34:03.667184 +0000] E [MSGID: 106115]
[glusterd-mgmt.c:118:gd_mgmt_v3_collate_errors] 0-management: Pre
Validation failed on B741. Please check
log file for details.</code><br>
<code class="yiv5705540302c-mrkdwn__code">[2023-08-08
19:34:03.667210 +0000] E [MSGID: 106121]
[glusterd-mgmt.c:1083:glusterd_mgmt_v3_pre_validate] 0-management: Pre
Validation failed on peers</code><br>
<code class="yiv5705540302c-mrkdwn__code">[2023-08-08
19:34:03.667236 +0000] E [MSGID: 106121]
[glusterd-mgmt.c:2875:glusterd_mgmt_v3_initiate_snap_phases]
0-management: Pre Validation Failed</code><span
class="yiv5705540302c-mrkdwn__br"></span><br>
<br>
Even worse: I followed <a
rel="nofollow noopener noreferrer"
target="_blank"
href="https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/html/administration_guide/troubleshooting_snapshots"
class="yiv5705540302c-link"
moz-do-not-send="true">read hat gluster
snapshot trouble guide</a> and deleted
one of those directories defining a
snapshot. Now I receive this on the cli:<br>
<code class="yiv5705540302c-mrkdwn__code">run-gluster-snaps-e4dcd4166538414c849fa91b0b3934d7-brick6-brick[297342]:
[2023-08-09 08:59:41.107243 +0000] M
[MSGID: 113075]
[posix-helpers.c:2161:posix_health_check_thread_proc]
0-e4dcd4166538414c849fa91b0b3934d7-posix: health-check failed, going
down</code><br>
<code class="yiv5705540302c-mrkdwn__code">run-gluster-snaps-e4dcd4166538414c849fa91b0b3934d7-brick6-brick[297342]:
[2023-08-09 08:59:41.107243 +0000] M
[MSGID: 113075]
[posix-helpers.c:2161:posix_health_check_thread_proc]
0-e4dcd4166538414c849fa91b0b3934d7-posix: health-check failed, going
down</code><br>
<code class="yiv5705540302c-mrkdwn__code">run-gluster-snaps-e4dcd4166538414c849fa91b0b3934d7-brick6-brick[297342]:
[2023-08-09 08:59:41.107292 +0000] M
[MSGID: 113075]
[posix-helpers.c:2179:posix_health_check_thread_proc]
0-e4dcd4166538414c849fa91b0b3934d7-posix: still alive! -> SIGTERM</code><br>
<code class="yiv5705540302c-mrkdwn__code">run-gluster-snaps-e4dcd4166538414c849fa91b0b3934d7-brick6-brick[297342]:
[2023-08-09 08:59:41.107292 +0000] M
[MSGID: 113075]
[posix-helpers.c:2179:posix_health_check_thread_proc]
0-e4dcd4166538414c849fa91b0b3934d7-posix: still alive! -> SIGTERM</code><span
class="yiv5705540302c-mrkdwn__br"></span><br>
<br>
What are my options? <br>
- is there an easy way to remove all those
snapshots?<br>
- or would it be easier to remove and
rejoin the node to the gluster cluster?<br>
<br>
Thank you for any help!<br>
<br>
Seb<br>
<span dir="ltr"
class="yiv5705540302c-message__edited_label"></span></div>
</div>
</div>
</div>
</div>
</div>
<pre class="yiv5705540302moz-signature">--
Sebastian Neustein
Airport Research Center GmbH
Bismarckstraße 61
52066 Aachen
Germany
Phone: +49 241 16843-23
Fax: +49 241 16843-19
e-mail: <a rel="nofollow noopener noreferrer"
ymailto="mailto:sebastian.neustein@arc-aachen.de"
target="_blank"
href="mailto:sebastian.neustein@arc-aachen.de"
class="yiv5705540302moz-txt-link-abbreviated moz-txt-link-freetext"
moz-do-not-send="true">sebastian.neustein@arc-aachen.de</a>
Website: <a rel="nofollow noopener noreferrer" target="_blank"
href="http://www.airport-consultants.com"
class="yiv5705540302moz-txt-link-freetext moz-txt-link-freetext"
moz-do-not-send="true">http://www.airport-consultants.com</a>
Register Court: Amtsgericht Aachen HRB 7313
Ust-Id-No.: DE196450052
Managing Director:
Dipl.-Ing. Tom Alexander Heuer</pre>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
<br>
</body>
</html>