I'm not sure the md5sums have to match, but at least the contents should.
In modern versions of GlusterFS client-side healing is disabled by default, but it's worth trying.
You will need to enable cluster.metadata-self-heal, cluster.data-self-heal and cluster.entry-self-heal, and then build a small one-liner that extracts the names of the affected files/dirs from "gluster volume heal <VOL> info", so you can stat them through the FUSE mount.
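For example, the three options can be switched on per volume with something like this (substitute your actual volume name for <VOL>):

# enable the client-side self-heal options on the volume
gluster volume set <VOL> cluster.metadata-self-heal on
gluster volume set <VOL> cluster.data-self-heal on
gluster volume set <VOL> cluster.entry-self-heal on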
Something like this:

for i in $(gluster volume heal <VOL> info | awk -F '<gfid:|>' '/gfid:/ {print $2}'); do find /PATH/TO/BRICK/ -samefile /PATH/TO/BRICK/.glusterfs/${i:0:2}/${i:2:2}/$i | awk '!/.glusterfs/ {gsub("/PATH/TO/BRICK", "stat /MY/FUSE/MOUNTPOINT", $0); print $0}' ; done

Then just copy-paste the output and you will trigger the client-side heal only on the affected gfids.

Best Regards,
Strahil Nikolov

On Monday, February 6, 2023 at 10:19:02 GMT+2, Diego Zuccato <diego.zuccato@unibo.it> wrote:

Oops... Re-including the list that got excluded from my previous answer :(

I generated md5sums of all files in vols/ on clustor02 and compared them to
the other nodes (clustor00 and clustor01).
There are differences in the volfiles (shouldn't it always be 1, since every
data brick is on its own fs? Quorum bricks, OTOH, share a single
partition on SSD and should always be 15, but in both cases sometimes
it's 0).
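For reference, one way to run such a comparison from a single node (just a sketch; it assumes password-less ssh to the other nodes and checksums /var/lib/glusterd/vols/ recursively):

# collect per-node checksums of the vol files, then diff them pairwise
for h in clustor00 clustor01 clustor02; do
    ssh "$h" 'cd /var/lib/glusterd/vols && find . -type f -exec md5sum {} + | sort -k2' > /tmp/volsums.$h
done
diff /tmp/volsums.clustor00 /tmp/volsums.clustor01
diff /tmp/volsums.clustor00 /tmp/volsums.clustor02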
I nearly got a stroke when I saw diff output for the 'info' files, but once
I sorted them their contents matched. Phew!

Diego

On 03/02/2023 19:01, Strahil Nikolov wrote:
> This one doesn't look good:
>
> [2023-02-03 07:45:46.896924 +0000] E [MSGID: 114079]
> [client-handshake.c:1253:client_query_portmap] 0-cluster_data-client-48:
> remote-subvolume not set in volfile []
>
> Can you compare all the vol files in /var/lib/glusterd/vols/ between the nodes?
> I suspect there is a vol file mismatch (maybe
> /var/lib/glusterd/vols/<VOLUME_NAME>/*-shd.vol).
>
> Best Regards,
> Strahil Nikolov
>
>     On Fri, Feb 3, 2023 at 12:20, Diego Zuccato
>     <diego.zuccato@unibo.it> wrote:
>     I can't see anything relevant in the glfsheal log, just messages related to
>     the crash of one of the nodes (the one that had the mobo replaced... I
>     fear some on-disk structures could have been silently damaged by RAM
>     errors and that makes the gluster processes crash, or it's just an issue
>     with enabling brick-multiplex).
>     -8<--
>     [2023-02-03 07:45:46.896924 +0000] E [MSGID: 114079]
>     [client-handshake.c:1253:client_query_portmap]
>     0-cluster_data-client-48:
>     remote-subvolume not set in volfile []
>     [2023-02-03 07:45:46.897282 +0000] E
>     [rpc-clnt.c:331:saved_frames_unwind] (-->
>     /lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x195)[0x7fce0c867b95]
>     (--> /lib/x86_64-linux-gnu/libgfrpc.so.0(+0x72fc)[0x7fce0c0ca2fc] (-->
>     /lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x109)[0x7fce0c0d2419]
>     (--> /lib/x86_64-linux-gnu/libgfrpc.so.0(+0x10308)[0x7fce0c0d3308] (-->
>     /lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x26)[0x7fce0c0ce7e6]
>     ))))) 0-cluster_data-client-48: forced unwinding frame type(GF-DUMP)
>     op(NULL(2)) called at 2023-02-03 07:45:46.891054 +0000 (xid=0x13)
>     -8<--
>
>     Well, actually I *KNOW* the files outside .glusterfs have been deleted
>     (by me :) ). That's why I call those 'stale' gfids.
>     Affected entries under .glusterfs usually have link count = 1, so there is
>     nothing 'find' can find.
>     Since I already recovered those files (before deleting them from the bricks),
>     can the .glusterfs entries be deleted too, or should I check something else?
>     Maybe I should create a script that finds all files/dirs (not symlinks,
>     IIUC) in .glusterfs on all bricks/arbiters and moves them to a temp dir?
>
>     Diego
>
>     On 02/02/2023 23:35, Strahil Nikolov wrote:
>      > Any issues reported in /var/log/glusterfs/glfsheal-*.log ?
>      >
>      > The easiest way to identify the affected entries is to run:
>      > find /FULL/PATH/TO/BRICK/ -samefile
>      > /FULL/PATH/TO/BRICK/.glusterfs/57/e4/57e428c7-6bed-4eb3-b9bd-02ca4c46657a
>      >
>      > Best Regards,
>      > Strahil Nikolov
>      >
>      > On Tuesday, January 31, 2023 at 11:58:24 GMT+2, Diego Zuccato
>      > <diego.zuccato@unibo.it> wrote:
>      >
>      > Hello all.
>      >
>      > I've had one of the 3 nodes serving a "replica 3 arbiter 1" volume down for
>      > some days (apparently RAM issues, but actually a failing mobo).
>      > The other nodes have had some issues (RAM exhaustion, an old problem
>      > already ticketed but still without a solution) and some brick processes
>      > coredumped. Restarting the processes allowed the cluster to continue
>      > working. Mostly.
>      >
>      > After the third server got fixed I started a heal, but the files didn't get
>      > healed and the count (by "ls -l
>      > /srv/bricks/*/d/.glusterfs/indices/xattrop/ | grep ^- | wc -l") did not
>      > decrease over 2 days. So, to recover, I copied the files from the bricks to
>      > temp storage (keeping both copies of conflicting files with different
>      > contents), removed the files on the bricks and arbiters, and finally copied
>      > back from temp storage to the volume.
>      >
>      > Now the files are accessible, but I still see lots of entries like
>      > <gfid:57e428c7-6bed-4eb3-b9bd-02ca4c46657a>
>      >
>      > IIUC that's due to a mismatch between the .glusterfs/ contents and the normal
>      > hierarchy. Is there some tool to speed up the cleanup?
>      >
>      > Tks.
>      >
>      > --
>      > Diego Zuccato
>      > DIFA - Dip. di Fisica e Astronomia
>      > Servizi Informatici
>      > Alma Mater Studiorum - Università di Bologna
>      > V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
>      > tel.: +39 051 20 95786
>      > ________
>      >
>      > Community Meeting Calendar:
>      >
>      > Schedule -
>      > Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>      > Bridge: https://meet.google.com/cpu-eiue-hvk
>      > Gluster-users mailing list
>      > Gluster-users@gluster.org
>      > https://lists.gluster.org/mailman/listinfo/gluster-users
>
>     --
>     Diego Zuccato
>     DIFA - Dip. di Fisica e Astronomia
>     Servizi Informatici
>     Alma Mater Studiorum - Università di Bologna
>     V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
>     tel.: +39 051 20 95786
>

--
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786