[Gluster-users] Trying to fix files that don't want to heal

Gudrun Mareike Amedick g.amedick at uni-luebeck.de
Tue Dec 10 11:02:38 UTC 2019


Hi,

so I ran gfid_needing_heal_parallel_new.sh. On the first try, I got a whole bunch of "ssh_exchange_identification: Connection closed by remote host"
errors; increasing MaxStartups in sshd_config fixed that. The results were the same in both runs though, so that might have been a cosmetic problem.
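
For reference, a minimal sketch for checking which MaxStartups value an sshd_config-style file sets. The function name max_startups is made up for illustration, and it only looks at plain "MaxStartups <value>" lines; Include and Match handling is ignored, so "sshd -T" (run as root) remains the better source for the effective setting.

```shell
# Print the first uncommented MaxStartups value found in the given file.
# Assumption: a plain "MaxStartups <value>" line; Include/Match are not followed.
max_startups() {
    awk 'tolower($1) == "maxstartups" { print $2; exit }' "$1"
}
```

Calling "max_startups /etc/ssh/sshd_config" prints the configured value, or nothing if the keyword is only present as a comment. The value uses the start:rate:full form described in sshd_config(5).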

I ended up with a file that contains lines like this for each broken file $FILE ($SERVER and $BRICK iterate over all servers and bricks respectively):
file:$NUMBER|$NUMBER|$SERVER:/$BRICK#$FILE,
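
To eyeball such a file before feeding it to the next script, a line can be split into one row per brick. This is only a sketch: it assumes each line starts with "file:" and then carries comma-separated entries of the form NUMBER|NUMBER|SERVER:/BRICK#FILE, and the function name split_heal_line is made up. The meaning of the two numbers is not interpreted here.

```shell
# Split one line of the assumed form
#   file:NUM|NUM|SERVER:/BRICK#FILE,NUM|NUM|SERVER:/BRICK#FILE,...
# into "server brick file" rows, one per entry.
# Limitation: breaks on file names that contain ',' or '|'.
split_heal_line() {
    printf '%s\n' "${1#file:}" | tr ',' '\n' | awk -F'|' '
        NF == 3 {
            colon = index($3, ":")   # end of the server name
            hash  = index($3, "#")   # end of the brick path
            if (colon == 0 || hash == 0 || colon > hash) next
            server = substr($3, 1, colon - 1)
            brick  = substr($3, colon + 1, hash - colon - 1)
            file   = substr($3, hash + 1)
            print server, brick, file
        }'
}
```

Entries that do not match the assumed shape are skipped silently, so a row count lower than the number of bricks is a hint that the format assumption is wrong.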

So I'm feeding that into correct_pending_heals.sh and I should be done, right?

I have an entry in my heal info for a file that doesn't exist anymore; I deleted it a while ago. Looking into the directory structure revealed a stray
link file (permissions -----T) for that file on some servers. I'm not sure why that wasn't deleted. It's in potential_heal though; the line starts with
"empty|$SERVER(..)". I'll delete that line from potential_heal. It might be an interesting edge case though.

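The two manual steps above can be sketched as follows. The function names are made up, and the find criteria are an assumption: they treat a "stray link file" as a zero-byte regular file whose mode is exactly 1000 (sticky bit only, shown by ls as ---------T). Whether a hit really is a leftover link file should still be checked by hand, for example by looking at its trusted.glusterfs.dht.linkto xattr with getfattr.

```shell
# List zero-byte regular files with mode exactly 1000 below a brick directory,
# skipping the .glusterfs tree.
find_linkto_candidates() {
    find "$1" -path "$1/.glusterfs" -prune -o -type f -perm 1000 -size 0 -print
}

# Print the given file (e.g. potential_heal) without lines starting with "empty|".
drop_empty_entries() {
    grep -v '^empty|' "$1"
}
```

Writing the filtered output to a new file (drop_empty_entries potential_heal > potential_heal.filtered) keeps the original around for comparison.
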
Kind regards

Gudrun


On Monday, 02.12.2019, 02:15 -0500, Ashish Pandey wrote:
> 
> 
> From: "Gudrun Mareike Amedick" <g.amedick at uni-luebeck.de>
> To: "Ashish Pandey" <aspandey at redhat.com>
> Cc: "Gluster-users" <gluster-users at gluster.org>
> Sent: Friday, November 29, 2019 8:45:13 PM
> Subject: Re: [Gluster-users] Trying to fix files that don't want to heal
> 
> Hi Ashish,
> 
> thanks for your reply. To fulfill the "no IO" requirement, I'll have to wait until the second week of December (9th to 14th).
> 
> We originally planned to update GlusterFS from 4.1.7 to 5 and then to 6 in December. Should we do that upgrade before or after running those
> scripts?
> 
> >> It will be best if you could do it before upgrading to a newer version.
> BTW, why are you not planning to upgrade to Gluster 7?
> 
> 
> Kind regards
> 
> Gudrun
> 
> On Friday, 29.11.2019, 00:38 -0500, Ashish Pandey wrote:
> > Hey Gudrun,
> > 
> > Could you please try to use the scripts to resolve it?
> > We have written some scripts, and they are in the final phase of getting merged:
> > https://review.gluster.org/#/c/glusterfs/+/23380/
> > 
> > You can find the steps to use these scripts in the README.md file.
> > 
> > ---
> > Ashish
> > 
> > From: "Gudrun Mareike Amedick" <g.amedick at uni-luebeck.de>
> > To: "Gluster-users" <gluster-users at gluster.org>
> > Sent: Thursday, November 28, 2019 3:57:18 PM
> > Subject: [Gluster-users] Trying to fix files that don't want to heal
> > 
> > Hi,
> > 
> > I have a distributed dispersed volume with files that don't want to heal. I'm trying to fix them manually.
> > 
> > I'm currently working on a file that is present on all bricks; its GFID exists in the .glusterfs structure and getfattr shows identical attributes for all
> > files. They look like this:
> > 
> > # getfattr -m. -d -e hex $brick/$somepath/libssl.so.1.1
> > getfattr: Removing leading '/' from absolute path names
> > # file: $brick/$somepath/libssl.so.1.1
> > trusted.ec.config=0x0000080602000200
> > trusted.ec.dirty=0x00000000000000010000000000000000
> > trusted.ec.size=0x00000000000a0000
> > trusted.ec.version=0x00000000000000040000000000000005
> > trusted.gfid=0xdd7dd64f6bb34b5f891a5e32fe83874f
> > trusted.gfid2path.0c3a5b76c518ef60=0x34663064396234332d343730342d343634352d613834342d3338303532336137346632662f6c696273736c2e736f2e312e31
> > trusted.gfid2path.578ce2ec37aa0f9d=0x31636136323433342d396132642d343039362d616265352d6463353065613131333066632f6c696273736c2e736f2e312e31
> > trusted.glusterfs.quota.1ca62434-9a2d-4096-abe5-dc50ea1130fc.contri.3=0x00000000000292000000000000000001
> > trusted.glusterfs.quota.4f0d9b43-4704-4645-a844-380523a74f2f.contri.3=0x00000000000292000000000000000001
> > trusted.pgfid.1ca62434-9a2d-4096-abe5-dc50ea1130fc=0x00000001
> > trusted.pgfid.4f0d9b43-4704-4645-a844-380523a74f2f=0x00000001
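
The trusted.gfid2path values in the output above are plain text in hex form. A minimal sketch for decoding them in POSIX sh follows; the function name hex_to_text is made up, and it assumes the value is printable text with an even number of hex digits.

```shell
# Decode a hex string as printed by "getfattr -e hex" (with or without the
# leading 0x) into text, one byte at a time.
hex_to_text() {
    hex=${1#0x}
    while [ -n "$hex" ]; do
        rest=${hex#??}                 # everything after the first two digits
        [ "$rest" = "$hex" ] && break  # odd trailing digit: stop instead of looping
        byte=${hex%"$rest"}            # the first two digits
        printf "\\$(printf '%03o' "0x$byte")"   # emit the byte via an octal escape
        hex=$rest
    done
    printf '\n'
}
```

Applied to the two gfid2path values above, this gives 4f0d9b43-4704-4645-a844-380523a74f2f/libssl.so.1.1 and 1ca62434-9a2d-4096-abe5-dc50ea1130fc/libssl.so.1.1, that is, a parent GFID followed by the file name. Both parents match the trusted.pgfid entries, which is consistent with one file that has a name in two directories.
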
> > 
> > pgfid is "parent gfid", right? Both GFIDs refer to a dir in my volume, and both of those dirs contain a file named libssl.so.1.1. They seem to be
> > hardlinks:
> > 
> > find  $brick/$somepath  -samefile  $brick/$someotherpath/libssl.so.1.1
> > $brick/$somepath/libssl.so.1
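
The same check can be widened to the whole brick. On a brick, a regular file normally also has a hard link below .glusterfs that is named after its GFID, so a file with two names in the volume would be expected to show up three times. The sketch below uses made-up function names and assumes the usual .glusterfs/<first two hex digits>/<next two>/<gfid> layout.

```shell
# Turn a trusted.gfid value (32 hex digits, optional 0x prefix) into the
# path of its entry relative to the brick root.
gfid_to_backend_path() {
    g=${1#0x}
    uuid=$(printf '%s\n' "$g" |
        sed 's/^\(.\{8\}\)\(.\{4\}\)\(.\{4\}\)\(.\{4\}\)\(.\{12\}\)$/\1-\2-\3-\4-\5/')
    printf '.glusterfs/%s/%s/%s\n' \
        "$(printf '%s\n' "$uuid" | cut -c1-2)" \
        "$(printf '%s\n' "$uuid" | cut -c3-4)" \
        "$uuid"
}

# Print every path below a brick that shares an inode with the given file.
list_links() {
    find "$1" -samefile "$2"
}
```

For the trusted.gfid shown above, gfid_to_backend_path prints .glusterfs/dd/7d/dd7dd64f-6bb3-4b5f-891a-5e32fe83874f. If list_links on the brick root returns that entry plus the two libssl.so.1.1 paths, the names are hard links to one inode.
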
> > 
> > This exceeds the limits of my GlusterFS knowledge. Is that something that can/should happen? If not, is it the reason that file won't heal, and how do
> > I fix that?
> > 
> > Kind regards
> > 
> > Gudrun Amedick
> > ________
> > 
> > Community Meeting Calendar:
> > 
> > APAC Schedule -
> > Every 2nd and 4th Tuesday at 11:30 AM IST
> > Bridge: https://bluejeans.com/441850968
> > 
> > NA/EMEA Schedule -
> > Every 1st and 3rd Tuesday at 01:00 PM EDT
> > Bridge: https://bluejeans.com/441850968
> > 
> > Gluster-users mailing list
> > Gluster-users at gluster.org
> > https://lists.gluster.org/mailman/listinfo/gluster-users