[Gluster-users] Self-healing not healing 27k files on GlusterFS 4.1.5 3 nodes replica

Ravishankar N ravishankar at redhat.com
Thu Nov 15 04:57:06 UTC 2018



On 11/14/2018 03:19 PM, mabi wrote:
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Wednesday, November 14, 2018 5:34 AM, Ravishankar N <ravishankar at redhat.com> wrote:
>
>> I thought it was missing which is why I asked you to create it.  The
>> trusted.gfid xattr for any given file or directory must be same in all 3
>> bricks.  But it looks like that isn't the case. Are the gfids and the
>> symlinks for all the dirs leading to the parent dir of oc_dir same on
>> all nodes? (i.e. every directory in
>> /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/)?
> I have now checked the GFIDs of all directories leading back down to the parent dir (13 directories in total). On node 1 and node 3 the GFIDs of all underlying directories match each other. On node 2 they are also all the same except for the two highest directories (".../dir11" and ".../dir11/oc_dir"). It's exactly these two directories which are also listed in the "volume heal info" output under node 1 and node 2 and which do not get healed.
>
> For your reference I have pasted below the GFIDs for all underlying directories up to the parent directory and for all 3 nodes. I start at the top with the highest directory and at the bottom of the list is the parent directory (/data).
>
> # NODE 1
>
> trusted.gfid=0x25e2616b4fb64b2a89451afc956fff19 # /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/oc_dir
> trusted.gfid=0x70c894ca422b4bceacf15cfb4669abbd # /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11
> trusted.gfid=0x7d7d2165f4804edf8c93de01c8768269 # ...
> trusted.gfid=0xdbc0bfa0a052405ca3fad2d1ca137f82
> trusted.gfid=0xbb75051c24ba4c119351bef938c55ad4
> trusted.gfid=0x0002ad0c3fbe4806a75f8e68304f5b94
> trusted.gfid=0xf120657977274247900db4e9cc8129dd
> trusted.gfid=0x8afeb00bb1e74cbab932acea705b7dd9
> trusted.gfid=0x2174086880fc4fd19b187d1384300add
> trusted.gfid=0x2057e87cf4cc43f9bbad160cbec43d01 # ...
> trusted.gfid=0xa7d78519db61459399e01fad2badf3fb # /data/dir1/dir2
> trusted.gfid=0xfaa0ed7ccaf84f6c8bdb20a7f657c4b4 # /data/dir1
> trusted.gfid=0x2683990126724adbb6416b911180e62b # /data
>
> # NODE 2
>
> trusted.gfid=0xd9ac192ce85e4402af105551f587ed9a
> trusted.gfid=0x10ec1eb1c8544ff2a36c325681713093
> trusted.gfid=0x7d7d2165f4804edf8c93de01c8768269
> trusted.gfid=0xdbc0bfa0a052405ca3fad2d1ca137f82
> trusted.gfid=0xbb75051c24ba4c119351bef938c55ad4
> trusted.gfid=0x0002ad0c3fbe4806a75f8e68304f5b94
> trusted.gfid=0xf120657977274247900db4e9cc8129dd
> trusted.gfid=0x8afeb00bb1e74cbab932acea705b7dd9
> trusted.gfid=0x2174086880fc4fd19b187d1384300add
> trusted.gfid=0x2057e87cf4cc43f9bbad160cbec43d01
> trusted.gfid=0xa7d78519db61459399e01fad2badf3fb
> trusted.gfid=0xfaa0ed7ccaf84f6c8bdb20a7f657c4b4
> trusted.gfid=0x2683990126724adbb6416b911180e62b
>
> # NODE 3
>
> trusted.gfid=0x25e2616b4fb64b2a89451afc956fff19
> trusted.gfid=0x70c894ca422b4bceacf15cfb4669abbd
> trusted.gfid=0x7d7d2165f4804edf8c93de01c8768269
> trusted.gfid=0xdbc0bfa0a052405ca3fad2d1ca137f82
> trusted.gfid=0xbb75051c24ba4c119351bef938c55ad4
> trusted.gfid=0x0002ad0c3fbe4806a75f8e68304f5b94
> trusted.gfid=0xf120657977274247900db4e9cc8129dd
> trusted.gfid=0x8afeb00bb1e74cbab932acea705b7dd9
> trusted.gfid=0x2174086880fc4fd19b187d1384300add
> trusted.gfid=0x2057e87cf4cc43f9bbad160cbec43d01
> trusted.gfid=0xa7d78519db61459399e01fad2badf3fb
> trusted.gfid=0xfaa0ed7ccaf84f6c8bdb20a7f657c4b4
> trusted.gfid=0x2683990126724adbb6416b911180e62b
>
>
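The three listings above can also be compared mechanically rather than by eye. Below is a minimal Python sketch that uses the GFIDs exactly as pasted in the listings; the function name `find_gfid_mismatches` and the dictionary layout are illustrative assumptions and not part of any Gluster tooling.

```python
# GFIDs copied from the listings above, ordered from oc_dir (position 0)
# down to /data (position 12). The last 11 entries are identical on all nodes.
COMMON_TAIL = [
    "0x7d7d2165f4804edf8c93de01c8768269",
    "0xdbc0bfa0a052405ca3fad2d1ca137f82",
    "0xbb75051c24ba4c119351bef938c55ad4",
    "0x0002ad0c3fbe4806a75f8e68304f5b94",
    "0xf120657977274247900db4e9cc8129dd",
    "0x8afeb00bb1e74cbab932acea705b7dd9",
    "0x2174086880fc4fd19b187d1384300add",
    "0x2057e87cf4cc43f9bbad160cbec43d01",
    "0xa7d78519db61459399e01fad2badf3fb",
    "0xfaa0ed7ccaf84f6c8bdb20a7f657c4b4",
    "0x2683990126724adbb6416b911180e62b",
]

# The two topmost directories (oc_dir, then dir11) as reported per node.
TOP_TWO = {
    "node1": ["0x25e2616b4fb64b2a89451afc956fff19",
              "0x70c894ca422b4bceacf15cfb4669abbd"],
    "node2": ["0xd9ac192ce85e4402af105551f587ed9a",
              "0x10ec1eb1c8544ff2a36c325681713093"],
    "node3": ["0x25e2616b4fb64b2a89451afc956fff19",
              "0x70c894ca422b4bceacf15cfb4669abbd"],
}

GFIDS = {node: top + COMMON_TAIL for node, top in TOP_TWO.items()}


def find_gfid_mismatches(gfids_by_node):
    """Return (position, {node: gfid}) for each position where nodes disagree."""
    mismatches = []
    for position, values in enumerate(zip(*gfids_by_node.values())):
        if len(set(values)) > 1:
            mismatches.append((position, dict(zip(gfids_by_node, values))))
    return mismatches


if __name__ == "__main__":
    for position, values in find_gfid_mismatches(GFIDS):
        print(position, values)
```

With the values above, only positions 0 and 1 (oc_dir and dir11) are reported, and in both cases node 2 is the node that differs.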
>> Let us see if the parents' gfids are the same before deleting anything.
>> Is the heal info still showing 4 entries? Please also share the getfattr
>> output of the parent directory (i.e. dir11).
> Yes, the heal info still shows the 4 entries but on node 1 the directory name is not shown anymore but just the GFID. This is the actual output of a "volume heal info":
>
> Brick node1:/data/myvol-pro/brick
> <gfid:25e2616b-4fb6-4b2a-8945-1afc956fff19>
> <gfid:3c92459b-8fa1-4669-9a3d-b38b8d41c360>
> <gfid:70c894ca-422b-4bce-acf1-5cfb4669abbd>
> <gfid:aae4098a-1a71-4155-9cc9-e564b89957cf>
> Status: Connected
> Number of entries: 4
>
> Brick node2:/data/myvol-pro/brick
> Status: Connected
> Number of entries: 0
>
> Brick node3:/srv/glusterfs/myvol-pro/brick
> /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11
> /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/oc_dir
> <gfid:aae4098a-1a71-4155-9cc9-e564b89957cf>
> <gfid:3c92459b-8fa1-4669-9a3d-b38b8d41c360>
> Status: Connected
> Number of entries: 4
>
> What are the next steps in order to fix that?
1. Could you provide the getfattr output of the following 3 dirs from all
3 nodes?
i) /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10
ii) /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/
iii) /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/oc_dir

2. Do you know the file (or directory) names corresponding to the other
2 gfids in the heal info output, i.e.
<gfid:aae4098a-1a71-4155-9cc9-e564b89957cf>
<gfid:3c92459b-8fa1-4669-9a3d-b38b8d41c360>
Please share the getfattr output of them as well.
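
As a rough aid for both requests, here is a minimal Python sketch that maps a gfid to its entry under the brick's .glusterfs directory and builds a getfattr command line for it. The helper names `gfid_backend_path` and `getfattr_command` are hypothetical, and the brick root is taken from the heal info output above. The sketch assumes the usual backend layout of .glusterfs/<first two hex digits>/<next two>/<gfid>. It only builds the command and does not run it.

```python
import uuid
from pathlib import PurePosixPath


def gfid_backend_path(brick_root, gfid):
    """Map a gfid to its entry under <brick>/.glusterfs (assumed layout)."""
    text = gfid.strip().lower()
    # Accept the xattr form ("0x25e2..."), as well as the dashed UUID form.
    if text.startswith("0x"):
        text = text[2:]
    canonical = str(uuid.UUID(hex=text.replace("-", "")))
    return (PurePosixPath(brick_root) / ".glusterfs"
            / canonical[0:2] / canonical[2:4] / canonical)


def getfattr_command(path):
    """Build a getfattr invocation that dumps all xattrs in hex encoding."""
    return ["getfattr", "-d", "-m", ".", "-e", "hex", str(path)]


if __name__ == "__main__":
    brick = "/data/myvol-pro/brick"  # node1/node2 brick path from heal info
    for gfid in ("aae4098a-1a71-4155-9cc9-e564b89957cf",
                 "3c92459b-8fa1-4669-9a3d-b38b8d41c360"):
        print(" ".join(getfattr_command(gfid_backend_path(brick, gfid))))
```

On node 3 the brick root would be /srv/glusterfs/myvol-pro/brick instead. For a directory the .glusterfs entry is normally a symlink, so readlink on it should show the parent gfid and the directory name; for a regular file it is normally a hard link, so something like find with -samefile on the brick would be needed to locate the name.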

Regards,
Ravi

