[Gluster-users] Self-healing not healing 27k files on GlusterFS 4.1.5 3 nodes replica

mabi mabi at protonmail.ch
Wed Nov 14 09:49:24 UTC 2018


‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, November 14, 2018 5:34 AM, Ravishankar N <ravishankar at redhat.com> wrote:

> I thought it was missing which is why I asked you to create it.  The
> trusted.gfid xattr for any given file or directory must be same in all 3
> bricks.  But it looks like that isn't the case. Are the gfids and the
> symlinks for all the dirs leading to the parent dir of oc_dir same on
> all nodes? (i.e. every directory in
> /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/)?

I have now checked the GFIDs of all directories in the path down to the top-level directory (13 directories in total). On node 1 and node 3 the GFIDs of all these directories match each other. On node 2 they are also all identical, except for the two deepest directories (".../dir11" and ".../dir11/oc_dir"). It is exactly these two directories which are also listed in the "volume heal info" output under node 1 and node 3 and which do not get healed.

For your reference I have pasted below the GFIDs of all these directories on all 3 nodes. Each list starts with the deepest directory (.../dir11/oc_dir) at the top and ends with the top-level directory (/data) at the bottom.
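
For completeness, these values were read directly on the bricks with getfattr in hex encoding, along the lines of the following (shown here for node 1, whose brick root is /data/myvol-pro/brick as in the heal info output further below; adjust the brick root on the other nodes):

getfattr -n trusted.gfid -e hex /data/myvol-pro/brick/data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/oc_dir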

# NODE 1

trusted.gfid=0x25e2616b4fb64b2a89451afc956fff19 # /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/oc_dir
trusted.gfid=0x70c894ca422b4bceacf15cfb4669abbd # /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11
trusted.gfid=0x7d7d2165f4804edf8c93de01c8768269 # ...
trusted.gfid=0xdbc0bfa0a052405ca3fad2d1ca137f82
trusted.gfid=0xbb75051c24ba4c119351bef938c55ad4
trusted.gfid=0x0002ad0c3fbe4806a75f8e68304f5b94
trusted.gfid=0xf120657977274247900db4e9cc8129dd
trusted.gfid=0x8afeb00bb1e74cbab932acea705b7dd9
trusted.gfid=0x2174086880fc4fd19b187d1384300add
trusted.gfid=0x2057e87cf4cc43f9bbad160cbec43d01 # ...
trusted.gfid=0xa7d78519db61459399e01fad2badf3fb # /data/dir1/dir2
trusted.gfid=0xfaa0ed7ccaf84f6c8bdb20a7f657c4b4 # /data/dir1
trusted.gfid=0x2683990126724adbb6416b911180e62b # /data

# NODE 2

trusted.gfid=0xd9ac192ce85e4402af105551f587ed9a
trusted.gfid=0x10ec1eb1c8544ff2a36c325681713093
trusted.gfid=0x7d7d2165f4804edf8c93de01c8768269
trusted.gfid=0xdbc0bfa0a052405ca3fad2d1ca137f82
trusted.gfid=0xbb75051c24ba4c119351bef938c55ad4
trusted.gfid=0x0002ad0c3fbe4806a75f8e68304f5b94
trusted.gfid=0xf120657977274247900db4e9cc8129dd
trusted.gfid=0x8afeb00bb1e74cbab932acea705b7dd9
trusted.gfid=0x2174086880fc4fd19b187d1384300add
trusted.gfid=0x2057e87cf4cc43f9bbad160cbec43d01
trusted.gfid=0xa7d78519db61459399e01fad2badf3fb
trusted.gfid=0xfaa0ed7ccaf84f6c8bdb20a7f657c4b4
trusted.gfid=0x2683990126724adbb6416b911180e62b

# NODE 3

trusted.gfid=0x25e2616b4fb64b2a89451afc956fff19
trusted.gfid=0x70c894ca422b4bceacf15cfb4669abbd
trusted.gfid=0x7d7d2165f4804edf8c93de01c8768269
trusted.gfid=0xdbc0bfa0a052405ca3fad2d1ca137f82
trusted.gfid=0xbb75051c24ba4c119351bef938c55ad4
trusted.gfid=0x0002ad0c3fbe4806a75f8e68304f5b94
trusted.gfid=0xf120657977274247900db4e9cc8129dd
trusted.gfid=0x8afeb00bb1e74cbab932acea705b7dd9
trusted.gfid=0x2174086880fc4fd19b187d1384300add
trusted.gfid=0x2057e87cf4cc43f9bbad160cbec43d01
trusted.gfid=0xa7d78519db61459399e01fad2badf3fb
trusted.gfid=0xfaa0ed7ccaf84f6c8bdb20a7f657c4b4
trusted.gfid=0x2683990126724adbb6416b911180e62b
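
Regarding the gfid symlinks you mentioned: I assume they can be cross-checked on each brick under .glusterfs, using the first two byte pairs of the GFID as subdirectories. For example for dir11 on node 1 (GFID taken from the list above, brick root as in the heal info output below):

ls -l /data/myvol-pro/brick/.glusterfs/70/c8/70c894ca-422b-4bce-acf1-5cfb4669abbd

For a directory this should be a symlink whose target contains the parent directory's GFID plus the directory name.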


> Let us see if the parents' gfids are the same before deleting anything.
> Is the heal info still showing 4 entries? Please also share the getfattr
> output of the parent directory (i.e. dir11).

Yes, the heal info still shows the 4 entries, but on node 1 the directory names are not shown anymore, only the GFIDs. This is the current output of "volume heal info":

Brick node1:/data/myvol-pro/brick
<gfid:25e2616b-4fb6-4b2a-8945-1afc956fff19>
<gfid:3c92459b-8fa1-4669-9a3d-b38b8d41c360>
<gfid:70c894ca-422b-4bce-acf1-5cfb4669abbd>
<gfid:aae4098a-1a71-4155-9cc9-e564b89957cf>
Status: Connected
Number of entries: 4

Brick node2:/data/myvol-pro/brick
Status: Connected
Number of entries: 0

Brick node3:/srv/glusterfs/myvol-pro/brick
/data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11
/data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/oc_dir
<gfid:aae4098a-1a71-4155-9cc9-e564b89957cf>
<gfid:3c92459b-8fa1-4669-9a3d-b38b8d41c360>
Status: Connected
Number of entries: 4
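
In case it helps, my understanding is that the entries which only appear as a GFID can be mapped back to a path on the brick through the .glusterfs index, e.g. on node 1:

ls -l /data/myvol-pro/brick/.glusterfs/3c/92/3c92459b-8fa1-4669-9a3d-b38b8d41c360

If that entry is a symlink it refers to a directory (the target contains the parent GFID and the directory name); if it is a regular file, the real path should be findable with something like "find /data/myvol-pro/brick -samefile <that gfid file>".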

What are the next steps to fix this?

