[Gluster-users] gfid entries in volume heal info that do not heal

Jim Kinney jim.kinney at gmail.com
Thu Oct 19 21:48:08 UTC 2017


I've been following this particular thread as I have a similar issue (a
RAID6 array failed out with 3 dead drives at once while a 12 TB load
was being copied into one mounted space - what a mess).
I have >700K GFID entries that have no path data. Example:

getfattr -d -e hex -m . .glusterfs/00/00/0000a5ef-5af7-401b-84b5-ff2a51c10421
# file: .glusterfs/00/00/0000a5ef-5af7-401b-84b5-ff2a51c10421
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.bit-rot.version=0x020000000000000059b1b316000270e7
trusted.gfid=0x0000a5ef5af7401b84b5ff2a51c10421

[root at bmidata1 brick]# getfattr -d -n trusted.glusterfs.pathinfo -e hex -m . .glusterfs/00/00/0000a5ef-5af7-401b-84b5-ff2a51c10421
.glusterfs/00/00/0000a5ef-5af7-401b-84b5-ff2a51c10421: trusted.glusterfs.pathinfo: No such attribute
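
If it helps narrow things down, my understanding is that a healthy regular file should show a link count of at least 2 inside .glusterfs (the GFID entry is a hard link to the real file), so a quick link-count check ought to flag the orphans - for the example GFID above:

stat -c '%h %n' .glusterfs/00/00/0000a5ef-5af7-401b-84b5-ff2a51c10421  # a link count of 1 would mean nothing else points at this GFID
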
I had to totally rebuild the dead RAID array and did a copy from the
live one before activating gluster on the rebuilt system. I
accidentally copied over the .glusterfs folder from the working side
(replica 2 only for now - adding an arbiter node as soon as I can get
this one cleaned up).
I've run the methods from "http://docs.gluster.org/en/latest/Troubleshooting/gfid-to-path/"
with no results using random GFIDs. A full run across everything using
the script from method 3 crashes with a "too many nested links" error
(or something similar).
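
For reference, the approaches from that page I was attempting look roughly like this (mount point and volume name are placeholders for mine; the GFID is the example from above):

# one way: on the brick, find whatever else shares an inode with the GFID entry
find /path/to/brick -samefile /path/to/brick/.glusterfs/00/00/0000a5ef-5af7-401b-84b5-ff2a51c10421
# another: an aux-gfid mount of the volume, then ask the client stack for pathinfo
mount -t glusterfs -o aux-gfid-mount bmidata1:/volname /mnt/aux
getfattr -n trusted.glusterfs.pathinfo -e text /mnt/aux/.gfid/0000a5ef-5af7-401b-84b5-ff2a51c10421
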
When I run gluster volume heal volname info, I get 700K+ GFIDs. Oh,
and this is gluster 3.8.4 on CentOS 7.3.
Should I just remove the contents of the .glusterfs folder on both
nodes, restart gluster, and run an ls/stat on every file?
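
If that's a sane approach, the ls/stat part would presumably just be a walk of a FUSE client mount to force lookups, something like (mount point is a placeholder):

find /mnt/volname -print0 | xargs -0 stat > /dev/null 2>&1

but I'd rather hear from someone who has been down this road before I blow away .glusterfs.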

When I run a heal, it no longer has a decreasing number of files to
heal, so that's an improvement over the last 2-3 weeks :-)
On Tue, 2017-10-17 at 14:34 +0000, Matt Waymack wrote:
> Attached is the heal log for the volume as well as the shd log. 
> 
> > > Run these commands on all the bricks of the replica pair to get
> > > the attrs set on the backend.
> 
> [root at tpc-cent-glus1-081017 ~]# getfattr -d -e hex -m .
> /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
> getfattr: Removing leading '/' from absolute path names
> # file: exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.gv0-client-2=0x000000000000000100000000
> trusted.gfid=0x108694dbc0394b7cbd3dad6a15d811a2
> trusted.gfid2path.9a2f5ada22eb9c45=0x38633262623330322d323466332d346463622d393630322d3839356136396461363131662f435f564f4c2d623030312d693637342d63642d63772e6d6435
> 
> [root at tpc-cent-glus2-081017 ~]# getfattr -d -e hex -m .
> /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
> getfattr: Removing leading '/' from absolute path names
> # file: exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.gv0-client-2=0x000000000000000100000000
> trusted.gfid=0x108694dbc0394b7cbd3dad6a15d811a2
> trusted.gfid2path.9a2f5ada22eb9c45=0x38633262623330322d323466332d346463622d393630322d3839356136396461363131662f435f564f4c2d623030312d693637342d63642d63772e6d6435
> 
> [root at tpc-arbiter1-100617 ~]# getfattr -d -e hex -m .
> /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
> getfattr: /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2: No such file or directory
> 
> 
> [root at tpc-cent-glus1-081017 ~]# getfattr -d -e hex -m .
> /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> getfattr: Removing leading '/' from absolute path names
> # file: exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.gv0-client-11=0x000000000000000100000000
> trusted.gfid=0xe0c56bf78bfe46cabde1e46b92d33df3
> trusted.gfid2path.be3ba24c3ef95ff2=0x63323366353834652d353566652d343033382d393131622d3866373063656334616136662f435f564f4c2d623030332d69313331342d63642d636d2d63722e6d6435
> 
> [root at tpc-cent-glus2-081017 ~]# getfattr -d -e hex -m .
> /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> getfattr: Removing leading '/' from absolute path names
> # file: exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.gv0-client-11=0x000000000000000100000000
> trusted.gfid=0xe0c56bf78bfe46cabde1e46b92d33df3
> trusted.gfid2path.be3ba24c3ef95ff2=0x63323366353834652d353566652d343033382d393131622d3866373063656334616136662f435f564f4c2d623030332d69313331342d63642d636d2d63722e6d6435
> 
> [root at tpc-arbiter1-100617 ~]# getfattr -d -e hex -m .
> /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> getfattr: /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3: No such file or directory
> 
> > > And the output of "gluster volume heal <volname> info split-
> > > brain"
> 
> [root at tpc-cent-glus1-081017 ~]# gluster volume heal gv0 info split-
> brain
> Brick tpc-cent-glus1-081017:/exp/b1/gv0
> Status: Connected
> Number of entries in split-brain: 0
> 
> Brick tpc-cent-glus2-081017:/exp/b1/gv0
> Status: Connected
> Number of entries in split-brain: 0
> 
> Brick tpc-arbiter1-100617:/exp/b1/gv0
> Status: Connected
> Number of entries in split-brain: 0
> 
> Brick tpc-cent-glus1-081017:/exp/b2/gv0
> Status: Connected
> Number of entries in split-brain: 0
> 
> Brick tpc-cent-glus2-081017:/exp/b2/gv0
> Status: Connected
> Number of entries in split-brain: 0
> 
> Brick tpc-arbiter1-100617:/exp/b2/gv0
> Status: Connected
> Number of entries in split-brain: 0
> 
> Brick tpc-cent-glus1-081017:/exp/b3/gv0
> Status: Connected
> Number of entries in split-brain: 0
> 
> Brick tpc-cent-glus2-081017:/exp/b3/gv0
> Status: Connected
> Number of entries in split-brain: 0
> 
> Brick tpc-arbiter1-100617:/exp/b3/gv0
> Status: Connected
> Number of entries in split-brain: 0
> 
> Brick tpc-cent-glus1-081017:/exp/b4/gv0
> Status: Connected
> Number of entries in split-brain: 0
> 
> Brick tpc-cent-glus2-081017:/exp/b4/gv0
> Status: Connected
> Number of entries in split-brain: 0
> 
> Brick tpc-arbiter1-100617:/exp/b4/gv0
> Status: Connected
> Number of entries in split-brain: 0
> 
> -Matt
> 
> From: Karthik Subrahmanya [mailto:ksubrahm at redhat.com] 
> Sent: Tuesday, October 17, 2017 1:26 AM
> To: Matt Waymack <mwaymack at nsgdv.com>
> Cc: gluster-users <Gluster-users at gluster.org>
> Subject: Re: [Gluster-users] gfid entries in volume heal info that do
> not heal
> 
> Hi Matt,
> 
> Run these commands on all the bricks of the replica pair to get the
> attrs set on the backend.
> 
> On the bricks of the first replica set:
> getfattr -d -e hex -m . <brick path>/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
> On the fourth replica set:
> getfattr -d -e hex -m . <brick path>/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> Also run the "gluster volume heal <volname>" once and send the shd
> log.
> And the output of "gluster volume heal <volname> info split-brain"
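> For example, to run the getfattr commands above with the actual brick paths from your volume info filled in (adjust if your layout differs):
> getfattr -d -e hex -m . /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
> getfattr -d -e hex -m . /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3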
> Regards,
> Karthik
> 
> On Mon, Oct 16, 2017 at 9:51 PM, Matt Waymack <mwaymack at nsgdv.com> wrote:
> OK, so here’s my output of the volume info and the heal info. I have
> not yet tracked down the physical location of these files; any tips
> on finding them would be appreciated, but I’m definitely just wanting
> them gone.  I forgot to mention earlier that the cluster is running
> 3.12 and was upgraded from 3.10; these files were likely stuck like
> this when it was on 3.10.
>  
> [root at tpc-cent-glus1-081017 ~]# gluster volume info gv0
>  
> Volume Name: gv0
> Type: Distributed-Replicate
> Volume ID: 8f07894d-e3ab-4a65-bda1-9d9dd46db007
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 4 x (2 + 1) = 12
> Transport-type: tcp
> Bricks:
> Brick1: tpc-cent-glus1-081017:/exp/b1/gv0
> Brick2: tpc-cent-glus2-081017:/exp/b1/gv0
> Brick3: tpc-arbiter1-100617:/exp/b1/gv0 (arbiter)
> Brick4: tpc-cent-glus1-081017:/exp/b2/gv0
> Brick5: tpc-cent-glus2-081017:/exp/b2/gv0
> Brick6: tpc-arbiter1-100617:/exp/b2/gv0 (arbiter)
> Brick7: tpc-cent-glus1-081017:/exp/b3/gv0
> Brick8: tpc-cent-glus2-081017:/exp/b3/gv0
> Brick9: tpc-arbiter1-100617:/exp/b3/gv0 (arbiter)
> Brick10: tpc-cent-glus1-081017:/exp/b4/gv0
> Brick11: tpc-cent-glus2-081017:/exp/b4/gv0
> Brick12: tpc-arbiter1-100617:/exp/b4/gv0 (arbiter)
> Options Reconfigured:
> nfs.disable: on
> transport.address-family: inet
>  
> [root at tpc-cent-glus1-081017 ~]# gluster volume heal gv0 info
> Brick tpc-cent-glus1-081017:/exp/b1/gv0
> <gfid:108694db-c039-4b7c-bd3d-ad6a15d811a2>
> <gfid:6d5ade20-8996-4de2-95d5-20ef98004742>
> <gfid:bc6cdc3d-5c46-4597-a7eb-282b21e9bdd5>
> <gfid:3c2ff4d1-3662-4214-8f21-f8f47dbdbf06>
> <gfid:053e2fb1-bc89-476e-a529-90dffa39963c>
>  
> <removed to save scrolling>
>  
> Status: Connected
> Number of entries: 118
>  
> Brick tpc-cent-glus2-081017:/exp/b1/gv0
> <gfid:108694db-c039-4b7c-bd3d-ad6a15d811a2>
> <gfid:6d5ade20-8996-4de2-95d5-20ef98004742>
> <gfid:bc6cdc3d-5c46-4597-a7eb-282b21e9bdd5>
> <gfid:3c2ff4d1-3662-4214-8f21-f8f47dbdbf06>
> <gfid:053e2fb1-bc89-476e-a529-90dffa39963c>
>  
> <removed to save scrolling>
>  
> Status: Connected
> Number of entries: 118
>  
> Brick tpc-arbiter1-100617:/exp/b1/gv0
> Status: Connected
> Number of entries: 0
>  
> Brick tpc-cent-glus1-081017:/exp/b2/gv0
> Status: Connected
> Number of entries: 0
>  
> Brick tpc-cent-glus2-081017:/exp/b2/gv0
> Status: Connected
> Number of entries: 0
>  
> Brick tpc-arbiter1-100617:/exp/b2/gv0
> Status: Connected
> Number of entries: 0
>  
> Brick tpc-cent-glus1-081017:/exp/b3/gv0
> Status: Connected
> Number of entries: 0
>  
> Brick tpc-cent-glus2-081017:/exp/b3/gv0
> Status: Connected
> Number of entries: 0
>  
> Brick tpc-arbiter1-100617:/exp/b3/gv0
> Status: Connected
> Number of entries: 0
>  
> Brick tpc-cent-glus1-081017:/exp/b4/gv0
> <gfid:e0c56bf7-8bfe-46ca-bde1-e46b92d33df3>
> <gfid:6f0a0549-8669-46de-8823-d6677fdca8e3>
> <gfid:d0e2fb2a-21b5-4ea8-a578-0801280b2530>
> <gfid:48bff79c-7bc2-4dc5-8b7f-4401b27fdf5a>
> <gfid:5902593d-a059-4ec7-b18b-7a2ab5c49a50>
> <gfid:cb821178-4621-4fcf-90f3-5b5c2ad7f756>
> <gfid:6aea0805-8dd1-437c-b922-52c9d11e488a>
> <gfid:f4076a37-2e2f-4d7a-90dd-0a3560a4bdff>
> <gfid:51ff7386-a550-4971-957c-b42c4d915e9f>
> <gfid:4309f7b8-3a9d-4bc8-ba2b-799f8a02611b>
> <gfid:b76746ec-6d7d-4ea3-a001-c96672a4d47e>
> <gfid:f8de26e7-d17d-41e0-adcd-e7d24ed74ac8>
> <gfid:8e2c4540-e0b4-4006-bb5d-aacd57f8f21b>
> <gfid:183ebefb-b827-4cbc-b42b-bfd136d5cabb>
> <gfid:88d492fe-bfbd-4463-ba55-0582d0ad671b>
> <gfid:e3a6c068-d48b-44b5-9480-245a69648a9b>
> <gfid:4aab9c6a-22d2-469a-a688-7b0a8784f4b1>
> <gfid:c6d182f2-7e46-4502-a0d2-b92824caa4de>
> <gfid:eb546f93-e9d6-4a59-ac35-6139b5c40919>
> <gfid:6043e381-7edf-4569-bc37-e27dd13549d2>
> <gfid:52090dc7-7a3c-40f9-9c54-3395f5158eab>
> <gfid:ecceee46-4310-421e-b56e-5fe46bd5263c>
> <gfid:354aea57-4b40-47fc-8ede-1d7e3b7501b4>
> <gfid:d43284d4-86aa-42ff-98b8-f6340b407d9d>
> Status: Connected
> Number of entries: 24
>  
> Brick tpc-cent-glus2-081017:/exp/b4/gv0
> <gfid:e0c56bf7-8bfe-46ca-bde1-e46b92d33df3>
> <gfid:6f0a0549-8669-46de-8823-d6677fdca8e3>
> <gfid:d0e2fb2a-21b5-4ea8-a578-0801280b2530>
> <gfid:48bff79c-7bc2-4dc5-8b7f-4401b27fdf5a>
> <gfid:5902593d-a059-4ec7-b18b-7a2ab5c49a50>
> <gfid:cb821178-4621-4fcf-90f3-5b5c2ad7f756>
> <gfid:6aea0805-8dd1-437c-b922-52c9d11e488a>
> <gfid:f4076a37-2e2f-4d7a-90dd-0a3560a4bdff>
> <gfid:51ff7386-a550-4971-957c-b42c4d915e9f>
> <gfid:4309f7b8-3a9d-4bc8-ba2b-799f8a02611b>
> <gfid:b76746ec-6d7d-4ea3-a001-c96672a4d47e>
> <gfid:f8de26e7-d17d-41e0-adcd-e7d24ed74ac8>
> <gfid:8e2c4540-e0b4-4006-bb5d-aacd57f8f21b>
> <gfid:183ebefb-b827-4cbc-b42b-bfd136d5cabb>
> <gfid:88d492fe-bfbd-4463-ba55-0582d0ad671b>
> <gfid:e3a6c068-d48b-44b5-9480-245a69648a9b>
> <gfid:4aab9c6a-22d2-469a-a688-7b0a8784f4b1>
> <gfid:c6d182f2-7e46-4502-a0d2-b92824caa4de>
> <gfid:eb546f93-e9d6-4a59-ac35-6139b5c40919>
> <gfid:6043e381-7edf-4569-bc37-e27dd13549d2>
> <gfid:52090dc7-7a3c-40f9-9c54-3395f5158eab>
> <gfid:ecceee46-4310-421e-b56e-5fe46bd5263c>
> <gfid:354aea57-4b40-47fc-8ede-1d7e3b7501b4>
> <gfid:d43284d4-86aa-42ff-98b8-f6340b407d9d>
> Status: Connected
> Number of entries: 24
>  
> Brick tpc-arbiter1-100617:/exp/b4/gv0
> Status: Connected
> Number of entries: 0
>  
> Thank you for your help!
>  
> From: Karthik Subrahmanya [mailto:ksubrahm at redhat.com] 
> Sent: Monday, October 16, 2017 10:27 AM
> To: Matt Waymack <mwaymack at nsgdv.com>
> Cc: gluster-users <Gluster-users at gluster.org>
> Subject: Re: [Gluster-users] gfid entries in volume heal info that do
> not heal
>  
> Hi Matt, 
>  
> The files might be in split brain. Could you please send the outputs
> of these? 
> gluster volume info <volname>
> gluster volume heal <volname> info
> And also the getfattr output of the files which are in the heal info
> output from all the bricks of that replica pair.
> getfattr -d -e hex -m . <file path on brick>
>  
> Thanks &  Regards
> Karthik
>  
> On 16-Oct-2017 8:16 PM, "Matt Waymack" <mwaymack at nsgdv.com> wrote:
> Hi all,
>  
> I have a volume where the output of volume heal info shows several
> gfid entries to be healed, but they’ve been there for weeks and have
> not healed.  Any normal file that shows up on the heal info does get
> healed as expected, but these gfid entries do not.  Is there any way
> to remove these orphaned entries from the volume so they are no
> longer stuck in the heal process?
>  
> Thank you!
> 
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>  
> 
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users

