[Gluster-users] gfid entries in volume heal info that do not heal

Jim Kinney jim.kinney at gmail.com
Mon Oct 23 21:58:50 UTC 2017


I'm not so lucky. ALL of mine show 2 links and none have the attr data
that supplies the path to the original.

I have the inode number from stat. I'm now looking to dig out the
path/filename with xfs_db on the specific inodes individually.

Is the hash based on the filename alone or on <path>/filename, and if
the latter, relative to what: /, the path from the top of the brick, or
something else?
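For reference, a brute-force alternative to xfs_db (a rough sketch; the
brick path and the inode number 12345678 are placeholders):

    # inode (%i), link count (%h) and name of the gfid file under .glusterfs
    stat -c '%i %h %n' <brick path>/.glusterfs/00/00/0000a5ef-5af7-401b-84b5-ff2a51c10421

    # search the rest of the brick for the other name on that inode
    find <brick path> -inum 12345678 -not -path '*/.glusterfs/*'
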
On Mon, 2017-10-23 at 18:54 +0000, Matt Waymack wrote:
> In my case I was able to delete the hard links in the .glusterfs
> folders of the bricks and it seems to have done the trick, thanks!
>  
> 
> From: Karthik Subrahmanya [mailto:ksubrahm at redhat.com]
> Sent: Monday, October 23, 2017 1:52 AM
> To: Jim Kinney <jim.kinney at gmail.com>; Matt Waymack <mwaymack at nsgdv.com>
> Cc: gluster-users <Gluster-users at gluster.org>
> Subject: Re: [Gluster-users] gfid entries in volume heal info that do
> not heal
>
> Hi Jim & Matt,
> 
> Can you also check the link count in the stat output of those
> hardlink entries in the .glusterfs folder on the bricks?
>
> If the link count is 1 on all the bricks for those entries, then they
> are orphaned entries and you can delete those hardlinks.
>
> To be on the safe side, take a backup before deleting any of the
> entries.
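>
> For example, something like this (a rough sketch; <brick path> is a
> placeholder and GNU stat/find are assumed):
>
>     # link count (%h) of one of the reported gfid entries
>     stat -c '%h %n' <brick path>/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
>
>     # or list every gfid file on a brick whose link count is 1
>     find <brick path>/.glusterfs -path '*/.glusterfs/??/??/*' -type f -links 1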
>
> Regards,
> Karthik
>
> On Fri, Oct 20, 2017 at 3:18 AM, Jim Kinney <jim.kinney at gmail.com>
> wrote:
> > 
> > I've been following this particular thread as I have a similar
> > issue (RAID6 array failed out with 3 dead drives at once while a 12
> > TB load was being copied into one mounted space - what a mess)
> >
> > I have >700K GFID entries that have no path data. Example:
> >
> > getfattr -d -e hex -m . .glusterfs/00/00/0000a5ef-5af7-401b-84b5-ff2a51c10421
> > # file: .glusterfs/00/00/0000a5ef-5af7-401b-84b5-ff2a51c10421
> > security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
> > trusted.bit-rot.version=0x020000000000000059b1b316000270e7
> > trusted.gfid=0x0000a5ef5af7401b84b5ff2a51c10421
> >
> > [root at bmidata1 brick]# getfattr -d -n trusted.glusterfs.pathinfo -e hex -m . .glusterfs/00/00/0000a5ef-5af7-401b-84b5-ff2a51c10421
> > .glusterfs/00/00/0000a5ef-5af7-401b-84b5-ff2a51c10421: trusted.glusterfs.pathinfo: No such attribute
> >
> > I had to totally rebuild the dead RAID array and did a copy from
> > the live one before activating gluster on the rebuilt system. I
> > accidentally copied over the .glusterfs folder from the working side
> > (replica 2 only for now - adding arbiter node as soon as I can get
> > this one cleaned up).
> >
> > I've run the methods from
> > http://docs.gluster.org/en/latest/Troubleshooting/gfid-to-path/ with
> > no results using random GFIDs. A full systematic run using the
> > script from method 3 crashes with a "too many nested links" error
> > (or something similar).
> >
> > When I run gluster volume heal volname info, I get 700K+ GFIDs. Oh,
> > and this is gluster 3.8.4 on CentOS 7.3.
> >
> > Should I just remove the contents of the .glusterfs folder on both
> > bricks, restart gluster, and then run an ls/stat on every file?
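> >
> > (If I go the ls/stat route, this is roughly what I have in mind from
> > a client mount, just a sketch with the mount point as a placeholder:
> >
> >     find /mnt/<volname> -noleaf -print0 | xargs -0 stat > /dev/null
> >
> > i.e. force a lookup on every file.)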
> >
> > When I run a heal, it no longer has a decreasing number of files to
> > heal so that's an improvement over the last 2-3 weeks :-)
> >
> > On Tue, 2017-10-17 at 14:34 +0000, Matt Waymack wrote:
> >
> > > Attached is the heal log for the volume as well as the shd log. 
> > > > > Run these commands on all the bricks of the replica pair to
> > > > > get the attrs set on the backend.
> > >
> > > [root at tpc-cent-glus1-081017 ~]# getfattr -d -e hex -m .
> > > /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
> > > getfattr: Removing leading '/' from absolute path names
> > > # file: exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-
> > > ad6a15d811a2
> > > security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162
> > > 656c65645f743a733000
> > > trusted.afr.dirty=0x000000000000000000000000
> > > trusted.afr.gv0-client-2=0x000000000000000100000000
> > > trusted.gfid=0x108694dbc0394b7cbd3dad6a15d811a2
> > > trusted.gfid2path.9a2f5ada22eb9c45=0x38633262623330322d323466332d
> > > 346463622d393630322d3839356136396461363131662f435f564f4c2d6230303
> > > 12d693637342d63642d63772e6d6435
> > >  
> > > [root at tpc-cent-glus2-081017 ~]# getfattr -d -e hex -m .
> > > /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
> > > getfattr: Removing leading '/' from absolute path names
> > > # file: exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-
> > > ad6a15d811a2
> > > security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162
> > > 656c65645f743a733000
> > > trusted.afr.dirty=0x000000000000000000000000
> > > trusted.afr.gv0-client-2=0x000000000000000100000000
> > > trusted.gfid=0x108694dbc0394b7cbd3dad6a15d811a2
> > > trusted.gfid2path.9a2f5ada22eb9c45=0x38633262623330322d323466332d
> > > 346463622d393630322d3839356136396461363131662f435f564f4c2d6230303
> > > 12d693637342d63642d63772e6d6435
> > >  
> > > [root at tpc-arbiter1-100617 ~]# getfattr -d -e hex -m .
> > > /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
> > > getfattr: /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-
> > > ad6a15d811a2: No such file or directory
> > >  
> > >  
> > > [root at tpc-cent-glus1-081017 ~]# getfattr -d -e hex -m .
> > > /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> > > getfattr: Removing leading '/' from absolute path names
> > > # file: exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-
> > > e46b92d33df3
> > > security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162
> > > 656c65645f743a733000
> > > trusted.afr.dirty=0x000000000000000000000000
> > > trusted.afr.gv0-client-11=0x000000000000000100000000
> > > trusted.gfid=0xe0c56bf78bfe46cabde1e46b92d33df3
> > > trusted.gfid2path.be3ba24c3ef95ff2=0x63323366353834652d353566652d
> > > 343033382d393131622d3866373063656334616136662f435f564f4c2d6230303
> > > 32d69313331342d63642d636d2d63722e6d6435
> > >  
> > > [root at tpc-cent-glus2-081017 ~]# getfattr -d -e hex -m .
> > > /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> > > getfattr: Removing leading '/' from absolute path names
> > > # file: exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-
> > > e46b92d33df3
> > > security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162
> > > 656c65645f743a733000
> > > trusted.afr.dirty=0x000000000000000000000000
> > > trusted.afr.gv0-client-11=0x000000000000000100000000
> > > trusted.gfid=0xe0c56bf78bfe46cabde1e46b92d33df3
> > > trusted.gfid2path.be3ba24c3ef95ff2=0x63323366353834652d353566652d
> > > 343033382d393131622d3866373063656334616136662f435f564f4c2d6230303
> > > 32d69313331342d63642d636d2d63722e6d6435
> > >  
> > > [root at tpc-arbiter1-100617 ~]# getfattr -d -e hex -m .
> > > /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> > > getfattr: /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-
> > > e46b92d33df3: No such file or directory
> > >  
> > > > > And the output of "gluster volume heal <volname> info split-brain"
> > >
> > > [root at tpc-cent-glus1-081017 ~]# gluster volume heal gv0 info
> > > split-brain
> > > Brick tpc-cent-glus1-081017:/exp/b1/gv0
> > > Status: Connected
> > > Number of entries in split-brain: 0
> > >  
> > > Brick tpc-cent-glus2-081017:/exp/b1/gv0
> > > Status: Connected
> > > Number of entries in split-brain: 0
> > >  
> > > Brick tpc-arbiter1-100617:/exp/b1/gv0
> > > Status: Connected
> > > Number of entries in split-brain: 0
> > >  
> > > Brick tpc-cent-glus1-081017:/exp/b2/gv0
> > > Status: Connected
> > > Number of entries in split-brain: 0
> > >  
> > > Brick tpc-cent-glus2-081017:/exp/b2/gv0
> > > Status: Connected
> > > Number of entries in split-brain: 0
> > >  
> > > Brick tpc-arbiter1-100617:/exp/b2/gv0
> > > Status: Connected
> > > Number of entries in split-brain: 0
> > >  
> > > Brick tpc-cent-glus1-081017:/exp/b3/gv0
> > > Status: Connected
> > > Number of entries in split-brain: 0
> > >  
> > > Brick tpc-cent-glus2-081017:/exp/b3/gv0
> > > Status: Connected
> > > Number of entries in split-brain: 0
> > >  
> > > Brick tpc-arbiter1-100617:/exp/b3/gv0
> > > Status: Connected
> > > Number of entries in split-brain: 0
> > >  
> > > Brick tpc-cent-glus1-081017:/exp/b4/gv0
> > > Status: Connected
> > > Number of entries in split-brain: 0
> > >  
> > > Brick tpc-cent-glus2-081017:/exp/b4/gv0
> > > Status: Connected
> > > Number of entries in split-brain: 0
> > >  
> > > Brick tpc-arbiter1-100617:/exp/b4/gv0
> > > Status: Connected
> > > Number of entries in split-brain: 0
> > >  
> > > -Matt
> > >  
> > > From: Karthik Subrahmanya [mailto:ksubrahm at redhat.com] 
> > > Sent: Tuesday, October 17, 2017 1:26 AM
> > > To: Matt Waymack <mwaymack at nsgdv.com>
> > > Cc: gluster-users <Gluster-users at gluster.org>
> > > Subject: Re: [Gluster-users] gfid entries in volume heal info
> > > that do not heal
> > >  
> > > Hi Matt,
> > >  
> > > Run these commands on all the bricks of the replica pair to get
> > > the attrs set on the backend.
> > >  
> > > On the bricks of first replica set:
> > > getfattr -d -e hex -m . <brick path>/.glusterfs/10/86/108694db-
> > > c039-4b7c-bd3d-ad6a15d811a2
> > > On the fourth replica set:
> > > getfattr -d -e hex -m . <brick path>/.glusterfs/e0/c5/e0c56bf7-
> > > 8bfe-46ca-bde1-e46b92d33df3
> > > Also run the "gluster volume heal <volname>" once and send the
> > > shd log.
> > > And the output of "gluster volume heal <volname> info split-
> > > brain"
> > > Regards,
> > > Karthik
> > >  
> > > On Mon, Oct 16, 2017 at 9:51 PM, Matt Waymack <mwaymack at nsgdv.com> wrote:
> > > OK, so here’s my output of the volume info and the heal info. I
> > > have not yet tracked down the physical location of these files;
> > > any tips on finding them would be appreciated, but I’m definitely
> > > just wanting them gone.  I forgot to mention earlier that the
> > > cluster is running 3.12 and was upgraded from 3.10; these files
> > > were likely stuck like this when it was on 3.10.
> > >  
> > > [root at tpc-cent-glus1-081017 ~]# gluster volume info gv0
> > >  
> > > Volume Name: gv0
> > > Type: Distributed-Replicate
> > > Volume ID: 8f07894d-e3ab-4a65-bda1-9d9dd46db007
> > > Status: Started
> > > Snapshot Count: 0
> > > Number of Bricks: 4 x (2 + 1) = 12
> > > Transport-type: tcp
> > > Bricks:
> > > Brick1: tpc-cent-glus1-081017:/exp/b1/gv0
> > > Brick2: tpc-cent-glus2-081017:/exp/b1/gv0
> > > Brick3: tpc-arbiter1-100617:/exp/b1/gv0 (arbiter)
> > > Brick4: tpc-cent-glus1-081017:/exp/b2/gv0
> > > Brick5: tpc-cent-glus2-081017:/exp/b2/gv0
> > > Brick6: tpc-arbiter1-100617:/exp/b2/gv0 (arbiter)
> > > Brick7: tpc-cent-glus1-081017:/exp/b3/gv0
> > > Brick8: tpc-cent-glus2-081017:/exp/b3/gv0
> > > Brick9: tpc-arbiter1-100617:/exp/b3/gv0 (arbiter)
> > > Brick10: tpc-cent-glus1-081017:/exp/b4/gv0
> > > Brick11: tpc-cent-glus2-081017:/exp/b4/gv0
> > > Brick12: tpc-arbiter1-100617:/exp/b4/gv0 (arbiter)
> > > Options Reconfigured:
> > > nfs.disable: on
> > > transport.address-family: inet
> > >  
> > > [root at tpc-cent-glus1-081017 ~]# gluster volume heal gv0 info
> > > Brick tpc-cent-glus1-081017:/exp/b1/gv0
> > > <gfid:108694db-c039-4b7c-bd3d-ad6a15d811a2>
> > > <gfid:6d5ade20-8996-4de2-95d5-20ef98004742>
> > > <gfid:bc6cdc3d-5c46-4597-a7eb-282b21e9bdd5>
> > > <gfid:3c2ff4d1-3662-4214-8f21-f8f47dbdbf06>
> > > <gfid:053e2fb1-bc89-476e-a529-90dffa39963c>
> > >  
> > > <removed to save scrolling>
> > >  
> > > Status: Connected
> > > Number of entries: 118
> > >  
> > > Brick tpc-cent-glus2-081017:/exp/b1/gv0
> > > <gfid:108694db-c039-4b7c-bd3d-ad6a15d811a2>
> > > <gfid:6d5ade20-8996-4de2-95d5-20ef98004742>
> > > <gfid:bc6cdc3d-5c46-4597-a7eb-282b21e9bdd5>
> > > <gfid:3c2ff4d1-3662-4214-8f21-f8f47dbdbf06>
> > > <gfid:053e2fb1-bc89-476e-a529-90dffa39963c>
> > >  
> > > <removed to save scrolling>
> > >  
> > > Status: Connected
> > > Number of entries: 118
> > >  
> > > Brick tpc-arbiter1-100617:/exp/b1/gv0
> > > Status: Connected
> > > Number of entries: 0
> > >  
> > > Brick tpc-cent-glus1-081017:/exp/b2/gv0
> > > Status: Connected
> > > Number of entries: 0
> > >  
> > > Brick tpc-cent-glus2-081017:/exp/b2/gv0
> > > Status: Connected
> > > Number of entries: 0
> > >  
> > > Brick tpc-arbiter1-100617:/exp/b2/gv0
> > > Status: Connected
> > > Number of entries: 0
> > >  
> > > Brick tpc-cent-glus1-081017:/exp/b3/gv0
> > > Status: Connected
> > > Number of entries: 0
> > >  
> > > Brick tpc-cent-glus2-081017:/exp/b3/gv0
> > > Status: Connected
> > > Number of entries: 0
> > >  
> > > Brick tpc-arbiter1-100617:/exp/b3/gv0
> > > Status: Connected
> > > Number of entries: 0
> > >  
> > > Brick tpc-cent-glus1-081017:/exp/b4/gv0
> > > <gfid:e0c56bf7-8bfe-46ca-bde1-e46b92d33df3>
> > > <gfid:6f0a0549-8669-46de-8823-d6677fdca8e3>
> > > <gfid:d0e2fb2a-21b5-4ea8-a578-0801280b2530>
> > > <gfid:48bff79c-7bc2-4dc5-8b7f-4401b27fdf5a>
> > > <gfid:5902593d-a059-4ec7-b18b-7a2ab5c49a50>
> > > <gfid:cb821178-4621-4fcf-90f3-5b5c2ad7f756>
> > > <gfid:6aea0805-8dd1-437c-b922-52c9d11e488a>
> > > <gfid:f4076a37-2e2f-4d7a-90dd-0a3560a4bdff>
> > > <gfid:51ff7386-a550-4971-957c-b42c4d915e9f>
> > > <gfid:4309f7b8-3a9d-4bc8-ba2b-799f8a02611b>
> > > <gfid:b76746ec-6d7d-4ea3-a001-c96672a4d47e>
> > > <gfid:f8de26e7-d17d-41e0-adcd-e7d24ed74ac8>
> > > <gfid:8e2c4540-e0b4-4006-bb5d-aacd57f8f21b>
> > > <gfid:183ebefb-b827-4cbc-b42b-bfd136d5cabb>
> > > <gfid:88d492fe-bfbd-4463-ba55-0582d0ad671b>
> > > <gfid:e3a6c068-d48b-44b5-9480-245a69648a9b>
> > > <gfid:4aab9c6a-22d2-469a-a688-7b0a8784f4b1>
> > > <gfid:c6d182f2-7e46-4502-a0d2-b92824caa4de>
> > > <gfid:eb546f93-e9d6-4a59-ac35-6139b5c40919>
> > > <gfid:6043e381-7edf-4569-bc37-e27dd13549d2>
> > > <gfid:52090dc7-7a3c-40f9-9c54-3395f5158eab>
> > > <gfid:ecceee46-4310-421e-b56e-5fe46bd5263c>
> > > <gfid:354aea57-4b40-47fc-8ede-1d7e3b7501b4>
> > > <gfid:d43284d4-86aa-42ff-98b8-f6340b407d9d>
> > > Status: Connected
> > > Number of entries: 24
> > >  
> > > Brick tpc-cent-glus2-081017:/exp/b4/gv0
> > > <gfid:e0c56bf7-8bfe-46ca-bde1-e46b92d33df3>
> > > <gfid:6f0a0549-8669-46de-8823-d6677fdca8e3>
> > > <gfid:d0e2fb2a-21b5-4ea8-a578-0801280b2530>
> > > <gfid:48bff79c-7bc2-4dc5-8b7f-4401b27fdf5a>
> > > <gfid:5902593d-a059-4ec7-b18b-7a2ab5c49a50>
> > > <gfid:cb821178-4621-4fcf-90f3-5b5c2ad7f756>
> > > <gfid:6aea0805-8dd1-437c-b922-52c9d11e488a>
> > > <gfid:f4076a37-2e2f-4d7a-90dd-0a3560a4bdff>
> > > <gfid:51ff7386-a550-4971-957c-b42c4d915e9f>
> > > <gfid:4309f7b8-3a9d-4bc8-ba2b-799f8a02611b>
> > > <gfid:b76746ec-6d7d-4ea3-a001-c96672a4d47e>
> > > <gfid:f8de26e7-d17d-41e0-adcd-e7d24ed74ac8>
> > > <gfid:8e2c4540-e0b4-4006-bb5d-aacd57f8f21b>
> > > <gfid:183ebefb-b827-4cbc-b42b-bfd136d5cabb>
> > > <gfid:88d492fe-bfbd-4463-ba55-0582d0ad671b>
> > > <gfid:e3a6c068-d48b-44b5-9480-245a69648a9b>
> > > <gfid:4aab9c6a-22d2-469a-a688-7b0a8784f4b1>
> > > <gfid:c6d182f2-7e46-4502-a0d2-b92824caa4de>
> > > <gfid:eb546f93-e9d6-4a59-ac35-6139b5c40919>
> > > <gfid:6043e381-7edf-4569-bc37-e27dd13549d2>
> > > <gfid:52090dc7-7a3c-40f9-9c54-3395f5158eab>
> > > <gfid:ecceee46-4310-421e-b56e-5fe46bd5263c>
> > > <gfid:354aea57-4b40-47fc-8ede-1d7e3b7501b4>
> > > <gfid:d43284d4-86aa-42ff-98b8-f6340b407d9d>
> > > Status: Connected
> > > Number of entries: 24
> > >  
> > > Brick tpc-arbiter1-100617:/exp/b4/gv0
> > > Status: Connected
> > > Number of entries: 0
> > >  
> > > Thank you for your help!
> > >  
> > > From: Karthik Subrahmanya [mailto:ksubrahm at redhat.com]
> > > Sent: Monday, October 16, 2017 10:27 AM
> > > To: Matt Waymack <mwaymack at nsgdv.com>
> > > Cc: gluster-users <Gluster-users at gluster.org>
> > > Subject: Re: [Gluster-users] gfid entries in volume heal info
> > > that do not heal
> > >  
> > > Hi Matt, 
> > >  
> > > The files might be in split brain. Could you please send the
> > > outputs of these? 
> > > gluster volume info <volname>
> > > gluster volume heal <volname> info
> > > And also the getfattr output of the files which are in the heal
> > > info output from all the bricks of that replica pair.
> > > getfattr -d -e hex -m . <file path on brick>
> > >  
> > > Thanks &  Regards
> > > Karthik
> > >  
> > > On 16-Oct-2017 8:16 PM, "Matt Waymack" <mwaymack at nsgdv.com> wrote:
> > > Hi all,
> > >  
> > > I have a volume where the output of volume heal info shows
> > > several gfid entries to be healed, but they’ve been there for
> > > weeks and have not healed.  Any normal file that shows up on the
> > > heal info does get healed as expected, but these gfid entries do
> > > not.  Is there any way to remove these orphaned entries from the
> > > volume so they are no longer stuck in the heal process?
> > >  
> > > Thank you!
> > >  
> > > _______________________________________________
> > > Gluster-users mailing list
> > > Gluster-users at gluster.org
> > > http://lists.gluster.org/mailman/listinfo/gluster-users
>