[Gluster-users] gfid entries in volume heal info that do not heal

Karthik Subrahmanya ksubrahm at redhat.com
Wed Oct 18 09:34:20 UTC 2017


Hey Matt,

From the xattr output, it looks like the files are not present on the
arbiter brick and need healing, but the parent directory does not have the
pending markers set for those entries.
The workaround is to do a lookup from the mount on each file that needs
heal; that creates the entry on the arbiter brick, and running the volume
heal afterwards will then heal it.
Follow these steps to resolve the issue (first try this on one file and
check whether it gets healed; if it does, repeat for the remaining
files):
1. Get the file path for the gfids you got from the heal info output.
    find <brickpath> -samefile <brickpath>/.glusterfs/<first two
characters of gfid>/<next two characters of gfid>/<full gfid>
2. Do an ls/stat on the file from the mount.
3. Run volume heal.
4. Check the heal info output to see whether the file got healed.

If that one file gets healed, then do steps 1 & 2 for the rest of the
files and do steps 3 & 4 once at the end.
Let me know if that resolves the issue.

Thanks & Regards,
Karthik

On Tue, Oct 17, 2017 at 8:04 PM, Matt Waymack <mwaymack at nsgdv.com> wrote:

> Attached is the heal log for the volume as well as the shd log.
>
> >> Run these commands on all the bricks of the replica pair to get the
> attrs set on the backend.
>
> [root at tpc-cent-glus1-081017 ~]# getfattr -d -e hex -m .
> /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
> getfattr: Removing leading '/' from absolute path names
> # file: exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
> security.selinux=0x73797374656d5f753a6f626a6563
> 745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.gv0-client-2=0x000000000000000100000000
> trusted.gfid=0x108694dbc0394b7cbd3dad6a15d811a2
> trusted.gfid2path.9a2f5ada22eb9c45=0x38633262623330322d323466332d
> 346463622d393630322d3839356136396461363131662f435f564f4c2d62
> 3030312d693637342d63642d63772e6d6435
>
> [root at tpc-cent-glus2-081017 ~]# getfattr -d -e hex -m .
> /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
> getfattr: Removing leading '/' from absolute path names
> # file: exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
> security.selinux=0x73797374656d5f753a6f626a6563
> 745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.gv0-client-2=0x000000000000000100000000
> trusted.gfid=0x108694dbc0394b7cbd3dad6a15d811a2
> trusted.gfid2path.9a2f5ada22eb9c45=0x38633262623330322d323466332d
> 346463622d393630322d3839356136396461363131662f435f564f4c2d62
> 3030312d693637342d63642d63772e6d6435
>
> [root at tpc-arbiter1-100617 ~]# getfattr -d -e hex -m .
> /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
> getfattr: /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2:
> No such file or directory
>
>
> [root at tpc-cent-glus1-081017 ~]# getfattr -d -e hex -m .
> /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> getfattr: Removing leading '/' from absolute path names
> # file: exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> security.selinux=0x73797374656d5f753a6f626a6563
> 745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.gv0-client-11=0x000000000000000100000000
> trusted.gfid=0xe0c56bf78bfe46cabde1e46b92d33df3
> trusted.gfid2path.be3ba24c3ef95ff2=0x63323366353834652d353566652d
> 343033382d393131622d3866373063656334616136662f435f564f4c2d62
> 3030332d69313331342d63642d636d2d63722e6d6435
>
> [root at tpc-cent-glus2-081017 ~]# getfattr -d -e hex -m .
> /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> getfattr: Removing leading '/' from absolute path names
> # file: exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> security.selinux=0x73797374656d5f753a6f626a6563
> 745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.gv0-client-11=0x000000000000000100000000
> trusted.gfid=0xe0c56bf78bfe46cabde1e46b92d33df3
> trusted.gfid2path.be3ba24c3ef95ff2=0x63323366353834652d353566652d
> 343033382d393131622d3866373063656334616136662f435f564f4c2d62
> 3030332d69313331342d63642d636d2d63722e6d6435
>
> [root at tpc-arbiter1-100617 ~]# getfattr -d -e hex -m .
> /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> getfattr: /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3:
> No such file or directory
>
> >> And the output of "gluster volume heal <volname> info split-brain"
>
> [root at tpc-cent-glus1-081017 ~]# gluster volume heal gv0 info split-brain
> Brick tpc-cent-glus1-081017:/exp/b1/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus2-081017:/exp/b1/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-arbiter1-100617:/exp/b1/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus1-081017:/exp/b2/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus2-081017:/exp/b2/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-arbiter1-100617:/exp/b2/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus1-081017:/exp/b3/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus2-081017:/exp/b3/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-arbiter1-100617:/exp/b3/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus1-081017:/exp/b4/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus2-081017:/exp/b4/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-arbiter1-100617:/exp/b4/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> -Matt
>
> From: Karthik Subrahmanya [mailto:ksubrahm at redhat.com]
> Sent: Tuesday, October 17, 2017 1:26 AM
> To: Matt Waymack <mwaymack at nsgdv.com>
> Cc: gluster-users <Gluster-users at gluster.org>
> Subject: Re: [Gluster-users] gfid entries in volume heal info that do not
> heal
>
> Hi Matt,
>
> Run these commands on all the bricks of the replica pair to get the attrs
> set on the backend.
>
> On the bricks of first replica set:
> getfattr -d -e hex -m . <brick path>/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
> On the fourth replica set:
> getfattr -d -e hex -m . <brick path>/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> Also run the "gluster volume heal <volname>" once and send the shd log.
> And the output of "gluster volume heal <volname> info split-brain"
> Regards,
> Karthik
>
> On Mon, Oct 16, 2017 at 9:51 PM, Matt Waymack <mwaymack at nsgdv.com>
> wrote:
> OK, so here’s my output of the volume info and the heal info. I have not
> yet tracked down physical location of these files, any tips to finding them
> would be appreciated, but I’m definitely just wanting them gone.  I forgot
> to mention earlier that the cluster is running 3.12 and was upgraded from
> 3.10; these files were likely stuck like this when it was on 3.10.
>
> [root at tpc-cent-glus1-081017 ~]# gluster volume info gv0
>
> Volume Name: gv0
> Type: Distributed-Replicate
> Volume ID: 8f07894d-e3ab-4a65-bda1-9d9dd46db007
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 4 x (2 + 1) = 12
> Transport-type: tcp
> Bricks:
> Brick1: tpc-cent-glus1-081017:/exp/b1/gv0
> Brick2: tpc-cent-glus2-081017:/exp/b1/gv0
> Brick3: tpc-arbiter1-100617:/exp/b1/gv0 (arbiter)
> Brick4: tpc-cent-glus1-081017:/exp/b2/gv0
> Brick5: tpc-cent-glus2-081017:/exp/b2/gv0
> Brick6: tpc-arbiter1-100617:/exp/b2/gv0 (arbiter)
> Brick7: tpc-cent-glus1-081017:/exp/b3/gv0
> Brick8: tpc-cent-glus2-081017:/exp/b3/gv0
> Brick9: tpc-arbiter1-100617:/exp/b3/gv0 (arbiter)
> Brick10: tpc-cent-glus1-081017:/exp/b4/gv0
> Brick11: tpc-cent-glus2-081017:/exp/b4/gv0
> Brick12: tpc-arbiter1-100617:/exp/b4/gv0 (arbiter)
> Options Reconfigured:
> nfs.disable: on
> transport.address-family: inet
>
> [root at tpc-cent-glus1-081017 ~]# gluster volume heal gv0 info
> Brick tpc-cent-glus1-081017:/exp/b1/gv0
> <gfid:108694db-c039-4b7c-bd3d-ad6a15d811a2>
> <gfid:6d5ade20-8996-4de2-95d5-20ef98004742>
> <gfid:bc6cdc3d-5c46-4597-a7eb-282b21e9bdd5>
> <gfid:3c2ff4d1-3662-4214-8f21-f8f47dbdbf06>
> <gfid:053e2fb1-bc89-476e-a529-90dffa39963c>
>
> <removed to save scrolling>
>
> Status: Connected
> Number of entries: 118
>
> Brick tpc-cent-glus2-081017:/exp/b1/gv0
> <gfid:108694db-c039-4b7c-bd3d-ad6a15d811a2>
> <gfid:6d5ade20-8996-4de2-95d5-20ef98004742>
> <gfid:bc6cdc3d-5c46-4597-a7eb-282b21e9bdd5>
> <gfid:3c2ff4d1-3662-4214-8f21-f8f47dbdbf06>
> <gfid:053e2fb1-bc89-476e-a529-90dffa39963c>
>
> <removed to save scrolling>
>
> Status: Connected
> Number of entries: 118
>
> Brick tpc-arbiter1-100617:/exp/b1/gv0
> Status: Connected
> Number of entries: 0
>
> Brick tpc-cent-glus1-081017:/exp/b2/gv0
> Status: Connected
> Number of entries: 0
>
> Brick tpc-cent-glus2-081017:/exp/b2/gv0
> Status: Connected
> Number of entries: 0
>
> Brick tpc-arbiter1-100617:/exp/b2/gv0
> Status: Connected
> Number of entries: 0
>
> Brick tpc-cent-glus1-081017:/exp/b3/gv0
> Status: Connected
> Number of entries: 0
>
> Brick tpc-cent-glus2-081017:/exp/b3/gv0
> Status: Connected
> Number of entries: 0
>
> Brick tpc-arbiter1-100617:/exp/b3/gv0
> Status: Connected
> Number of entries: 0
>
> Brick tpc-cent-glus1-081017:/exp/b4/gv0
> <gfid:e0c56bf7-8bfe-46ca-bde1-e46b92d33df3>
> <gfid:6f0a0549-8669-46de-8823-d6677fdca8e3>
> <gfid:d0e2fb2a-21b5-4ea8-a578-0801280b2530>
> <gfid:48bff79c-7bc2-4dc5-8b7f-4401b27fdf5a>
> <gfid:5902593d-a059-4ec7-b18b-7a2ab5c49a50>
> <gfid:cb821178-4621-4fcf-90f3-5b5c2ad7f756>
> <gfid:6aea0805-8dd1-437c-b922-52c9d11e488a>
> <gfid:f4076a37-2e2f-4d7a-90dd-0a3560a4bdff>
> <gfid:51ff7386-a550-4971-957c-b42c4d915e9f>
> <gfid:4309f7b8-3a9d-4bc8-ba2b-799f8a02611b>
> <gfid:b76746ec-6d7d-4ea3-a001-c96672a4d47e>
> <gfid:f8de26e7-d17d-41e0-adcd-e7d24ed74ac8>
> <gfid:8e2c4540-e0b4-4006-bb5d-aacd57f8f21b>
> <gfid:183ebefb-b827-4cbc-b42b-bfd136d5cabb>
> <gfid:88d492fe-bfbd-4463-ba55-0582d0ad671b>
> <gfid:e3a6c068-d48b-44b5-9480-245a69648a9b>
> <gfid:4aab9c6a-22d2-469a-a688-7b0a8784f4b1>
> <gfid:c6d182f2-7e46-4502-a0d2-b92824caa4de>
> <gfid:eb546f93-e9d6-4a59-ac35-6139b5c40919>
> <gfid:6043e381-7edf-4569-bc37-e27dd13549d2>
> <gfid:52090dc7-7a3c-40f9-9c54-3395f5158eab>
> <gfid:ecceee46-4310-421e-b56e-5fe46bd5263c>
> <gfid:354aea57-4b40-47fc-8ede-1d7e3b7501b4>
> <gfid:d43284d4-86aa-42ff-98b8-f6340b407d9d>
> Status: Connected
> Number of entries: 24
>
> Brick tpc-cent-glus2-081017:/exp/b4/gv0
> <gfid:e0c56bf7-8bfe-46ca-bde1-e46b92d33df3>
> <gfid:6f0a0549-8669-46de-8823-d6677fdca8e3>
> <gfid:d0e2fb2a-21b5-4ea8-a578-0801280b2530>
> <gfid:48bff79c-7bc2-4dc5-8b7f-4401b27fdf5a>
> <gfid:5902593d-a059-4ec7-b18b-7a2ab5c49a50>
> <gfid:cb821178-4621-4fcf-90f3-5b5c2ad7f756>
> <gfid:6aea0805-8dd1-437c-b922-52c9d11e488a>
> <gfid:f4076a37-2e2f-4d7a-90dd-0a3560a4bdff>
> <gfid:51ff7386-a550-4971-957c-b42c4d915e9f>
> <gfid:4309f7b8-3a9d-4bc8-ba2b-799f8a02611b>
> <gfid:b76746ec-6d7d-4ea3-a001-c96672a4d47e>
> <gfid:f8de26e7-d17d-41e0-adcd-e7d24ed74ac8>
> <gfid:8e2c4540-e0b4-4006-bb5d-aacd57f8f21b>
> <gfid:183ebefb-b827-4cbc-b42b-bfd136d5cabb>
> <gfid:88d492fe-bfbd-4463-ba55-0582d0ad671b>
> <gfid:e3a6c068-d48b-44b5-9480-245a69648a9b>
> <gfid:4aab9c6a-22d2-469a-a688-7b0a8784f4b1>
> <gfid:c6d182f2-7e46-4502-a0d2-b92824caa4de>
> <gfid:eb546f93-e9d6-4a59-ac35-6139b5c40919>
> <gfid:6043e381-7edf-4569-bc37-e27dd13549d2>
> <gfid:52090dc7-7a3c-40f9-9c54-3395f5158eab>
> <gfid:ecceee46-4310-421e-b56e-5fe46bd5263c>
> <gfid:354aea57-4b40-47fc-8ede-1d7e3b7501b4>
> <gfid:d43284d4-86aa-42ff-98b8-f6340b407d9d>
> Status: Connected
> Number of entries: 24
>
> Brick tpc-arbiter1-100617:/exp/b4/gv0
> Status: Connected
> Number of entries: 0
>
> Thank you for your help!
>
> From: Karthik Subrahmanya [mailto:ksubrahm at redhat.com]
> Sent: Monday, October 16, 2017 10:27 AM
> To: Matt Waymack <mwaymack at nsgdv.com>
> Cc: gluster-users <Gluster-users at gluster.org>
> Subject: Re: [Gluster-users] gfid entries in volume heal info that do not
> heal
>
> Hi Matt,
>
> The files might be in split brain. Could you please send the outputs of
> these?
> gluster volume info <volname>
> gluster volume heal <volname> info
> And also the getfattr output of the files which are in the heal info
> output from all the bricks of that replica pair.
> getfattr -d -e hex -m . <file path on brick>
>
> Thanks & Regards,
> Karthik
>
> On 16-Oct-2017 8:16 PM, "Matt Waymack" <mwaymack at nsgdv.com> wrote:
> Hi all,
>
> I have a volume where the output of volume heal info shows several gfid
> entries to be healed, but they’ve been there for weeks and have not
> healed.  Any normal file that shows up on the heal info does get healed as
> expected, but these gfid entries do not.  Is there any way to remove these
> orphaned entries from the volume so they are no longer stuck in the heal
> process?
>
> Thank you!
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>

