[Gluster-users] gluster 5.6: Gfid mismatch detected

Hu Bert revirii at googlemail.com
Wed May 22 07:59:13 UTC 2019


Hi Ravi,

The volume is mounted at /shared/public, so the complete paths are
/shared/public/staticmap/120/710/ and
/shared/public/staticmap/120/710/120710351/ .

getfattr -n glusterfs.gfid.string /shared/public/staticmap/120/710/
getfattr: Removing leading '/' from absolute path names
# file: shared/public/staticmap/120/710/
glusterfs.gfid.string="751233b0-7789-4550-bd95-4dd9c8f57c19"

getfattr -n glusterfs.gfid.string /shared/public/staticmap/120/710/120710351/
getfattr: Removing leading '/' from absolute path names
# file: shared/public/staticmap/120/710/120710351/
glusterfs.gfid.string="eaf2f31e-b4a7-4fa8-b710-d6ff9cd4eace"
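
For anyone repeating this check: the comparison Ravi asked for (is the gfid of 710 the parent gfid named in the "Gfid mismatch detected" log line quoted below?) can be scripted. This is only a minimal sketch. The helper name gfid_from_getfattr is made up for illustration, and the sample text is the getfattr output pasted above rather than a live call.

```shell
#!/bin/sh
# Hypothetical helper: read `getfattr -n glusterfs.gfid.string` output on
# stdin and print only the gfid value.
gfid_from_getfattr() {
    sed -n 's/^glusterfs\.gfid\.string="\(.*\)"$/\1/p'
}

# Sample output copied from this mail; on a live system one would pipe
# the real getfattr output into the helper instead.
sample='# file: shared/public/staticmap/120/710/
glusterfs.gfid.string="751233b0-7789-4550-bd95-4dd9c8f57c19"'

# Parent gfid taken from the MSGID 108008 log line.
expected="751233b0-7789-4550-bd95-4dd9c8f57c19"
actual=$(printf '%s\n' "$sample" | gfid_from_getfattr)

if [ "$actual" = "$expected" ]; then
    echo "gfid matches"
else
    echo "gfid differs: $actual"
fi
```
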

So that matches. It took a couple of attempts to resolve this,
and none of the commands reported success:

gluster3 (host with the "fail"):
gluster volume heal workdata split-brain source-brick \
    gluster1:/gluster/md4/workdata \
    /shared/public/staticmap/120/710/120710351/
Lookup failed on /shared/public/staticmap/120/710:No such file or directory
Volume heal failed.

gluster1 ("good" host):
gluster volume heal workdata split-brain source-brick \
    gluster1:/gluster/md4/workdata \
    /shared/public/staticmap/120/710/120710351/
Lookup failed on /shared/public/staticmap/120/710:No such file or directory
Volume heal failed.
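
One difference worth noting: Ravi's suggested command (quoted below) uses the path as seen from the volume root, /staticmap/120/710/120710351, while both attempts above passed the path under the mount point. That may be why the lookup failed, although this was not verified here. A minimal sketch of stripping the mount prefix follows. to_volume_path is a hypothetical helper, not a gluster command, and the script only prints the heal command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: strip the FUSE mount prefix from a path so that it
# is expressed relative to the volume root.
to_volume_path() {
    mount_point=${1%/}   # drop a trailing slash from the mount point
    path=${2%/}          # drop a trailing slash from the path
    case "$path" in
        "$mount_point"/*) printf '%s\n' "${path#"$mount_point"}" ;;
        *) echo "path is not below $mount_point" >&2; return 1 ;;
    esac
}

vol_path=$(to_volume_path /shared/public \
    /shared/public/staticmap/120/710/120710351/)

# Print the command for review rather than executing it.
echo "gluster volume heal workdata split-brain source-brick" \
     "gluster1:/gluster/md4/workdata $vol_path"
```
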

Only in the logs do I see:

[2019-05-22 07:42:22.004182] I [MSGID: 108026]
[afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do]
0-workdata-replicate-0: performing metadata selfheal on
eaf2f31e-b4a7-4fa8-b710-d6ff9cd4eace
[2019-05-22 07:42:22.008502] I [MSGID: 108026]
[afr-self-heal-common.c:1729:afr_log_selfheal] 0-workdata-replicate-0:
Completed metadata selfheal on eaf2f31e-b4a7-4fa8-b710-d6ff9cd4eace.
sources=0 [1]  sinks=2

And "gluster volume heal workdata statistics heal-count" reports
0 entries left. The files and directories are there. This is the first
time this has happened with this setup, but everything is OK now.
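
A small sketch for checking this in a script: it sums the per-brick "Number of entries:" lines. The sample input is taken from the earlier "heal info" output quoted below (3, 3 and 1 entries). It assumes that "statistics heal-count" prints the same "Number of entries:" lines, and total_entries is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: sum all "Number of entries: N" lines read from stdin.
total_entries() {
    awk -F': *' '/^Number of entries:/ { sum += $2 } END { print sum + 0 }'
}

# Per-brick counts copied from the earlier heal info output in this thread.
sample='Brick gluster1:/gluster/md4/workdata
Number of entries: 3
Brick gluster2:/gluster/md4/workdata
Number of entries: 3
Brick gluster3:/gluster/md4/workdata
Number of entries: 1'

total=$(printf '%s\n' "$sample" | total_entries)
echo "entries pending heal: $total"
```

On a live system the output of the heal-count command would be piped into total_entries instead of the sample text; a total of 0 would correspond to the state described above.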

Thanks for your fast help :-)


Hubert

On Wed, 22 May 2019 at 09:32, Ravishankar N
<ravishankar at redhat.com> wrote:
>
>
> On 22/05/19 12:39 PM, Hu Bert wrote:
> > Hi all,
> >
> > today I updated and rebooted the 3 servers of my replica 3 setup;
> > after the 3rd one came up again I noticed this error:
> >
> > [2019-05-22 06:41:26.781165] E [MSGID: 108008]
> > [afr-self-heal-common.c:392:afr_gfid_split_brain_source]
> > 0-workdata-replicate-0: Gfid mismatch detected for
> > <gfid:751233b0-7789-4550-bd95-4dd9c8f57c19>/120710351>,
> > 82025ab3-8034-4257-9628-d8ebde909629 on workdata-client-2 and
> > eaf2f31e-b4a7-4fa8-b710-d6ff9cd4eace on workdata-client-1.
>
> 120710351 seems to be the entry that is in split-brain. Is
> /staticmap/120/710/120710351 the complete path to that entry? (check if
> gfid:751233b0-7789-4550-bd95-4dd9c8f57c19 corresponds to the gfid of 710).
>
> You can then try "gluster volume heal workdata split-brain source-brick
> gluster1:/gluster/md4/workdata /staticmap/120/710/120710351"
>
> -Ravi
>
> > [2019-05-22 06:41:27.069969] W [MSGID: 108027]
> > [afr-common.c:2270:afr_attempt_readsubvol_set] 0-workdata-replicate-0:
> > no read subvols for /staticmap/120/710/120710351
> > [2019-05-22 06:41:27.808532] W [fuse-bridge.c:582:fuse_entry_cbk]
> > 0-glusterfs-fuse: 1834335: LOOKUP() /staticmap/120/710/120710351 => -1
> > (Transport endpoint is not connected)
> >
> > A simple 'gluster volume heal workdata' didn't help; 'gluster volume
> > heal workdata info' says:
> >
> > Brick gluster1:/gluster/md4/workdata
> > /staticmap/120/710
> > /staticmap/120/710/120710351
> > <gfid:fe7fdbe8-9a39-4793-8d38-6dfdd3d5089b>
> > Status: Connected
> > Number of entries: 3
> >
> > Brick gluster2:/gluster/md4/workdata
> > /staticmap/120/710
> > /staticmap/120/710/120710351
> > <gfid:fe7fdbe8-9a39-4793-8d38-6dfdd3d5089b>
> > Status: Connected
> > Number of entries: 3
> >
> > Brick gluster3:/gluster/md4/workdata
> > /staticmap/120/710/120710351
> > Status: Connected
> > Number of entries: 1
> >
> > There's a mismatch in one directory; I tried to follow these instructions:
> > https://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/
> >
> > gluster volume heal workdata split-brain source-brick \
> >     gluster1:/gluster/md4/workdata \
> >     gfid:fe7fdbe8-9a39-4793-8d38-6dfdd3d5089b
> > Healing gfid:fe7fdbe8-9a39-4793-8d38-6dfdd3d5089b failed: File not in
> > split-brain.
> > Volume heal failed.
>
> >
> > Is there any other documentation for gfid mismatch and how to resolve this?
> >
> >
> > Thx,
> > Hubert
> > _______________________________________________
> > Gluster-users mailing list
> > Gluster-users at gluster.org
> > https://lists.gluster.org/mailman/listinfo/gluster-users

