[Gluster-users] Unable to self-heal contents of '<gfid:00000000-0000-0000-0000-000000000001>'
Ravishankar N
ravishankar at redhat.com
Mon Nov 25 11:22:14 UTC 2013
On 11/25/2013 01:47 AM, Mark Ruys wrote:
> So I decided to bite the bullet and upgraded from 3.3 to 3.4. Somehow
> this was a painful process for me (the glusterfs daemon refused to
> start), so I decided to configure our Gluster pool from scratch.
> Everything seems to work nicely, except for the self-heal daemon. In
> the logs, I get the following line every 10 minutes:
>
> [2013-11-24 19:50:34.495204] E
> [afr-self-heal-common.c:197:afr_sh_print_split_brain_log]
> 0-GLUSTER-SHARE-replicate-0: Unable to self-heal contents of
> '<gfid:00000000-0000-0000-0000-000000000001>' (possible split-brain).
> Please delete the file from all but the preferred subvolume.- Pending
> matrix: [ [ 0 2 ] [ 2 0 ] ]
>
>
> I've removed and recreated
> the .glusterfs/00/00/00000000-0000-0000-0000-000000000001, but that
> doesn't seem to make a difference.
>
> How to fix the self-heal daemon?
>
> Mark
>
> # find . -name 00000000-0000-0000-0000-000000000001 -ls
> 1447202 0 ---------- 2 root root 0 Nov 23 22:35 ./export-share-1/.glusterfs/indices/xattrop/00000000-0000-0000-0000-000000000001
> 1319116 0 lrwxrwxrwx 1 root root 8 Nov 23 22:35 ./export-share-1/.glusterfs/00/00/00000000-0000-0000-0000-000000000001 -> ../../..
>
>
> Brick 1:
>
> # getfattr -m . -d -e hex export-share-1
> # file: export-share-1
> trusted.afr.GLUSTER-SHARE-client-0=0x000000000000000000000000
> trusted.afr.GLUSTER-SHARE-client-1=0x000000000000000200000000
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.quota.dirty=0x3000
> trusted.glusterfs.quota.size=0x0000000000000000
> trusted.glusterfs.volume-id=0xe6eb05aabe3b456cbf3027275faa529c
>
>
> Brick 2:
>
> # getfattr -m . -d -e hex export-share-2
> # file: export-share-2
> trusted.afr.GLUSTER-SHARE-client-0=0x000000000000000200000000
> trusted.afr.GLUSTER-SHARE-client-1=0x000000000000000000000000
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.quota.dirty=0x3000
> trusted.glusterfs.quota.size=0x0000000000000000
> trusted.glusterfs.volume-id=0xe6eb05aabe3b456cbf3027275faa529c
>
>
From the afr extended attributes, it seems you have hit a metadata
split-brain on the top-level (brick) directory, the one with gfid
00000000-0000-0000-0000-000000000001: each brick holds a non-zero
pending metadata count (2) against the other, which matches the
pending matrix [ [ 0 2 ] [ 2 0 ] ] in your log. If you are able to
perform I/O on all files from the mount point without errors (EIO),
and the file contents are identical on both bricks (check with
md5sum), you can safely clear the afr extended attributes on the
bricks:
setfattr -n trusted.afr.GLUSTER-SHARE-client-0 -v 0x000000000000000000000000 /export-share-1
setfattr -n trusted.afr.GLUSTER-SHARE-client-1 -v 0x000000000000000000000000 /export-share-1
setfattr -n trusted.afr.GLUSTER-SHARE-client-0 -v 0x000000000000000000000000 /export-share-2
setfattr -n trusted.afr.GLUSTER-SHARE-client-1 -v 0x000000000000000000000000 /export-share-2
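
In case it helps to see how I read those values: a trusted.afr.* value
is 12 bytes, which I am treating here as three 32-bit big-endian pending
counters in the order data, metadata, entry. The helper below is only a
sketch under that assumption (decode_afr is a name I made up, it is not
a gluster tool); it splits the hex string printed by getfattr -e hex
into the three counters.

```shell
#!/bin/sh
# decode_afr (hypothetical helper, not part of glusterfs):
# split a 12-byte trusted.afr.* value, as printed by `getfattr -e hex`,
# into three 32-bit counters. Assumed order: data, metadata, entry.
decode_afr() {
    hex=${1#0x}                              # drop the leading 0x
    if [ "${#hex}" -ne 24 ]; then
        echo "expected 24 hex digits, got ${#hex}" >&2
        return 1
    fi
    data=$(printf '%s' "$hex" | cut -c1-8)    # bytes 0-3
    meta=$(printf '%s' "$hex" | cut -c9-16)   # bytes 4-7
    entry=$(printf '%s' "$hex" | cut -c17-24) # bytes 8-11
    echo "data=$((0x$data)) metadata=$((0x$meta)) entry=$((0x$entry))"
}
```

Running decode_afr 0x000000000000000200000000 (the value brick 1 holds
for client-1, and brick 2 holds for client-0) prints
"data=0 metadata=2 entry=0", i.e. only the metadata counter is pending.
After the setfattr commands above, all four values should decode to
zeros.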
Thanks,
Ravi