[Gluster-users] self-heal not working

Ravishankar N ravishankar at redhat.com
Tue Aug 22 04:26:18 UTC 2017


Explore the following (a rough command sketch for these steps follows the list):

- Launch index heal and look at the glustershd logs on all nodes for 
possible errors.

- See if the glustershd on each node is connected to all bricks.

- If not, try restarting the shd with `gluster volume start <volname> force`.

- Launch index heal again and check.

- Try debugging the shd log by temporarily setting client-log-level to 
DEBUG.
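
For example, assuming the volume is named myvolume and default log
locations (adjust names and paths to your setup), the steps above map
roughly onto:

$ gluster volume heal myvolume              # launch index heal
$ gluster volume status myvolume            # check the Self-heal Daemon is online on every node
$ less /var/log/glusterfs/glustershd.log    # inspect the shd log on each node for errors
$ gluster volume start myvolume force       # restart shd if it is not connected to all bricks
$ gluster volume set myvolume diagnostics.client-log-level DEBUG
$ gluster volume heal myvolume              # retry the heal with debug logging
$ gluster volume set myvolume diagnostics.client-log-level INFO   # revert afterwards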

On 08/22/2017 03:19 AM, mabi wrote:
> Sure, it doesn't look like a split brain based on the output:
>
> Brick node1.domain.tld:/data/myvolume/brick
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick node2.domain.tld:/data/myvolume/brick
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick
> Status: Connected
> Number of entries in split-brain: 0
>
>
>
>
>> -------- Original Message --------
>> Subject: Re: [Gluster-users] self-heal not working
>> Local Time: August 21, 2017 11:35 PM
>> UTC Time: August 21, 2017 9:35 PM
>> From: bturner at redhat.com
>> To: mabi <mabi at protonmail.ch>
>> Gluster Users <gluster-users at gluster.org>
>>
>> Can you also provide:
>>
>> gluster v heal <my vol> info split-brain
>>
>> If it is split brain, just delete the incorrect file from the brick
>> and run heal again. I haven't tried this with an arbiter but I assume
>> the process is the same.
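>>
>> For example (illustrative only; the volume name and paths below are
>> placeholders), the manual cleanup on the brick holding the bad copy
>> usually looks something like:
>>
>> # note the file's gfid, then remove both the file and its gfid
>> # hard link under .glusterfs on that brick
>> getfattr -n trusted.gfid -e hex /data/myvolume/brick/path/to/file
>> rm /data/myvolume/brick/path/to/file
>> rm /data/myvolume/brick/.glusterfs/<gfid[0:2]>/<gfid[2:4]>/<full-gfid>
>> gluster volume heal myvolume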
>>
>> -b
>>
>> ----- Original Message -----
>> > From: "mabi" <mabi at protonmail.ch>
>> > To: "Ben Turner" <bturner at redhat.com>
>> > Cc: "Gluster Users" <gluster-users at gluster.org>
>> > Sent: Monday, August 21, 2017 4:55:59 PM
>> > Subject: Re: [Gluster-users] self-heal not working
>> >
>> > Hi Ben,
>> >
>> > So it is really a 0-byte file everywhere (on all nodes including the
>> > arbiter and from the client).
>> > Below you will find the output you requested. Hopefully that will help
>> > to find out why this specific file is not healing... Let me know if you
>> > need any more information. Btw node3 is my arbiter node.
>> >
>> > NODE1:
>> >
>> > STAT:
>> > File: ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’
>> > Size: 0 Blocks: 38 IO Block: 131072 regular empty file
>> > Device: 24h/36d Inode: 10033884 Links: 2
>> > Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data)
>> > Access: 2017-08-14 17:04:55.530681000 +0200
>> > Modify: 2017-08-14 17:11:46.407404779 +0200
>> > Change: 2017-08-14 17:11:46.407404779 +0200
>> > Birth: -
>> >
>> > GETFATTR:
>> > trusted.afr.dirty=0sAAAAAQAAAAAAAAAA
>> > trusted.bit-rot.version=0sAgAAAAAAAABZhuknAAlJAg==
>> > trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
>> > trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOyo=
>> >
>> > NODE2:
>> >
>> > STAT:
>> > File: ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’
>> > Size: 0 Blocks: 38 IO Block: 131072 regular empty file
>> > Device: 26h/38d Inode: 10031330 Links: 2
>> > Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data)
>> > Access: 2017-08-14 17:04:55.530681000 +0200
>> > Modify: 2017-08-14 17:11:46.403704181 +0200
>> > Change: 2017-08-14 17:11:46.403704181 +0200
>> > Birth: -
>> >
>> > GETFATTR:
>> > trusted.afr.dirty=0sAAAAAQAAAAAAAAAA
>> > trusted.bit-rot.version=0sAgAAAAAAAABZhu6wAA8Hpw==
>> > trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
>> > trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOVE=
>> >
>> > NODE3:
>> > STAT:
>> > File: /srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
>> > Size: 0 Blocks: 0 IO Block: 4096 regular empty file
>> > Device: ca11h/51729d Inode: 405208959 Links: 2
>> > Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data)
>> > Access: 2017-08-14 17:04:55.530681000 +0200
>> > Modify: 2017-08-14 17:04:55.530681000 +0200
>> > Change: 2017-08-14 17:11:46.604380051 +0200
>> > Birth: -
>> >
>> > GETFATTR:
>> > trusted.afr.dirty=0sAAAAAQAAAAAAAAAA
>> > trusted.bit-rot.version=0sAgAAAAAAAABZe6ejAAKPAg==
>> > trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
>> > trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOc4=
>> >
>> > CLIENT GLUSTER MOUNT:
>> > STAT:
>> > File: "/mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png"
>> > Size: 0 Blocks: 0 IO Block: 131072 regular empty file
>> > Device: 1eh/30d Inode: 11897049013408443114 Links: 1
>> > Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data)
>> > Access: 2017-08-14 17:04:55.530681000 +0200
>> > Modify: 2017-08-14 17:11:46.407404779 +0200
>> > Change: 2017-08-14 17:11:46.407404779 +0200
>> > Birth: -
>> >
>> > > -------- Original Message --------
>> > > Subject: Re: [Gluster-users] self-heal not working
>> > > Local Time: August 21, 2017 9:34 PM
>> > > UTC Time: August 21, 2017 7:34 PM
>> > > From: bturner at redhat.com
>> > > To: mabi <mabi at protonmail.ch>
>> > > Gluster Users <gluster-users at gluster.org>
>> > >
>> > > ----- Original Message -----
>> > >> From: "mabi" <mabi at protonmail.ch>
>> > >> To: "Gluster Users" <gluster-users at gluster.org>
>> > >> Sent: Monday, August 21, 2017 9:28:24 AM
>> > >> Subject: [Gluster-users] self-heal not working
>> > >>
>> > >> Hi,
>> > >>
>> > >> I have a replica 2 with arbiter GlusterFS 3.8.11 cluster and there is
>> > >> currently one file listed to be healed, as you can see below, but it
>> > >> never gets healed by the self-heal daemon:
>> > >>
>> > >> Brick node1.domain.tld:/data/myvolume/brick
>> > >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
>> > >> Status: Connected
>> > >> Number of entries: 1
>> > >>
>> > >> Brick node2.domain.tld:/data/myvolume/brick
>> > >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
>> > >> Status: Connected
>> > >> Number of entries: 1
>> > >>
>> > >> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick
>> > >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
>> > >> Status: Connected
>> > >> Number of entries: 1
>> > >>
>> > >> As once recommended on this mailing list, I have mounted that glusterfs
>> > >> volume temporarily through fuse/glusterfs and ran a "stat" on the file
>> > >> listed above, but nothing happened.
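>> > >>
>> > >> In other words, something like the following (volume name shown as
>> > >> myvolume for illustration):
>> > >>
>> > >> $ mount -t glusterfs node1.domain.tld:/myvolume /mnt/myvolume
>> > >> $ stat /mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png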
>> > >>
>> > >> The file itself is available on all 3 nodes/bricks, but on the last node
>> > >> it has a different date. By the way, this file is 0 bytes in size. Could
>> > >> that be the reason why the self-heal does not work?
>> > >
>> > > Is the file actually 0 bytes, or is it just 0 bytes on the arbiter? (0
>> > > bytes are expected on the arbiter; it only stores metadata.) Can you send
>> > > us the output from stat on all 3 nodes:
>> > >
>> > > $ stat <file on back end brick>
>> > > $ getfattr -d -m - <file on back end brick>
>> > > $ stat <file from gluster mount>
>> > >
>> > > Let's see what things look like on the back end; it should tell us why
>> > > healing is failing.
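>> > >
>> > > Once you have the getfattr output, the base64 values (the "0s" prefix
>> > > just marks base64 encoding) can be decoded to inspect the pending-heal
>> > > counters, e.g. with an example value:
>> > >
>> > > $ echo AAAAAQAAAAAAAAAA | base64 -d | od -An -tx1
>> > > 00 00 00 01 00 00 00 00 00 00 00 00
>> > >
>> > > The trusted.afr.* values hold three 32-bit big-endian counters, in the
>> > > order data / metadata / entry.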
>> > >
>> > > -b
>> > >
>> > >>
>> > >> And how can I now make this file to heal?
>> > >>
>> > >> Thanks,
>> > >> Mabi
>> > >>
>> > >>
>> > >>
>> > >>
>> > >> _______________________________________________
>> > >> Gluster-users mailing list
>> > >> Gluster-users at gluster.org
>> > >> http://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
