[Gluster-users] self-heal not working

Ravishankar N ravishankar at redhat.com
Tue Aug 22 09:51:57 UTC 2017



On 08/22/2017 02:30 PM, mabi wrote:
> Thanks for the additional hints. I have the following 2 questions first:
>
> - In order to launch the index heal, is the following command correct:
> `gluster volume heal myvolume`
>
Yes
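For example, from any one of the nodes:

  # trigger the index heal
  gluster volume heal myvolume
  # then, on each of the three nodes, watch the self-heal daemon log for errors
  tail -f /var/log/glusterfs/glustershd.log

(/var/log/glusterfs/glustershd.log is the default shd log location; adjust if yours differs.)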
> - If I run a "volume start force", will it cause any short disruptions 
> on my clients which mount the volume through FUSE? If yes, how long? 
> This is a production system, that's why I am asking.
>
>
No. You can actually create a test volume on your personal Linux box to 
try these kinds of things without needing multiple machines. This is how 
we develop and test our patches :)
`gluster volume create testvol replica 3 /home/mabi/bricks/brick{1..3} 
force` and so on.
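If you want the test volume to mimic your production layout (replica 2 + arbiter), a rough single-box sketch would be the following; the volume name, brick paths and mount point are only placeholders, and it assumes glusterd is running and your hostname resolves:

  # three bricks on one machine, the third acting as arbiter
  mkdir -p /home/mabi/bricks/brick{1..3} /mnt/testvol
  gluster volume create testvol replica 3 arbiter 1 \
      $(hostname):/home/mabi/bricks/brick1 \
      $(hostname):/home/mabi/bricks/brick2 \
      $(hostname):/home/mabi/bricks/brick3 force
  gluster volume start testvol
  # mount it over FUSE, then run 'start force' and see for yourself that the client stays mounted
  mount -t glusterfs $(hostname):/testvol /mnt/testvol
  gluster volume start testvol force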

HTH,
Ravi

>
>> -------- Original Message --------
>> Subject: Re: [Gluster-users] self-heal not working
>> Local Time: August 22, 2017 6:26 AM
>> UTC Time: August 22, 2017 4:26 AM
>> From: ravishankar at redhat.com
>> To: mabi <mabi at protonmail.ch>, Ben Turner <bturner at redhat.com>
>> Gluster Users <gluster-users at gluster.org>
>>
>>
>> Explore the following:
>>
>> - Launch index heal and look at the glustershd logs of all bricks for 
>> possible errors
>>
>> - See if the glustershd in each node is connected to all bricks.
>>
>> - If not try to restart shd by `volume start force`
>>
>> - Launch index heal again and try.
>>
>> - Try debugging the shd log by setting client-log-level to DEBUG 
>> temporarily.
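To make the steps above concrete, the matching commands would look roughly like this (the volume name is yours, and the DEBUG log level is only meant to be temporary):

  gluster volume heal myvolume                  # launch index heal
  gluster volume status myvolume                # a Self-heal Daemon entry should be online for every node
  grep -i "connected to" /var/log/glusterfs/glustershd.log   # on each node: is shd connected to all bricks?
  gluster volume start myvolume force           # restarts shd/brick processes; clients stay mounted
  gluster volume set myvolume diagnostics.client-log-level DEBUG
  gluster volume heal myvolume                  # run the heal again and re-check the shd log
  gluster volume reset myvolume diagnostics.client-log-level  # back to the default afterwards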
>>
>>
>> On 08/22/2017 03:19 AM, mabi wrote:
>>> Sure, it doesn't look like a split brain based on the output:
>>>
>>> Brick node1.domain.tld:/data/myvolume/brick
>>> Status: Connected
>>> Number of entries in split-brain: 0
>>>
>>> Brick node2.domain.tld:/data/myvolume/brick
>>> Status: Connected
>>> Number of entries in split-brain: 0
>>>
>>> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick
>>> Status: Connected
>>> Number of entries in split-brain: 0
>>>
>>>
>>>
>>>
>>>> -------- Original Message --------
>>>> Subject: Re: [Gluster-users] self-heal not working
>>>> Local Time: August 21, 2017 11:35 PM
>>>> UTC Time: August 21, 2017 9:35 PM
>>>> From: bturner at redhat.com
>>>> To: mabi <mabi at protonmail.ch>
>>>> Gluster Users <gluster-users at gluster.org>
>>>>
>>>> Can you also provide:
>>>>
>>>> gluster v heal <my vol> info split-brain
>>>>
>>>> If it is split brain, just delete the incorrect file from the brick
>>>> and run heal again. I haven't tried this with arbiter but I assume
>>>> the process is the same.
>>>>
>>>> -b
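Side note, in case it does turn out to be split-brain at some point: "delete the incorrect file from the brick" means removing both the file and its gfid hard link under .glusterfs on the bad brick, directly on the brick filesystem and never through the client mount. Roughly like this (paths are only an illustration based on the file in this thread; the gfid path placeholders need to be filled in from the trusted.gfid xattr):

  # on the brick that holds the bad copy only
  rm /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
  rm /data/myvolume/brick/.glusterfs/<first-2-hex-of-gfid>/<next-2-hex>/<full-gfid>
  gluster volume heal myvolume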
>>>>
>>>> ----- Original Message -----
>>>> > From: "mabi" <mabi at protonmail.ch>
>>>> > To: "Ben Turner" <bturner at redhat.com>
>>>> > Cc: "Gluster Users" <gluster-users at gluster.org>
>>>> > Sent: Monday, August 21, 2017 4:55:59 PM
>>>> > Subject: Re: [Gluster-users] self-heal not working
>>>> >
>>>> > Hi Ben,
>>>> >
>>>> > So it is really a 0 kBytes file everywhere (all nodes including the arbiter and from the client).
>>>> > Here below you will find the output you requested. Hopefully that will help to find out why this specific file is not healing... Let me know if you need any more information. Btw node3 is my arbiter node.
>>>> >
>>>> > NODE1:
>>>> >
>>>> > STAT:
>>>> > File: ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’
>>>> > Size: 0 Blocks: 38 IO Block: 131072 regular empty file
>>>> > Device: 24h/36d Inode: 10033884 Links: 2
>>>> > Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data)
>>>> > Access: 2017-08-14 17:04:55.530681000 +0200
>>>> > Modify: 2017-08-14 17:11:46.407404779 +0200
>>>> > Change: 2017-08-14 17:11:46.407404779 +0200
>>>> > Birth: -
>>>> >
>>>> > GETFATTR:
>>>> > trusted.afr.dirty=0sAAAAAQAAAAAAAAAA
>>>> > trusted.bit-rot.version=0sAgAAAAAAAABZhuknAAlJAg==
>>>> > trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
>>>> > trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOyo=
>>>> >
>>>> > NODE2:
>>>> >
>>>> > STAT:
>>>> > File: ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’
>>>> > Size: 0 Blocks: 38 IO Block: 131072 regular empty file
>>>> > Device: 26h/38d Inode: 10031330 Links: 2
>>>> > Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data)
>>>> > Access: 2017-08-14 17:04:55.530681000 +0200
>>>> > Modify: 2017-08-14 17:11:46.403704181 +0200
>>>> > Change: 2017-08-14 17:11:46.403704181 +0200
>>>> > Birth: -
>>>> >
>>>> > GETFATTR:
>>>> > trusted.afr.dirty=0sAAAAAQAAAAAAAAAA
>>>> > trusted.bit-rot.version=0sAgAAAAAAAABZhu6wAA8Hpw==
>>>> > trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
>>>> > trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOVE=
>>>> >
>>>> > NODE3:
>>>> > STAT:
>>>> > File: /srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
>>>> > Size: 0 Blocks: 0 IO Block: 4096 regular empty file
>>>> > Device: ca11h/51729d Inode: 405208959 Links: 2
>>>> > Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data)
>>>> > Access: 2017-08-14 17:04:55.530681000 +0200
>>>> > Modify: 2017-08-14 17:04:55.530681000 +0200
>>>> > Change: 2017-08-14 17:11:46.604380051 +0200
>>>> > Birth: -
>>>> >
>>>> > GETFATTR:
>>>> > trusted.afr.dirty=0sAAAAAQAAAAAAAAAA
>>>> > trusted.bit-rot.version=0sAgAAAAAAAABZe6ejAAKPAg==
>>>> > trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
>>>> > trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOc4=
>>>> >
>>>> > CLIENT GLUSTER MOUNT:
>>>> > STAT:
>>>> > File: "/mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png"
>>>> > Size: 0 Blocks: 0 IO Block: 131072 regular empty file
>>>> > Device: 1eh/30d Inode: 11897049013408443114 Links: 1
>>>> > Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data)
>>>> > Access: 2017-08-14 17:04:55.530681000 +0200
>>>> > Modify: 2017-08-14 17:11:46.407404779 +0200
>>>> > Change: 2017-08-14 17:11:46.407404779 +0200
>>>> > Birth: -
>>>> >
>>>> > > -------- Original Message --------
>>>> > > Subject: Re: [Gluster-users] self-heal not working
>>>> > > Local Time: August 21, 2017 9:34 PM
>>>> > > UTC Time: August 21, 2017 7:34 PM
>>>> > > From: bturner at redhat.com
>>>> > > To: mabi <mabi at protonmail.ch>
>>>> > > Gluster Users <gluster-users at gluster.org>
>>>> > >
>>>> > > ----- Original Message -----
>>>> > >> From: "mabi" <mabi at protonmail.ch>
>>>> > >> To: "Gluster Users" <gluster-users at gluster.org>
>>>> > >> Sent: Monday, August 21, 2017 9:28:24 AM
>>>> > >> Subject: [Gluster-users] self-heal not working
>>>> > >>
>>>> > >> Hi,
>>>> > >>
>>>> > >> I have a replica 2 with arbiter GlusterFS 3.8.11 cluster and there is currently one file listed to be healed, as you can see below, but it never gets healed by the self-heal daemon:
>>>> > >>
>>>> > >> Brick node1.domain.tld:/data/myvolume/brick
>>>> > >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
>>>> > >> Status: Connected
>>>> > >> Number of entries: 1
>>>> > >>
>>>> > >> Brick node2.domain.tld:/data/myvolume/brick
>>>> > >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
>>>> > >> Status: Connected
>>>> > >> Number of entries: 1
>>>> > >>
>>>> > >> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick
>>>> > >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
>>>> > >> Status: Connected
>>>> > >> Number of entries: 1
>>>> > >>
>>>> > >> As once recommended on this mailing list, I have mounted that glusterfs volume temporarily through fuse/glusterfs and ran a "stat" on that file which is listed above, but nothing happened.
>>>> > >>
>>>> > >> The file itself is available on all 3 nodes/bricks, but on the last node it has a different date. By the way, this file is 0 kBytes big. Is that maybe the reason why the self-heal does not work?
>>>> > >
>>>> > > Is the file actually 0 bytes or is it just 0 bytes on the arbiter (0 bytes are expected on the arbiter, it just stores metadata)? Can you send us the output from stat on all 3 nodes:
>>>> > >
>>>> > > $ stat <file on back end brick>
>>>> > > $ getfattr -d -m - <file on back end brick>
>>>> > > $ stat <file from gluster mount>
>>>> > >
>>>> > > Let's see what things look like on the back end; it should tell us why healing is failing.
>>>> > >
>>>> > > -b
>>>> > >
>>>> > >>
>>>> > >> And how can I now make this file to heal?
>>>> > >>
>>>> > >> Thanks,
>>>> > >> Mabi
>>>> > >>
>>>> > >>
>>>> > >>
>>>> > >>
>>>> > >> _______________________________________________
>>>> > >> Gluster-users mailing list
>>>> > >> Gluster-users at gluster.org
>>>> > >> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>>
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>
