[Gluster-users] 3.7.13, index healing broken?

Pranith Kumar Karampuri pkarampu at redhat.com
Wed Jul 13 10:34:46 UTC 2016


On Wed, Jul 13, 2016 at 3:09 PM, Dmitry Melekhov <dm at belkam.com> wrote:

> 13.07.2016 13:24, Pranith Kumar Karampuri wrote:
>
>
>
> On Wed, Jul 13, 2016 at 2:50 PM, Dmitry Melekhov <dm at belkam.com> wrote:
>
>> 13.07.2016 13:10, Pranith Kumar Karampuri wrote:
>>
>>
>>
>> On Wed, Jul 13, 2016 at 2:25 PM, Dmitry Melekhov <dm at belkam.com> wrote:
>>
>>> 13.07.2016 11:40, Pranith Kumar Karampuri wrote:
>>>
>>>>
>>>> Your recipe doesn't work :-(  If the brick directories differ because of
>>>> direct brick manipulation, it leads to problems.
>>>>
>>>> You have to execute "gluster volume heal <volname> full" for triggering
>>>> full heal.
>>>>
>>>> Yeah, but I need to know that I need to execute it.
>>> Is there any help from gluster for that, or only an external script?
>>>
>>>
>>> I guess it is not too difficult to set up cron/systemd.timer to run this
>> command once in a while right?
>>
>>
>> Too difficult? No.
>> So you are suggesting running heal full from cron, right?
>> Really, I don't know how many resources this full heal may need in real
>> installations.
>> If not many, why doesn't self-heal call it?
>>
>
> Because we don't expect people to touch the bricks. For a corner case it
> doesn't make sense to keep doing a full filesystem scan. But we do provide
> the CLI for people who want it.
>
>
>
> Well, why run heal every 10 minutes if no problems are expected?
>

What we realized is that people sometimes run into space/quota exceeded
problems which lead to pending heals, so it is better to run index heal once
every few minutes.
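
As a rough illustration of the index heal discussed here, the following Python sketch builds the gluster CLI invocations for triggering an index heal, listing pending heals, and changing the interval through the cluster.heal-timeout volume option. The helper names and the `runner` parameter are hypothetical and only for illustration; the sketch assumes the `gluster` binary is on PATH on a node of the cluster and that the caller supplies the volume name.

```python
import subprocess


def index_heal_cmd(volname):
    """Command that explicitly triggers an index heal on a volume."""
    return ["gluster", "volume", "heal", volname]


def heal_info_cmd(volname):
    """Command that lists entries still pending heal on a volume."""
    return ["gluster", "volume", "heal", volname, "info"]


def heal_timeout_cmd(volname, seconds):
    """Command that changes how often the periodic index heal runs."""
    if not isinstance(seconds, int) or seconds <= 0:
        raise ValueError("heal timeout must be a positive integer")
    return ["gluster", "volume", "set", volname,
            "cluster.heal-timeout", str(seconds)]


def run(cmd, runner=subprocess.run):
    """Run a command; `runner` is injectable (hypothetical) for dry runs."""
    return runner(cmd, check=True, capture_output=True, text=True)
```

For example, `run(heal_timeout_cmd("myvol", 300))` would ask for an index heal every 300 seconds on a volume hypothetically named `myvol`.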


> From your link:
>
> The index heal is done:
> a) Every 600 seconds (can be changed via the cluster.heal-timeout volume
> option)
> b) When it is explicitly triggered via the gluster vol heal <VOLNAME>
> command
> c) Whenever a replica brick that was down comes back up.
>
> As I understand it, this index heal runs once per volume, not on a specific
> node; this is why there is a self-heal daemon,
> otherwise this could be achieved by cron. If the node running cron is down,
> then I'll get no full heal, though I can, definitely, run the next full heal
> on a different node by cron :-)
>
>
>
>> What script do you need to write? I didn't get you.
>>
>>
>> One which compares the brick directories and, if there is a real need, alerts
>> me, so that I can run heal full or, maybe, just trigger healing of the files
>> by reading some files over fuse.
>> Could you please tell me how heal full works and why it is not part of the
>> self-heal process?
>>
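
A minimal sketch of such a comparison script, in Python, might look like the following. It is only an illustration under strong assumptions: both brick directories are reachable from the machine running it (in a real cluster they normally live on different nodes), the two bricks belong to the same replica set, and only names are compared, not contents or extended attributes. The function names are hypothetical. The `.glusterfs` directory at the brick root is skipped because it holds gluster's internal metadata.

```python
import os


def list_brick_entries(brick_root):
    """Return the set of relative paths (files and directories) in a brick."""
    entries = set()
    for dirpath, dirnames, filenames in os.walk(brick_root):
        # Skip gluster's internal metadata directory at the brick root.
        if dirpath == brick_root and ".glusterfs" in dirnames:
            dirnames.remove(".glusterfs")
        rel_dir = os.path.relpath(dirpath, brick_root)
        for name in dirnames + filenames:
            entries.add(os.path.normpath(os.path.join(rel_dir, name)))
    return entries


def diff_bricks(brick_a, brick_b):
    """Return (only_in_a, only_in_b) as sorted lists of relative paths."""
    a = list_brick_entries(brick_a)
    b = list_brick_entries(brick_b)
    return sorted(a - b), sorted(b - a)
```

If either returned list is non-empty, the script could send an alert so that a full heal is run by hand.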
>
> You can read more about it at:
> https://github.com/gluster/glusterfs/blob/master/doc/developer-guide/afr-self-heal-daemon.md
>
>
> Thank you!
> I think it would be wise to add a full heal interval to the self-heal daemon.
>

This may not be a bad idea. Want to raise an RFE bug?
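
Until such an option exists, the cron/systemd.timer approach mentioned earlier in the thread could be sketched roughly as below. This is an assumption-laden illustration, not anything the self-heal daemon does: the function names, the `runner` parameter, and the idea of passing in the time of the previous run are all hypothetical. Only the `gluster volume heal <volname> full` command itself comes from the discussion above.

```python
import subprocess
import time


def full_heal_cmd(volname):
    """Command that triggers a full heal on a volume."""
    return ["gluster", "volume", "heal", volname, "full"]


def full_heal_due(last_run, now, interval):
    """True if no full heal has run yet or `interval` seconds have passed."""
    return last_run is None or now - last_run >= interval


def maybe_full_heal(volname, last_run, interval, now=None,
                    runner=subprocess.run):
    """Run a full heal if one is due; return the time of the latest run."""
    now = time.time() if now is None else now
    if not full_heal_due(last_run, now, interval):
        return last_run
    # `runner` is injectable (hypothetical) so the logic can be dry-run.
    runner(full_heal_cmd(volname), check=True)
    return now
```

A cron job or systemd timer would call `maybe_full_heal` with a stored timestamp of the previous run, so that a job scheduled on more than one node does not rescan the filesystem more often than intended, provided the timestamp is shared between the nodes.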


>
>
>
>>
>> Thank you!
>>
>>
>> --
>> Pranith
>>
>>
>>
>
>
> --
> Pranith
>
>
>


-- 
Pranith