[Gluster-users] 3.7.13, index healing broken?

Pranith Kumar Karampuri pkarampu at redhat.com
Wed Jul 13 05:36:36 UTC 2016


On Wed, Jul 13, 2016 at 10:58 AM, Dmitry Melekhov <dm at belkam.com> wrote:

> 13.07.2016 09:26, Pranith Kumar Karampuri wrote:
>
>
>
> On Wed, Jul 13, 2016 at 10:50 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>
>> 13.07.2016 09:16, Pranith Kumar Karampuri wrote:
>>
>>
>>
>> On Wed, Jul 13, 2016 at 10:38 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>>
>>> 13.07.2016 09:04, Pranith Kumar Karampuri wrote:
>>>
>>>
>>>
>>> On Wed, Jul 13, 2016 at 10:29 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>>>
>>>> 13.07.2016 08:56, Pranith Kumar Karampuri wrote:
>>>>
>>>>
>>>>
>>>> On Wed, Jul 13, 2016 at 10:23 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>>>>
>>>>> 13.07.2016 08:46, Pranith Kumar Karampuri wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Jul 13, 2016 at 10:10 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>>>>>
>>>>>> 13.07.2016 08:36, Pranith Kumar Karampuri wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Jul 13, 2016 at 9:35 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>>>>>>
>>>>>>> 13.07.2016 01:52, Anuradha Talur wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> ----- Original Message -----
>>>>>>>>
>>>>>>>>> From: "Dmitry Melekhov" <dm at belkam.com>
>>>>>>>>> To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>>>>>>>>> Cc: "gluster-users" <gluster-users at gluster.org>
>>>>>>>>> Sent: Tuesday, July 12, 2016 9:27:17 PM
>>>>>>>>> Subject: Re: [Gluster-users] 3.7.13, index healing broken?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 12.07.2016 17:39, Pranith Kumar Karampuri wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Wow, what are the steps to recreate the problem?
>>>>>>>>>
>>>>>>>>> Just set the file length to zero; it is always reproducible.
>>>>>>>>>
>>>>>>>> If you are setting the file length to 0 on one of the bricks (looks
>>>>>>>> like that is the case), it is not a bug.
>>>>>>>>
>>>>>>>> Index heal relies on failures seen from the mount point(s)
>>>>>>>> to identify the files that need heal. It won't be able to recognize
>>>>>>>> any file modification done directly on bricks. The same goes for the
>>>>>>>> heal info command, which is why heal info also shows 0 entries.
>>>>>>>>
>>>>>>>
>>>>>>> Well, this makes self-heal useless then: if any file is accidentally
>>>>>>> corrupted or deleted (yes, a file deleted directly from a brick is not
>>>>>>> recognized by index heal either), it will not be self-healed, because
>>>>>>> self-heal uses index heal.
>>>>>>>
>>>>>>
>>>>>> It is better to look into the bit-rot feature if you want to guard
>>>>>> against these kinds of problems.
>>>>>>
>>>>>>
>>>>>> Bit-rot detection finds bit-level corruption, not missing files or
>>>>>> wrong lengths, i.e. it is overhead for such a simple task.
>>>>>>
>>>>>
>>>>> It detects a wrong length too, because the checksum won't match anymore.
>>>>>
>>>>>
>>>>> Yes, sure. I guess that it will detect missing files too. But doesn't it
>>>>> need far more resources than just comparing directories across bricks?
>>>>>
>>>>>
>>>>> Which use case that you are trying out leads to changing things
>>>>> directly on the brick?
>>>>>
>>>>> I'm trying to test gluster failure tolerance and right now I'm not
>>>>> happy with it...
>>>>>
>>>>
>>>> Which cases of fault tolerance are you not happy with? Making changes
>>>> directly on the brick or anything else as well?
>>>>
>>>> I'll repeat:
>>>> As I already said, if for some reason (in a real case, only by
>>>> accident) I delete a file, this will not be detected by the self-heal
>>>> daemon and, thus, will lead to a lower replication level, i.e. lower
>>>> failure tolerance.
>>>>
>>>
>>> To prevent such accidents you need to set selinux policies so that files
>>> under the brick are not modified by accident by any user. At least that is
>>> the solution I remember when this was discussed 3-4 years back.
>>>
>>> So the only supported platform is Linux? Or maybe it is better to improve
>>> self-healing to detect missing or wrong-length files; I guess this is a
>>> very low-cost operation in terms of host resources.
>>> Just a suggestion; maybe we need to look at alternatives in the near
>>> future....
>>>
>>> This is a corner case; from a design perspective it is generally not a
>>> good idea to optimize for the corner case. It is better to protect
>>> ourselves from the corner case (SELinux etc.), or you can also use
>>> snapshots to protect against these kinds of mishaps.
>>
>> Sorry, I don't agree.
>> As you know, if a missing or wrong-length file is accessed from a fuse
>> client, it is restored (healed), i.e. gluster recognizes that the file is
>> wrong and heals it, so I do not see any reason not to provide the same
>> function in self-healing.
>> Thank you!
>>
>> Ah! Now how do you suggest we keep track of which of the tens of millions
>> of files the user accidentally deleted from the brick without gluster's
>> knowledge? Once it comes to gluster's knowledge we can do something. But
>> how does gluster become aware of something it is not keeping track of? At
>> the time you access it, gluster knows something went wrong, so it restores
>> it. If you change something on the bricks, even by accident, all the data
>> gluster keeps (similar to a journal) is a waste. Even disk filesystems
>> will ask you to run fsck if something unexpected happens, so a full
>> self-heal is a similar operation.
>
>
> You are absolutely right. The question is why gluster does not become aware
> of such a problem in the case of self-healing?
>

Because operations that are performed directly on the brick do not go
through the gluster stack.
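
To make the distinction concrete, here is a toy model in Python. It is only
a sketch and not Gluster code: the class ToyReplica and all of its methods
are made up for illustration, and bricks are plain dictionaries. A write
that goes through write() records which brick missed it, so index_heal()
knows what to repair. A change made straight to a brick dictionary leaves no
record, so only a crawl that compares every file on every brick can notice it.

```python
"""Toy model of index heal versus a full crawl. Hypothetical names throughout."""


class ToyReplica:
    def __init__(self, brick_count=2):
        # Each brick is a dict mapping file name -> file contents (bytes).
        self.bricks = [dict() for _ in range(brick_count)]
        # Ids of bricks that are currently unreachable.
        self.down = set()
        # The "index": file name -> ids of bricks that missed a write.
        self.index = {}

    def write(self, name, data):
        """Write through the stack; a missed brick is recorded in the index."""
        for brick_id, brick in enumerate(self.bricks):
            if brick_id in self.down:
                self.index.setdefault(name, set()).add(brick_id)
            else:
                brick[name] = data

    def index_heal(self):
        """Repair only the files listed in the index; return their names."""
        healed = []
        for name, stale in sorted(self.index.items()):
            if stale & self.down:
                continue  # a stale brick is still down, try again later
            source = next(
                (b for i, b in enumerate(self.bricks) if i not in stale), None
            )
            if source is None:
                continue  # no brick holds a good copy
            for brick_id in stale:
                self.bricks[brick_id][name] = source[name]
            healed.append(name)
        for name in healed:
            del self.index[name]
        return healed

    def full_crawl(self):
        """Compare every file on every brick; report missing or resized ones."""
        names = set().union(*self.bricks)
        return sorted(
            n for n in names
            if len({len(b[n]) if n in b else None for b in self.bricks}) > 1
        )


vol = ToyReplica()
vol.write("a.txt", b"hello")
vol.down.add(1)
vol.write("b.txt", b"world")   # brick 1 misses this write, so it is indexed
vol.down.clear()
vol.bricks[0]["a.txt"] = b""   # truncate directly on brick 0, bypassing write()

print(sorted(vol.index))       # ['b.txt']
print(vol.index_heal())        # ['b.txt']
print(vol.full_crawl())        # ['a.txt']
```

The truncated a.txt never shows up in the index, which is the situation in
this thread. The crawl does find it, but it has to look at every file, which
is why it is not something to run continuously on tens of millions of files.
On a real volume the corresponding operation is the full heal, started with
"gluster volume heal <VOLNAME> full".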


>
>
>
> --
> Pranith
>
>
>


-- 
Pranith