[Gluster-users] 3.7.13, index healing broken?
Dmitry Melekhov
dm at belkam.com
Wed Jul 13 05:28:13 UTC 2016
13.07.2016 09:26, Pranith Kumar Karampuri wrote:
>
>
> On Wed, Jul 13, 2016 at 10:50 AM, Dmitry Melekhov <dm at belkam.com>
> wrote:
>
> 13.07.2016 09:16, Pranith Kumar Karampuri wrote:
>>
>>
>> On Wed, Jul 13, 2016 at 10:38 AM, Dmitry Melekhov <dm at belkam.com>
>> wrote:
>>
>> 13.07.2016 09:04, Pranith Kumar Karampuri wrote:
>>>
>>>
>>> On Wed, Jul 13, 2016 at 10:29 AM, Dmitry Melekhov
>>> <dm at belkam.com> wrote:
>>>
>>> 13.07.2016 08:56, Pranith Kumar Karampuri wrote:
>>>>
>>>>
>>>> On Wed, Jul 13, 2016 at 10:23 AM, Dmitry Melekhov
>>>> <dm at belkam.com> wrote:
>>>>
>>>> 13.07.2016 08:46, Pranith Kumar Karampuri wrote:
>>>>>
>>>>>
>>>>> On Wed, Jul 13, 2016 at 10:10 AM, Dmitry Melekhov
>>>>> <dm at belkam.com> wrote:
>>>>>
>>>>> 13.07.2016 08:36, Pranith Kumar Karampuri wrote:
>>>>>>
>>>>>>
>>>>>> On Wed, Jul 13, 2016 at 9:35 AM, Dmitry Melekhov
>>>>>> <dm at belkam.com> wrote:
>>>>>>
>>>>>> 13.07.2016 01:52, Anuradha Talur wrote:
>>>>>>
>>>>>>
>>>>>> ----- Original Message -----
>>>>>>
>>>>>> From: "Dmitry Melekhov" <dm at belkam.com>
>>>>>> To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>>>>>> Cc: "gluster-users" <gluster-users at gluster.org>
>>>>>> Sent: Tuesday, July 12, 2016 9:27:17 PM
>>>>>> Subject: Re: [Gluster-users] 3.7.13, index healing broken?
>>>>>>
>>>>>>
>>>>>>
>>>>>> 12.07.2016 17:39, Pranith Kumar Karampuri wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Wow, what are the steps to
>>>>>> recreate the problem?
>>>>>>
>>>>>> just set file length to zero,
>>>>>> always reproducible.
>>>>>>
>>>>>> If you are setting the file length to
>>>>>> 0 on one of the bricks (looks like
>>>>>> that is the case), it is not a bug.
>>>>>>
>>>>>> Index heal relies on failures seen
>>>>>> from the mount point(s)
>>>>>> to identify the files that need heal.
>>>>>> It won't be able to recognize any file
>>>>>> modification done directly on bricks.
>>>>>> Same goes for heal info command which
>>>>>> is the reason heal info also shows 0
>>>>>> entries.
>>>>>>
>>>>>>
>>>>>> Well, this makes self-heal useless then:
>>>>>> if any file is accidentally corrupted or
>>>>>> deleted (yes! if a file is deleted directly
>>>>>> from a brick, this is not recognized by index
>>>>>> heal either), then it will not be
>>>>>> self-healed, because self-heal uses index
>>>>>> heal.
>>>>>>
>>>>>>
>>>>>> It is better to look into the bit-rot feature if
>>>>>> you want to guard against these kinds of
>>>>>> problems.
>>>>>
>>>>> Bit rot detects bit problems, not missing
>>>>> files or wrong lengths, i.e. it is too much
>>>>> overhead for such a simple task.
>>>>>
>>>>>
>>>>> It detects a wrong length, because the checksum won't
>>>>> match anymore.
>>>>
>>>> Yes, sure. I guess that it will detect missing files
>>>> too. But doesn't it need far more resources than just
>>>> comparing directories in bricks?
>>>>>
>>>>> Which use case that you are trying out leads to
>>>>> changing things directly on the brick?
>>>> I'm trying to test gluster failure tolerance and
>>>> right now I'm not happy with it...
>>>>
>>>>
>>>> Which cases of fault tolerance are you not happy with?
>>>> Making changes directly on the brick or anything else
>>>> as well?
>>>>
>>> I'll repeat:
>>> As I already said, if for some reason (in a real case it can
>>> only be by accident) I delete a file, this will not be
>>> detected by the self-heal daemon and will thus lead to a
>>> lower replication level, i.e. lower failure tolerance.
>>>
>>>
>>> To prevent such accidents you need to set SELinux policies
>>> so that files under the brick are not modified by accident
>>> by any user. At least that is the solution I remember when
>>> this was discussed 3-4 years back.
>>>
>> So the only supported platform is Linux? Or maybe it is better
>> to improve self-healing to detect missing or wrong-length
>> files; I guess this is a very low-cost operation in terms of
>> host resources.
>> Just a suggestion; maybe we will need to look at alternatives in
>> the near future....
>>
>> This is a corner case. From a design perspective it is generally
>> not a good idea to optimize for the corner case. It is better to
>> protect ourselves from the corner case (SELinux etc.), or you can
>> also use snapshots to protect against these kinds of mishaps.
>>
> Sorry, I don't agree.
> As you know, if a missing or wrong-length file is accessed from a fuse
> client, it is restored (healed), i.e. gluster recognizes that the file is
> wrong and heals it, so I do not see any reason not to provide
> the same function in self-healing.
> Thank you!
>
> Ah! Now how do you suggest we keep track of which of 10s of millions
> of files the user accidentally deleted from the brick without
> gluster's knowledge? Once it comes to gluster's knowledge we can do
> something. But how does gluster become aware of something it is not
> keeping track of? At the time you access it gluster knows something
> went wrong so it restores it. If you change something on the bricks
> even by accident all the data gluster keeps (similar to a journal) is a
> waste. Even disk filesystems will ask you to run fsck if something
> unexpected happens, so a full self-heal is a similar operation.
You are absolutely right. The question is why gluster does not become aware
of such a problem in the case of self-healing?
>
>
> --
> Pranith
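
A minimal sketch of the "comparing directories in bricks" idea raised in the thread, for illustration only. It is not how gluster's self-heal daemon works. It assumes both bricks belong to the same replica set and are reachable as local paths (in a real cluster they sit on different hosts), and the function names scan_brick and compare_bricks are hypothetical. It skips the .glusterfs metadata directory at the brick root and reports files that are missing on one side or differ in size.

```python
import os


def scan_brick(root):
    """Map each regular file's path (relative to root) to its size in bytes."""
    sizes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip gluster's internal metadata directory at the brick root.
        if dirpath == root and ".glusterfs" in dirnames:
            dirnames.remove(".glusterfs")
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Only count regular files; ignore symlinks.
            if os.path.isfile(full) and not os.path.islink(full):
                sizes[os.path.relpath(full, root)] = os.path.getsize(full)
    return sizes


def compare_bricks(brick_a, brick_b):
    """Return (missing_on_a, missing_on_b, size_mismatch) as sorted lists."""
    a = scan_brick(brick_a)
    b = scan_brick(brick_b)
    missing_on_a = sorted(set(b) - set(a))
    missing_on_b = sorted(set(a) - set(b))
    size_mismatch = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return missing_on_a, missing_on_b, size_mismatch
```

Note that the scan has to visit every file on both bricks, so its cost grows with the number of files, and a size mismatch alone does not say which copy is the good one. That is the gap the index (journal-like) data fills when changes go through the mount point.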