[Gluster-users] 3.7.13, index healing broken?

Dmitry Melekhov dm at belkam.com
Wed Jul 13 05:28:13 UTC 2016


13.07.2016 09:26, Pranith Kumar Karampuri wrote:
>
>
> On Wed, Jul 13, 2016 at 10:50 AM, Dmitry Melekhov <dm at belkam.com 
> <mailto:dm at belkam.com>> wrote:
>
>     13.07.2016 09:16, Pranith Kumar Karampuri wrote:
>>
>>
>>     On Wed, Jul 13, 2016 at 10:38 AM, Dmitry Melekhov <dm at belkam.com
>>     <mailto:dm at belkam.com>> wrote:
>>
>>         13.07.2016 09:04, Pranith Kumar Karampuri wrote:
>>>
>>>
>>>         On Wed, Jul 13, 2016 at 10:29 AM, Dmitry Melekhov
>>>         <dm at belkam.com <mailto:dm at belkam.com>> wrote:
>>>
>>>             13.07.2016 08:56, Pranith Kumar Karampuri wrote:
>>>>
>>>>
>>>>             On Wed, Jul 13, 2016 at 10:23 AM, Dmitry Melekhov
>>>>             <dm at belkam.com <mailto:dm at belkam.com>> wrote:
>>>>
>>>>                 13.07.2016 08:46, Pranith Kumar Karampuri wrote:
>>>>>
>>>>>
>>>>>                 On Wed, Jul 13, 2016 at 10:10 AM, Dmitry Melekhov
>>>>>                 <dm at belkam.com <mailto:dm at belkam.com>> wrote:
>>>>>
>>>>>                     13.07.2016 08:36, Pranith Kumar Karampuri wrote:
>>>>>>
>>>>>>
>>>>>>                     On Wed, Jul 13, 2016 at 9:35 AM, Dmitry
>>>>>>                     Melekhov <dm at belkam.com
>>>>>>                     <mailto:dm at belkam.com>> wrote:
>>>>>>
>>>>>>                         13.07.2016 01:52, Anuradha Talur wrote:
>>>>>>
>>>>>>
>>>>>>                             ----- Original Message -----
>>>>>>
>>>>>>                                 From: "Dmitry Melekhov"
>>>>>>                                 <dm at belkam.com
>>>>>>                                 <mailto:dm at belkam.com>>
>>>>>>                                 To: "Pranith Kumar Karampuri"
>>>>>>                                 <pkarampu at redhat.com
>>>>>>                                 <mailto:pkarampu at redhat.com>>
>>>>>>                                 Cc: "gluster-users"
>>>>>>                                 <gluster-users at gluster.org
>>>>>>                                 <mailto:gluster-users at gluster.org>>
>>>>>>                                 Sent: Tuesday, July 12, 2016
>>>>>>                                 9:27:17 PM
>>>>>>                                 Subject: Re: [Gluster-users]
>>>>>>                                 3.7.13, index healing broken?
>>>>>>
>>>>>>
>>>>>>
>>>>>>                                 12.07.2016 17:39, Pranith Kumar
>>>>>>                                 Karampuri wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>                                 Wow, what are the steps to
>>>>>>                                 recreate the problem?
>>>>>>
>>>>>>                                 just set file length to zero,
>>>>>>                                 always reproducible.
>>>>>>
>>>>>>                             If you are setting the file length to
>>>>>>                             0 on one of the bricks (looks like
>>>>>>                             that is the case), it is not a bug.
>>>>>>
>>>>>>                             Index heal relies on failures seen
>>>>>>                             from the mount point(s)
>>>>>>                             to identify the files that need heal.
>>>>>>                             It won't be able to recognize any file
>>>>>>                             modification done directly on bricks.
>>>>>>                             The same goes for the heal info command,
>>>>>>                             which is why heal info also shows 0
>>>>>>                             entries.
>>>>>>
>>>>>>
>>>>>>                         Well, this makes self-heal useless then:
>>>>>>                         if any file is accidentally corrupted or
>>>>>>                         deleted (yes! if a file is deleted directly
>>>>>>                         from a brick, this is not recognized by index
>>>>>>                         heal either), then it will not be
>>>>>>                         self-healed, because self-heal uses index
>>>>>>                         heal.
>>>>>>
>>>>>>
>>>>>>                     It is better to look into the bit-rot feature
>>>>>>                     if you want to guard against these kinds of
>>>>>>                     problems.
>>>>>
>>>>>                     Bit-rot detection finds bit-level corruption,
>>>>>                     not missing files or wrong lengths, i.e. it is
>>>>>                     too much overhead for such a simple task.
>>>>>
>>>>>
>>>>>                 It detects wrong length, because the checksum won't
>>>>>                 match anymore.
>>>>
>>>>                 Yes, sure. I guess that it will detect missing files
>>>>                 too. But doesn't it need far more resources than just
>>>>                 comparing directories in bricks?
>>>>>
>>>>>                 What use case are you trying out that leads to
>>>>>                 changing things directly on the brick?
>>>>                 I'm trying to test gluster failure tolerance and
>>>>                 right now I'm not happy with it...
>>>>
>>>>
>>>>             Which cases of fault tolerance are you not happy with?
>>>>             Making changes directly on the brick or anything else
>>>>             as well?
>>>>
>>>             I'll repeat:
>>>             As I already said, if for some reason (in a real case, only
>>>             by accident) I delete a file, this will not be detected by
>>>             the self-heal daemon and thus will lead to a lower
>>>             replication level, i.e. lower failure tolerance.
>>>
>>>
>>>         To prevent such accidents you need to set SELinux policies
>>>         so that files under the brick are not modified by accident
>>>         by any user. At least that is the solution I remember from when
>>>         this was discussed 3-4 years back.
>>>
>>         So the only supported platform is Linux? Or maybe it is better
>>         to improve self-healing to detect missing or wrong-length
>>         files; I guess this is a very low-cost operation in terms of
>>         host resources.
>>         Just a suggestion. Maybe we need to look at alternatives in
>>         the near future....
>>
>>     This is a corner case. From a design perspective it is generally
>>     not a good idea to optimize for the corner case. It is better to
>>     protect ourselves from the corner case (SELinux etc.), or you can
>>     also use snapshots to protect against these kinds of mishaps.
>>
>     Sorry, I don't agree.
>     As you know, when a missing or wrong-length file is accessed from a
>     FUSE client, it is restored (healed), i.e. gluster recognizes that the
>     file is wrong and heals it, so I do not see any reason not to provide
>     the same function in self-healing.
>     Thank you!
>
> Ah! Now how do you suggest we keep track of which of the tens of
> millions of files the user accidentally deleted from the brick without
> gluster's knowledge? Once it comes to gluster's knowledge we can do
> something. But how does gluster become aware of something it is not
> keeping track of? At the time you access the file, gluster knows
> something went wrong, so it restores it. If you change something on the
> bricks, even by accident, all the data gluster keeps (similar to a
> journal) is wasted. Even disk filesystems will ask you to run fsck if
> something unexpected happens, so a full self-heal is a similar operation.

You are absolutely right. The question is why gluster does not become
aware of such a problem in the case of self-healing.
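
For illustration, the cheap check suggested earlier in this thread
(comparing directories in bricks to find missing or wrong-length files)
could look roughly like the Python sketch below. It is not how gluster's
self-heal daemon works. The names scan_brick and compare_bricks are
hypothetical, the sketch assumes both replica bricks are reachable as
local paths (in practice they sit on different hosts), and it skips the
.glusterfs metadata directory at the brick root. A size mismatch only
says that the copies differ, not which one is good.

```python
import os


def scan_brick(root):
    """Map each regular file's path (relative to root) to its size in bytes."""
    sizes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: gluster's internal metadata lives in .glusterfs at the
        # brick root and is not user data, so do not descend into it.
        if dirpath == root and ".glusterfs" in dirnames:
            dirnames.remove(".glusterfs")
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                continue  # symlinks are ignored in this sketch
            sizes[os.path.relpath(full, root)] = os.path.getsize(full)
    return sizes


def compare_bricks(brick_a, brick_b):
    """Return (relative_path, problem) pairs for files that differ."""
    a = scan_brick(brick_a)
    b = scan_brick(brick_b)
    problems = []
    for path in sorted(set(a) | set(b)):
        if path not in a:
            problems.append((path, "missing on A"))
        elif path not in b:
            problems.append((path, "missing on B"))
        elif a[path] != b[path]:
            problems.append((path, "size mismatch: %d vs %d" % (a[path], b[path])))
    return problems
```

Note that this walks every file on both bricks, so its cost grows with
the total number of files, which is the scaling concern raised in the
quoted reply above.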

>
>
> -- 
> Pranith
