[Gluster-users] 3.7.13, index healing broken?

Dmitry Melekhov dm at belkam.com
Wed Jul 13 05:41:02 UTC 2016


13.07.2016 09:36, Pranith Kumar Karampuri wrote:
>
>
> On Wed, Jul 13, 2016 at 10:58 AM, Dmitry Melekhov <dm at belkam.com>
> wrote:
>
>     13.07.2016 09:26, Pranith Kumar Karampuri wrote:
>>
>>
>>     On Wed, Jul 13, 2016 at 10:50 AM, Dmitry Melekhov <dm at belkam.com>
>>     wrote:
>>
>>         13.07.2016 09:16, Pranith Kumar Karampuri wrote:
>>>
>>>
>>>         On Wed, Jul 13, 2016 at 10:38 AM, Dmitry Melekhov
>>>         <dm at belkam.com> wrote:
>>>
>>>             13.07.2016 09:04, Pranith Kumar Karampuri wrote:
>>>>
>>>>
>>>>             On Wed, Jul 13, 2016 at 10:29 AM, Dmitry Melekhov
>>>>             <dm at belkam.com> wrote:
>>>>
>>>>                 13.07.2016 08:56, Pranith Kumar Karampuri wrote:
>>>>>
>>>>>
>>>>>                 On Wed, Jul 13, 2016 at 10:23 AM, Dmitry Melekhov
>>>>>                 <dm at belkam.com> wrote:
>>>>>
>>>>>                     13.07.2016 08:46, Pranith Kumar Karampuri wrote:
>>>>>>
>>>>>>
>>>>>>                     On Wed, Jul 13, 2016 at 10:10 AM, Dmitry
>>>>>>                     Melekhov <dm at belkam.com> wrote:
>>>>>>
>>>>>>                         13.07.2016 08:36, Pranith Kumar Karampuri
>>>>>>                         wrote:
>>>>>>>
>>>>>>>
>>>>>>>                         On Wed, Jul 13, 2016 at 9:35 AM, Dmitry
>>>>>>>                         Melekhov <dm at belkam.com> wrote:
>>>>>>>
>>>>>>>                             13.07.2016 01:52, Anuradha Talur wrote:
>>>>>>>
>>>>>>>
>>>>>>>                                 ----- Original Message -----
>>>>>>>
>>>>>>>                                     From: "Dmitry Melekhov"
>>>>>>>                                     <dm at belkam.com>
>>>>>>>                                     To: "Pranith Kumar Karampuri"
>>>>>>>                                     <pkarampu at redhat.com>
>>>>>>>                                     Cc: "gluster-users"
>>>>>>>                                     <gluster-users at gluster.org>
>>>>>>>                                     Sent: Tuesday, July 12, 2016
>>>>>>>                                     9:27:17 PM
>>>>>>>                                     Subject: Re: [Gluster-users]
>>>>>>>                                     3.7.13, index healing broken?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>                                     12.07.2016 17:39, Pranith
>>>>>>>                                     Kumar Karampuri wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>                                     Wow, what are the steps to
>>>>>>>                                     recreate the problem?
>>>>>>>
>>>>>>>                                     just set file length to
>>>>>>>                                     zero, always reproducible.
>>>>>>>
>>>>>>>                                 If you are setting the file
>>>>>>>                                 length to 0 on one of the bricks
>>>>>>>                                 (looks like
>>>>>>>                                 that is the case), it is not a bug.
>>>>>>>
>>>>>>>                                 Index heal relies on failures
>>>>>>>                                 seen from the mount point(s)
>>>>>>>                                 to identify the files that need
>>>>>>>                                 heal. It won't be able to
>>>>>>>                                 recognize any file
>>>>>>>                                 modification done directly on
>>>>>>>                                 bricks. Same goes for heal info
>>>>>>>                                 command which
>>>>>>>                                 is the reason heal info also
>>>>>>>                                 shows 0 entries.
>>>>>>>
>>>>>>>
>>>>>>>                             Well, this makes self-heal useless
>>>>>>>                             then: if any file is accidentally
>>>>>>>                             corrupted or deleted (yes, if a file
>>>>>>>                             is deleted directly from a brick, this
>>>>>>>                             is not recognized by index heal
>>>>>>>                             either), it will not be self-healed,
>>>>>>>                             because self-heal uses index heal.
>>>>>>>
>>>>>>>
>>>>>>>                         It is better to look into bit-rot
>>>>>>>                         feature if you want to guard against
>>>>>>>                         these kinds of problems.
>>>>>>
>>>>>>                         Bit rot detects bit problems, not missing
>>>>>>                         files or wrong file lengths, i.e. it is
>>>>>>                         too much overhead for such a simple task.
>>>>>>
>>>>>>
>>>>>>                     It detects wrong length. Because checksum
>>>>>>                     won't match anymore.
>>>>>
>>>>>                     Yes, sure. I guess that it will detect missing
>>>>>                     files too. But doesn't it need far more resources
>>>>>                     than just comparing directories in bricks?
>>>>>>
>>>>>>                     Which use case are you trying out that leads
>>>>>>                     to changing things directly on the brick?
>>>>>                     I'm trying to test gluster failure tolerance
>>>>>                     and right now I'm not happy with it...
>>>>>
>>>>>
>>>>>                 Which cases of fault tolerance are you not happy
>>>>>                 with? Making changes directly on the brick or
>>>>>                 anything else as well?
>>>>>
>>>>                 I'll repeat:
>>>>                 As I already said, if for some reason (in a real
>>>>                 case, only by accident) I delete a file, this
>>>>                 will not be detected by the self-heal daemon and
>>>>                 will thus lead to a lower replication level, i.e.
>>>>                 lower failure tolerance.
>>>>
>>>>
>>>>             To prevent such accidents you need to set selinux
>>>>             policies so that files under the brick are not modified
>>>>             by accident by any user. At least that is the solution
>>>>             I remember when this was discussed 3-4 years back.
>>>>
>>>             So the only supported platform is Linux? Or maybe it is
>>>             better to improve self-healing to detect missing or
>>>             wrong-length files; I guess this is a very cheap
>>>             operation in terms of host resources.
>>>             Just a suggestion; maybe we will need to look at
>>>             alternatives in the near future....
>>>
>>>         This is a corner case; from a design perspective it is
>>>         generally not a good idea to optimize for the corner case.
>>>         It is better to protect ourselves from the corner case
>>>         (SELinux etc.), or you can also use snapshots to protect
>>>         against these kinds of mishaps.
>>>
>>         Sorry, I don't agree.
>>         As you know, if a missing or wrong-length file is accessed
>>         from a fuse client, it is restored (healed), i.e. gluster
>>         recognizes that the file is wrong and heals it, so I do not see
>>         any reason not to provide the same function in self-healing.
>>         Thank you!
>>
>>     Ah! Now how do you suggest we keep track of which of 10s of
>>     millions of files the user accidentally deleted from the brick
>>     without gluster's knowledge? Once it comes to gluster's knowledge
>>     we can do something. But how does gluster become aware of
>>     something it is not keeping track of? At the time you access it
>>     gluster knows something went wrong so it restores it. If you
>>     change something on the bricks even by accident all the data
>>     gluster keeps (similar to a journal) is wasted. Even disk
>>     filesystems will ask you to run fsck if something unexpected
>>     happens, so a full self-heal is a similar operation.
>
>     You are absolutely right. The question is why gluster does not
>     become aware of such a problem in the case of self-healing.
>
>
> Because the operations that are performed directly on brick do not go 
> through gluster stack.

OK, I'll repeat:
As you know, if a missing or wrong-length file is accessed from a fuse
client, it is restored (healed), i.e. gluster recognizes that the file is
wrong and heals it, so I do not see any reason not to provide the same
function in self-healing.
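
To illustrate the kind of check meant here (comparing the directories of
two replica bricks for missing files and wrong lengths), a minimal Python
sketch follows. It is not how gluster's self-heal daemon works; it only
shows the idea. It assumes both brick directories are readable from one
host, that the volume is a plain replica, and that no writes are in
flight; the function names scan_brick and compare_bricks are made up for
this example. It cannot tell which copy is the good one.

```python
import os


def scan_brick(root):
    """Map each regular file's path (relative to root) to its size in bytes."""
    sizes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip gluster's internal .glusterfs directory at the top of the brick.
        if dirpath == root and ".glusterfs" in dirnames:
            dirnames.remove(".glusterfs")
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Only plain files are compared; symlinks are ignored.
            if os.path.isfile(full) and not os.path.islink(full):
                sizes[os.path.relpath(full, root)] = os.path.getsize(full)
    return sizes


def compare_bricks(root_a, root_b):
    """Return (missing_in_a, missing_in_b, size_mismatch) as sorted lists."""
    a = scan_brick(root_a)
    b = scan_brick(root_b)
    missing_in_a = sorted(set(b) - set(a))
    missing_in_b = sorted(set(a) - set(b))
    size_mismatch = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return missing_in_a, missing_in_b, size_mismatch
```

The paths it reports could then be accessed through a fuse mount, which,
as said above, is when gluster notices the problem and heals the file.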

>
>>
>>
>>     -- 
>>     Pranith
>
>
>
>
> -- 
> Pranith
