[Gluster-devel] Selfheal is not working? Once more

Wed Jul 30 13:59:07 UTC 2008

Hello,

As I am new to the group, let me introduce myself. My name is Lukasz
Osipiuk, and I am a software developer in a large Polish IT company.
We are considering using GlusterFS for data storage, and we have to
minimize the probability of losing any data.

This email belongs in the thread "Selfheal is not working" from
18 Jul, but as I was not subscribed to the group at the time, I am
quoting the essential part below.

My questions follow the quoted text.

-----

Martin,
If you are modifying backend directly, you shouldn't do it.
Krishna

On Thu, Jul 17, 2008 at 9:15 PM, Martin Fick <address at hidden> wrote:
> --- On Thu, 7/17/08, Tomáš Siegl <address at hidden> wrote:
>
>> Step1: Client1:  cp test_file.txt /mnt/gluster/
>> Step2: Brick1 and Brick4: has test_file.txt in
>> /mnt/gluster/ directory
>> Step3: Client1: ls /mnt/gluster - test_file.txt is present
>>
>> Step4: Brick1: rm /mnt/gluster/test_file.txt
>> Step5. Client1: cat /mnt/gluster/test_file.txt -> we
>> will get contents
>> of file from brick4
>>
>> Step6. Brick1 ls /home/export is empty. Selfheal not
>> recovered file.
>
> I suspect that this is normal, you are not suppose to modify
> the bricks manually from underneath AFR.  AFR uses extended
> attributes to keep file version metadata.  When you manually
> deleted the file in step4 the directory version metadata should
> not have been updated so I suspect that caused the mismatch
> to go undetected.  The self heal would have occurred if the
> brick node were down and the file was deleted by client and
> then the brick node returned to operation.
>
> -Martin

------

Martin, it is obvious that one normally should not modify the AFR
backend directly. The experiment Tomáš made (and I made it too) was a
simulation of a real-life problem in which you lose some data on one
of the data bricks.

A more extreme example: one of the data bricks explodes and you
replace it with a new one, configured the same as the one that was
lost but with an empty HD. This is the same as the experiment above,
except that all data is gone, not just one file.
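
To make the state after such a replacement concrete, here is a minimal
Python sketch that lists files present in the healthy brick's export
directory but absent from the replaced brick's. It only reads the
directory trees and never writes to the backend. It assumes both export
directories (for example copies or remote mounts of /home/export) are
reachable from one machine, which is an assumption of this sketch and
not something GlusterFS provides; the function names are hypothetical.

```python
import os


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def missing_on_replica(healthy_export, replaced_export):
    """List files that exist on the healthy brick but not on the replaced one."""
    healthy = relative_files(healthy_export)
    replaced = relative_files(replaced_export)
    return sorted(healthy - replaced)
```

Every path this returns is a file that currently has only one copy.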

Is there a way to make GlusterFS "heal" so that the new node contains
the replicated data from its mirror?
I tried the find-head pattern but it doesn't help :(
Until the new node contains the data from the mirror, we are in a
situation where lots of files have just one copy, and that is exactly
what we want to avoid by using AFR.
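
For readers who have not seen it, the find-head pattern is commonly
described as walking the client mount (not the brick backend) and
reading the first byte of every file, so that each file is opened
through AFR. A minimal Python sketch of the same walk follows. The
function name is hypothetical, and whether opening a file actually
triggers self-heal depends on the GlusterFS version and configuration;
it is not guaranteed by this code.

```python
import os


def touch_every_file(mount_point):
    """Open each file under mount_point and read one byte from it.

    Returns a tuple (files_read, errors), where errors is a list of
    (path, exception) pairs for files that could not be opened or read.
    """
    files_read = 0
    errors = []
    for dirpath, _dirnames, filenames in os.walk(mount_point):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    handle.read(1)  # one byte is enough to force an open
                files_read += 1
            except OSError as exc:
                errors.append((path, exc))
    return files_read, errors
```

On the setup quoted above this would be called as
touch_every_file("/mnt/gluster") on a client.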

Sorry for my poor English.

Regards, Łukasz Osipiuk

--
Łukasz Osipiuk
mailto: lukasz at osipiuk.net