[Gluster-devel] Selfheal is not working? Once more

Thu Jul 31 11:57:48 UTC 2008

Hi Lukasz,

I had the same problem. I'm using 2 servers, A and B, each acting as both
server and client, with simple AFR, nothing complicated.

A <--> B

I was running 2 separate processes, one for the client and one for the
server, on the same machine. I saw the 1970 creation/modification date
problem for files and directories on server A; on server B everything was
fine.

After changing my configuration file and switching to a single process, the
error went away.

Anyway, try changing your config file; it might work. It worked for me.

Currently I'm facing a self-heal problem with a directory. This is a
different setup: I have 3 servers in one AFR group. One server had a problem
and had to be shut down for the night. Normally, when a server comes back, it
self-heals (after running the find command). But this time it didn't create
the directory, so the subsequent folders/files didn't get healed. I'm getting
the following error.

2008-07-31 14:10:05 W [posix.c:524:posix_mkdir] posix1: mkdir of
/mailbox/0/1/XXXXX: No such file or directory
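
As far as I can tell, ENOENT from a plain mkdir means that a parent
component of the path does not exist, which would fit a self-heal that tries
to create a directory before its parent exists on that brick. That reading is
an assumption about the cause, not something the log confirms. A minimal
Python sketch of the plain mkdir behaviour (try_mkdir and the directory names
are hypothetical, and nothing here is GlusterFS code):

```python
import errno
import os
import tempfile


def try_mkdir(path):
    """Attempt a plain mkdir; return the errno name on failure, None on success."""
    try:
        os.mkdir(path)
        return None
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical layout: the parent directories of "user" are absent.
        target = os.path.join(root, "mailbox", "0", "1", "user")
        print(try_mkdir(target))  # parent missing
        os.makedirs(os.path.dirname(target))
        print(try_mkdir(target))  # parent present
```

If that is what happens here, the parent directory on the healed server
would have to exist before the child can be created.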

Any help is really appreciated.

Rohan

-----Original Message-----
From: gluster-devel-bounces+rohan.thale=moneycontrol.com at nongnu.org
[mailto:gluster-devel-bounces+rohan.thale=moneycontrol.com at nongnu.org] On
Behalf Of Lukasz Osipiuk
Sent: Thursday, July 31, 2008 12:04 PM
To: Raghavendra G
Cc: gluster-devel at nongnu.org
Subject: Re: [Gluster-devel] Selfheal is not working? Once more

2008/7/31 Raghavendra G <raghavendra.hg at gmail.com>:
> Hi,
>
> Can you do a _find . | xargs touch_ and check whether brick A is
> self-healed?

Strange thing. After a night all files appeared on brick A,
but empty, with creation date Jan 1 1970, and without any extended
attributes.
Maybe the slocate daemon touched them?

After another round of shutdown/delete/startup/find . | xargs touch,
it worked. Thanks a lot :)
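
For reference, the find . | xargs touch step can be sketched in Python as
below. touch_tree is a hypothetical helper, the root is whatever path the
client mount uses, and whether touching alone is enough to trigger self-heal
is taken from this thread rather than verified. Unlike a plain
find | xargs pipeline, the walk does not split file names on whitespace.

```python
import os
import tempfile


def touch_tree(root):
    """Set atime/mtime to now on root, every directory and every file below it.

    Returns the number of paths touched; paths that fail are reported and skipped.
    """
    touched = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        paths = [dirpath] + [os.path.join(dirpath, name) for name in filenames]
        for path in paths:
            try:
                os.utime(path, None)  # None means "use the current time"
                touched += 1
            except OSError as exc:
                print(f"skipped {path}: {exc}")
    return touched


if __name__ == "__main__":
    # Demo on a scratch directory; on a real client, pass the mount point instead.
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "some dir"))
        open(os.path.join(root, "some dir", "a file.txt"), "w").close()
        print(touch_tree(root), "paths touched")
```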

I realized that previously I was doing the "access on client" phase too
soon, before the client had established a TCP
connection with the new brick A daemon. Now the brick self-healed :)

There is still a minor (?) issue with directories. They regained their
extended attributes, but the creation date displayed by ls is Jan 1 1970
after self-heal (both on brick A and on the client).
Is this a known bug/feature?
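
A quick way to list affected entries is to walk the tree and report anything
whose modification time (the time ls -l shows by default) sits at the Unix
epoch. A rough sketch, assuming Python is available on the client;
find_epoch_entries and the one-day threshold are my own choices for
illustration:

```python
import os
import tempfile


def find_epoch_entries(root, threshold=86400):
    """Return sorted paths under root whose mtime is within `threshold` seconds of the epoch."""
    suspects = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            if abs(mtime) < threshold:
                suspects.append(path)
    return sorted(suspects)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        healed = os.path.join(root, "healed_dir")
        os.mkdir(healed)
        os.utime(healed, (0, 0))  # simulate a directory left with an epoch timestamp
        for path in find_epoch_entries(root):
            print(path)
```

The one-day window is there because a timestamp of 0 can display as
Dec 31 1969 or Jan 1 1970 depending on the local time zone.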

Regards, Lukasz

> regards,
>
> On Thu, Jul 31, 2008 at 4:07 AM, Lukasz Osipiuk <lukasz at osipiuk.net> wrote:
>>
>> Thanks for answers :)
>>
>> On Wed, Jul 30, 2008 at 8:52 PM, Martin Fick <mogulguy at yahoo.com> wrote:
>> > --- On Wed, 7/30/08, Lukasz Osipiuk <lukasz at osipiuk.net> wrote:
>> >
>>
>> [cut]
>>
>> >> The more extreme example is: one of the data bricks explodes and
>> >> you replace it with a new one, configured like the one that failed
>> >> but with an empty HD. This is the same as the above
>> >> experiment, but all data is gone, not just one file.
>> >
>> > AFR should actually handle this case fine.  When you install
>> > a new brick and it is empty, there will be no metadata for
>> > any files or directories on it, so it will self-heal (lazily).
>> > The problem that you described above occurs because you have
>> > metadata saying that your files (the directory, actually) are
>> > up to date, but the directory is not, since it was modified
>> > manually under the hood.  AFR cannot detect this (yet); it
>> > trusts its metadata.
>>
>> Well, either I am doing something terribly wrong or it does not handle
>> this case fine.
>> I have the following configuration.
>> 6 bricks: A, B, C, D, E, F
>> On the client I use
>> IO-CACHE(
>>  IO-THREADS(
>>    WRITE-BEHIND(
>>      READ-AHEAD(
>>        UNIFY(
>>          DATA(AFR(A,B), AFR(C,D)), NS(AFR(E,F))
>>        )
>>      )
>>    )
>>  )
>> )
>>
>> I do:
>> 1. mount glusterfs on the client
>> 2. on the client, create a few files/directories on the mounted glusterfs
>> 3. shut down brick A
>> 4. delete and recreate brick A's local directory
>> 5. start up brick A
>> 6. on the client, access all files in the mounted glusterfs directory.
>>
>> After this procedure no files/directories appear in brick A's local
>> directory. Should they, or am I missing something?
>>
>>
>> I think the file checksumming you described is overkill for my needs.
>> I think I will know if one of my HD drives breaks down and I will
>> replace it, but I need to work around the problem with data recreation
>> described above.
>>
>>
>> --
>> Lukasz Osipiuk
>> mailto: lukasz at osipiuk.net
>>
>>
>
>
>
> --
> Raghavendra G
>
> A centipede was happy quite, until a toad in fun,
> Said, "Pray, which leg comes after which?",
> This raised his doubts to such a pitch,
> He fell flat into the ditch,
> Not knowing how to run.
> -Anonymous
>

-- 
Lukasz Osipiuk
mailto: lukasz at osipiuk.net