replicate/distribute oddities in 2.0.0 Was Re: [Gluster-devel] rc8

Gordan Bobic gordan at bobich.net
Tue May 12 09:04:34 UTC 2009


On Tue, 12 May 2009 00:53:06 -0700, Liam Slusser <lslusser at gmail.com>
wrote:

>>> Even with manually fixing (adding or removing) the extended attributes, I
>>> was never able to get Gluster to see the missing files.  So I ended up
>>> writing a quick program that searched the raw bricks' filesystem and then
>>> checked to make sure the file existed in the Gluster cluster, and if it
>>> didn't, it would tag the file.  Once that job was done I shut down Gluster,
>>> moved all the missing files off the raw bricks into temp storage, and then
>>> restarted Gluster and copied all the files back into each directory.
>>> That fixed the missing file problems.
>>>
>>> I'd still like to find out why Gluster would ignore certain files without
>>> the correct attributes.  Even removing all the file attributes wouldn't fix
>>> the problem.  I also tried manually copying a file into a brick, which it
>>> still wouldn't find.  It would be nice to be able to manually copy files
>>> into a brick, then set an extended attribute flag which would cause Gluster
>>> to see the new file(s) and copy them to all bricks after an ls -alR was
>>> done.  Or even better, just do it automatically when new files without
>>> attributes are found in a brick.
>>>
>>
>> It sounds like you are experiencing this known yet dangerous bug:
>>
>> http://gluster.org/docs/index.php/Understanding_AFR_Translator#Known_Issues
>>
>> Quote:
>> Self-heal of a file that does not exist on the first subvolume:
>> If a file does not exist on the first subvolume but exists on some other
>> subvolume, it will not show up in the output of 'ls'. This is because the
>> replicate translator fetches the directory listing only from the first
>> subvolume. Thus, the file that does not exist on the first subvolume is
>> never seen and never healed. However, if you know the name of the file and
>> do a 'stat' on the file or try to access it in any other way, the file
>> will be properly healed and created on the first subvolume.
>
> Interesting.  Thanks for replying.  Yeah, this does sound like the bug.
> However, I was not able to stat or access the file whatsoever.  It always
> replied with "file not found" and nothing in the logs.  Could this be
> caused because the whole directory is missing on the first volume?

Sounds plausible. There are also other, more subtle issues still present in
AFR/Replicate (e.g. BerkeleyDB doesn't work at all, and SQLite sort of
works more often than not, but it's very twitchy). I wouldn't deploy it
into a production environment as it is at the moment.
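
For what it's worth, the brick-versus-mount check you describe could be
sketched roughly as below. This is only an illustration, not your actual
program: the function name find_missing and the example paths are made up,
and it assumes the brick's backend directory and the GlusterFS mount point
are both reachable from the same machine. Looking each file up by name
through the mount is also the kind of access that, according to the wiki
text quoted above, should trigger self-heal when it works.

```python
import os


def find_missing(brick_root, mount_root):
    """Return brick-relative paths of files present on the raw brick
    but not visible through the mounted volume.

    Both arguments are plain directory paths; nothing here talks to
    GlusterFS directly, it only uses ordinary filesystem calls.
    """
    missing = []
    for dirpath, _dirnames, filenames in os.walk(brick_root):
        rel_dir = os.path.relpath(dirpath, brick_root)
        for name in filenames:
            rel = os.path.normpath(os.path.join(rel_dir, name))
            try:
                # Look the file up by name through the mount.
                os.lstat(os.path.join(mount_root, rel))
            except (FileNotFoundError, NotADirectoryError):
                # Covers a missing file as well as a missing (or
                # non-directory) parent on the mount side.
                missing.append(rel)
    return sorted(missing)


if __name__ == "__main__":
    # Hypothetical paths; substitute the real brick and mount point.
    for path in find_missing("/data/brick1", "/mnt/gluster"):
        print(path)
```

If the whole parent directory is absent from the first subvolume, every file
beneath it would presumably show up in that list, which would at least make
the pattern easy to spot.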

Gordan




