replicate/distribute oddities in 2.0.0 Was Re: [Gluster-devel] rc8
Gordan Bobic
gordan at bobich.net
Tue May 12 09:04:34 UTC 2009
On Tue, 12 May 2009 00:53:06 -0700, Liam Slusser <lslusser at gmail.com>
wrote:
>>> Even with manually fixing (adding or removing) the extended attributes I
>>> was never able to get Gluster to see the missing files. So I ended up
>>> writing a quick program that searched the raw bricks' filesystem and then
>>> checked to make sure the file existed in the Gluster cluster, and if it
>>> didn't it would tag the file. Once that job was done I shut down Gluster,
>>> moved all the missing files off the raw bricks into temp storage, and
>>> then I restarted Gluster and copied all the files back into each directory.
>>> That fixed the missing file problems.
>>>
>>> I'd still like to find out why Gluster would ignore certain files without
>>> the correct attributes. Even removing all the file attributes wouldn't fix
>>> the problem. I also tried manually copying a file into a brick, which it
>>> still wouldn't find. It would be nice to be able to manually copy files into
>>> a brick, then set an extended attribute flag which would cause Gluster to
>>> see the new file(s) and copy them to all bricks after an ls -alR was done.
>>> Or even better, just do it automatically when new files without
>>> attributes are found in a brick.
>>>
>>
>> It sounds like you are experiencing this known yet dangerous bug:
>>
>> http://gluster.org/docs/index.php/Understanding_AFR_Translator#Known_Issues
>>
>> Quote:
>> Self-heal of a file that does not exist on the first subvolume:
>> If a file does not exist on the first subvolume but exists on some other
>> subvolume, it will not show up in the output of 'ls'. This is because the
>> replicate translator fetches the directory listing only from the first
>> subvolume. Thus, the file that does not exist on the first subvolume is
>> never seen and never healed. However, if you know the name of the file and
>> do a 'stat' on the file or try to access it in any other way, the file
>> will be properly healed and created on the first subvolume.
>
> Interesting. Thanks for replying. Yeah, this does sound like the bug.
> However, I was not able to stat or access the file whatsoever. It always
> replied with "file not found" and there was nothing in the logs. Could this
> be caused by the whole directory missing on the first volume?
Sounds plausible. There are also other, more subtle issues still present in
AFR/Replicate (e.g. BerkeleyDB doesn't work at all, and SQLite sort of
works more often than not, but it's very twitchy). I wouldn't deploy it
into a production environment as it is at the moment.
Gordan
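
For anyone wanting to reproduce the check described at the top of the
thread, here is a minimal sketch of that kind of program. It is not the
original tool (which was never posted), and the function name and both
path arguments are hypothetical. It walks one raw brick, looks each file
up by name through the client mount, and lists whatever the mount cannot
see. Per the wiki text quoted above, a lookup by name is also what is
supposed to trigger self-heal, though as reported in this thread that
did not work in every case, so the sketch only reports and does not move
or modify anything.

```python
import os


def find_unseen_files(brick_root, mount_root):
    """Return relative paths that exist on the raw brick but cannot be
    looked up through the mounted volume.

    Both arguments are assumptions for illustration: brick_root is the
    backend export directory, mount_root is the client mount point.
    """
    missing = []
    for dirpath, _dirnames, filenames in os.walk(brick_root):
        rel_dir = os.path.relpath(dirpath, brick_root)
        for name in filenames:
            rel_path = os.path.normpath(os.path.join(rel_dir, name))
            try:
                # Look the file up by name through the mount, the same
                # thing a manual 'stat' on the file would do.
                os.lstat(os.path.join(mount_root, rel_path))
            except FileNotFoundError:
                missing.append(rel_path)
    return sorted(missing)
```

It could be called as find_unseen_files("/data/export", "/mnt/glusterfs")
(example paths only), once per brick, with the resulting list used to
decide which files to move out and copy back in through the mount.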