replicate/distribute oddities in 2.0.0 Was Re: [Gluster-devel] rc8

Liam Slusser lslusser at gmail.com
Tue May 12 07:53:06 UTC 2009


On Tue, May 12, 2009 at 12:07 AM, Gordan Bobic <gordan at bobich.net> wrote:

> Liam Slusser wrote:
>
>>
>> Even after manually fixing (adding or removing) the extended attributes i
>> was never able to get Gluster to see the missing files.  So i ended up
>> writing a quick program that searched each raw brick's filesystem and then
>> checked that each file existed in the Gluster cluster; if it didn't, the
>> program tagged the file.  Once that job was done i shut down Gluster,
>> moved all the missing files off the raw bricks into temp storage, and then i
>> restarted Gluster and copied all the files back into each directory.  That
>> fixed the missing file problems.
>>
>> I'd still like to find out why Gluster would ignore certain files without
>> the correct attributes.  Even removing all the file attributes wouldn't fix
>> the problem.  I also tried manually copying a file into a brick, which it
>> still wouldn't find.  It would be nice to be able to manually copy files into
>> a brick, then set an extended attribute flag which would cause Gluster to
>> see the new file(s) and copy them to all bricks after an ls -alR was done.
>> Or even better, just do it automatically when new files without attributes
>> are found in a brick.
>>
>
> It sounds like you are experiencing this known yet dangerous bug:
> http://gluster.org/docs/index.php/Understanding_AFR_Translator#Known_Issues
>
> Quote:
> Self-heal of a file that does not exist on the first subvolume:
> If a file does not exist on the first subvolume but exists on some other
> subvolume, it will not show up in the output of 'ls'. This is because the
> replicate translator fetches the directory listing only from the first
> subvolume. Thus, the file that does not exist on the first subvolume is
> never seen and never healed. However, if you know the name of the file and
> do a 'stat' on the file or try to access it in any other way, the file will
> be properly healed and created on the first subvolume.
>
> So, either the directory listing should be fetched from the read-subvolume,
> or better, fetched from all nodes (but that gets slow). At least if it was
> fetched from the read-subvolume, you could run a cron job on each server
> that runs ls -laR, which would force the files into sync (since each server
> probably has itself as the read-subvolume, so the missing files will be
> found). But that's not how it seems to work at the moment.
>
>
> Gordan
>

Interesting.  Thanks for replying.  Yeah, this does sound like the bug.
However, i was not able to stat or access the file whatsoever.  It always
replied with "file not found" and nothing in the logs.  Could this be
because the whole directory is missing on the first subvolume?
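
For reference, here is a toy model of the behaviour described in the
quoted Known Issue.  It is not Gluster code: each "subvolume" is just a
Python set of file names for one directory, and the function names are
made up for illustration.  It only shows why a file missing from the first
subvolume is invisible to a listing but, per the quoted text, is supposed
to be healed by a lookup by name.

```python
# Toy model only, not Gluster code.  Each "subvolume" is a set of the
# file names present in a single directory on that subvolume.

def list_first_only(subvolumes):
    """Directory listing taken from the first subvolume only."""
    return sorted(subvolumes[0])

def list_union(subvolumes):
    """Alternative: merge the listings of every subvolume (complete, slower)."""
    names = set()
    for sub in subvolumes:
        names |= sub
    return sorted(names)

def lookup(subvolumes, name):
    """Access by name: succeeds if any subvolume has the file, and then
    'heals' it by creating it on every subvolume, as the quoted text says
    should happen."""
    if not any(name in sub for sub in subvolumes):
        return False
    for sub in subvolumes:
        sub.add(name)
    return True
```

In this model, a file present only on the second subvolume is absent from
list_first_only() until lookup() has been called on its name.  What i saw
differs from the model at the lookup step: the access by name failed too.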

I do have a test cluster i set up to test new versions/configurations, so i
think i can reproduce this scenario - if it would be of any use to
anybody...
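
If anyone wants to check their own setup, the scan i described earlier can
be sketched roughly like this.  This is not the program i actually ran,
just a minimal sketch: find_missing, brick_root and mount_root are
hypothetical names, and it assumes the brick's backend directory and the
Gluster client mount are both readable from the same machine.  It walks
the raw brick and does an lstat on the same relative path through the
mount, collecting every path that comes back "file not found".

```python
import os

def find_missing(brick_root, mount_root):
    """Return relative paths that exist under brick_root but cannot be
    lstat'ed under mount_root (both arguments are hypothetical paths)."""
    missing = []
    for dirpath, dirnames, filenames in os.walk(brick_root):
        rel_dir = os.path.relpath(dirpath, brick_root)
        # Check directories as well as files, since a whole directory
        # may be absent from the mount.
        for name in dirnames + filenames:
            rel = os.path.normpath(os.path.join(rel_dir, name))
            try:
                # Accessing the path by name through the mount is what
                # should trigger self-heal, according to the quoted doc.
                os.lstat(os.path.join(mount_root, rel))
            except FileNotFoundError:
                missing.append(rel)
    return sorted(missing)
```

Running it once per brick gives a list of candidates to move out and copy
back in through the mount.  Other errors (permissions, I/O) are left to
propagate rather than being counted as missing.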

thanks,
liam