replicate/distribute oddities in 2.0.0 Was Re: [Gluster-devel] rc8

Alexey Filin alexey.filin at gmail.com
Tue May 12 19:54:20 UTC 2009


Hi,

I'm not sure, but where is the namespace volume in the spec files?

http://gluster.org/docs/index.php/User_Guide#Namespace

"Namespace volume needed because:

- persistent inode numbers.
- file exists even when node is down."

cheers,

Alexey.

On Tue, May 12, 2009 at 3:29 AM, Liam Slusser <lslusser at gmail.com> wrote:

>
> Even after manually fixing (adding or removing) the extended attributes, I
> was never able to get Gluster to see the missing files.  So I ended up
> writing a quick program that searched the raw bricks' filesystem and then
> checked to make sure each file existed in the Gluster cluster; if it
> didn't, it would tag the file.  Once that job was done I shut down Gluster,
> moved all the missing files off the raw bricks into temp storage, and then
> restarted Gluster and copied all the files back into each directory.  That
> fixed the missing-file problems.
>
> I'd still like to find out why Gluster would ignore certain files without
> the correct attributes.  Even removing all of a file's attributes wouldn't fix
> the problem.  I also tried manually copying a file into a brick, which it
> still wouldn't find.  It would be nice to be able to manually copy files into
> a brick, then set an extended attribute flag which would cause Gluster to
> see the new file(s) and copy them to all bricks after an ls -alR was done.
>  Or, even better, just do it automatically when new files without attributes
> are found in a brick.
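>
> For reference, a minimal sketch of that kind of missing-file check
> (hypothetical brick and mount paths, assuming standard find/coreutils and a
> host that can see both the raw brick and the client mount):
>
> #!/bin/sh
> # Walk a raw brick and report every regular file that is not visible
> # through the GlusterFS mount.  Paths below are examples only.
> BRICK=/intstore/intstore01b/gcdata
> MOUNT=/intstore
> find "$BRICK" -type f | while read -r f; do
>     rel=${f#$BRICK/}
>     [ -e "$MOUNT/$rel" ] || echo "missing from volume: $rel"
> done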
>
> thanks,
> liam
>
>
> On Wed, May 6, 2009 at 4:13 PM, Liam Slusser <lslusser at gmail.com> wrote:
>
>>
>> To answer some of my own questions: it looks like those files were copied
>> using Gluster 1.3.12, which is why they have different extended
>> attributes.
>> Gluster 1.3.12 has:
>>
>> Attribute "glusterfs.createtime" has a 10 byte value for file
>> Attribute "glusterfs.version" has a 1 byte value for file
>> Attribute "glusterfs.dht" has a 16 byte value for file
>>
>> while Gluster 2.0.0 has:
>>
>> Attribute "afr.brick2b" has a 12 byte value for file
>> Attribute "afr.brick1b" has a 12 byte value for file
>>
>> I've been unsuccessful at fixing the attributes; can anybody point me in
>> the right direction?
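>>
>> In case it helps anyone reproduce this, the raw attribute values can be
>> dumped with the getfattr tool from the attr package (run as root so any
>> trusted.* names are also shown); a sketch, using the filename from the
>> listing further down:
>>
>> # Show every extended attribute on one brick copy, values in hex
>> getfattr -d -m . -e hex 049891002526_01_09.wma.sigKey01.k
>>
>> # A stale attribute could in principle be removed by its exact name with
>> # setfattr -x <name> <file>, but that is only an illustration, not a
>> # verified fix - test on a spare copy of the file first.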
>>
>> thanks,
>> liam
>>
>> On Wed, May 6, 2009 at 12:48 PM, Liam Slusser <lslusser at gmail.com> wrote:
>>
>>>  Big thanks to the devel group for fixing all the memory leak issues in
>>> the earlier RC releases.  2.0.0 has been great so far, without any memory
>>> issues whatsoever.
>>>
>>> I am seeing some oddities with the replicate/distribute translators,
>>> however.  I have three partitions on each Gluster server, exporting three
>>> bricks, and we have two servers.  The Gluster clients replicate each brick
>>> between the two servers, and then I have a distribute translator over all
>>> the replicated bricks - basically a Gluster RAID 10.
>>>
>>> There are a handful of files which were copied into the Gluster volume
>>> but have since disappeared; the physical files, however, still exist on
>>> both bricks.
>>>
>>> (from a client)
>>>
>>> [root at client1 049891002526]# pwd
>>> /intstore/data/tracks/tmg/2008_02_05/049891002526
>>> [root at client1 049891002526]# ls -al 049891002526_01_09.wma.sigKey01.k
>>> ls: 049891002526_01_09.wma.sigKey01.k: No such file or directory
>>> [root at client1 049891002526]# head 049891002526_01_09.wma.sigKey01.k
>>> head: cannot open `049891002526_01_09.wma.sigKey01.k' for reading: No
>>> such file or directory
>>> [root at client1 049891002526]#
>>>
>>>
>>> (from a server brick)
>>>
>>> [root at server1 049891002526]# pwd
>>> /intstore/intstore01c/gcdata/data/tracks/tmg/2008_02_05/049891002526
>>> [root at server1 049891002526]# ls -al 049891002526_01_09.wma.sigKey01.k
>>> -rw-rw-rw- 1 10015 root 19377712 Feb  6  2008
>>> 049891002526_01_09.wma.sigKey01.k
>>> [root at server1 049891002526]# attr -l 049891002526_01_09.wma.sigKey01.k
>>> Attribute "glusterfs.createtime" has a 10 byte value for
>>> 049891002526_01_09.wma.sigKey01.k
>>> Attribute "glusterfs.version" has a 1 byte value for
>>> 049891002526_01_09.wma.sigKey01.k
>>> Attribute "selinux" has a 24 byte value for
>>> 049891002526_01_09.wma.sigKey01.k
>>> [root at server1 049891002526]# attr -l .
>>> Attribute "glusterfs.createtime" has a 10 byte value for .
>>> Attribute "glusterfs.version" has a 1 byte value for .
>>> Attribute "glusterfs.dht" has a 16 byte value for .
>>> Attribute "selinux" has a 24 byte value for .
>>>
>>>
>>> There is nothing in either the client or the server logs.  I've tried all
>>> the normal replication checks and self-heal triggers, such as ls -alR.  If I
>>> copy the file back from one of the bricks into the volume, it shows up again,
>>> but it only has a 1/3 chance of being written to the file's original
>>> location, so I end up with two identical files on two different bricks.
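>>>
>>> (For completeness, the recursive walk mentioned above can also be done
>>> with find from a client mount; a sketch, with the mount point as an
>>> example:
>>>
>>> find /intstore/data -print0 | xargs -0 stat > /dev/null
>>>
>>> which stats every entry the volume currently exposes.)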
>>>
>>> This volume has over 40 million files and directories, so it can be very
>>> tedious to find anomalies.  I wrote a quick Perl script to check 1/25 of
>>> our total files in the volume for missing files and md5 checksum differences;
>>> as of now it is about 15% complete (138,500 files) and has found ~7000
>>> missing files and 0 md5 checksum differences.
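>>>
>>> The checksum half of that script is roughly equivalent to the following
>>> sketch (hypothetical paths, md5sum from coreutils, and it only compares
>>> files that are visible on both sides):
>>>
>>> # Compare each brick copy against the copy seen through the mount.
>>> BRICK=/intstore/intstore01c/gcdata
>>> MOUNT=/intstore
>>> find "$BRICK/data/tracks" -type f | while read -r f; do
>>>     rel=${f#$BRICK/}
>>>     [ -e "$MOUNT/$rel" ] || continue
>>>     [ "$(md5sum < "$f")" = "$(md5sum < "$MOUNT/$rel")" ] \
>>>         || echo "checksum mismatch: $rel"
>>> done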
>>>
>>> How can I debug this?  I'd imagine it has something to do with the
>>> extended attributes on either the file or the parent directory... but as far
>>> as I can tell, that all looks fine.
>>>
>>> thanks,
>>> liam
>>>
>>> client glusterfs.vol:
>>>
>>> volume brick1a
>>>   type protocol/client
>>>   option transport-type tcp
>>>   option remote-host server1
>>>   option remote-subvolume brick1a
>>> end-volume
>>>
>>> volume brick1b
>>>   type protocol/client
>>>   option transport-type tcp
>>>   option remote-host server1
>>>   option remote-subvolume brick1b
>>> end-volume
>>>
>>> volume brick1c
>>>   type protocol/client
>>>   option transport-type tcp
>>>   option remote-host server1
>>>   option remote-subvolume brick1c
>>> end-volume
>>>
>>> volume brick2a
>>>   type protocol/client
>>>   option transport-type tcp
>>>   option remote-host server2
>>>   option remote-subvolume brick2a
>>> end-volume
>>>
>>> volume brick2b
>>>   type protocol/client
>>>   option transport-type tcp
>>>   option remote-host server2
>>>   option remote-subvolume brick2b
>>> end-volume
>>>
>>> volume brick2c
>>>   type protocol/client
>>>   option transport-type tcp
>>>   option remote-host server2
>>>   option remote-subvolume brick2c
>>> end-volume
>>>
>>>  volume bricks1
>>>   type cluster/replicate
>>>   subvolumes brick1a brick2a
>>> end-volume
>>>
>>> volume bricks2
>>>   type cluster/replicate
>>>   subvolumes brick1b brick2b
>>> end-volume
>>>
>>> volume bricks3
>>>   type cluster/replicate
>>>   subvolumes brick1c brick2c
>>> end-volume
>>>
>>> volume distribute
>>>   type cluster/distribute
>>>   subvolumes bricks1 bricks2 bricks3
>>> end-volume
>>>
>>> volume writebehind
>>>   type performance/write-behind
>>>   option block-size 1MB
>>>   option cache-size 64MB
>>>   option flush-behind on
>>>   subvolumes distribute
>>> end-volume
>>>
>>> volume cache
>>>   type performance/io-cache
>>>   option cache-size 2048MB
>>>   subvolumes writebehind
>>> end-volume
>>>
>>> server glusterfsd.vol:
>>>
>>> volume intstore01a
>>>   type storage/posix
>>>   option directory /intstore/intstore01a/gcdata
>>> end-volume
>>>
>>> volume intstore01b
>>>   type storage/posix
>>>   option directory /intstore/intstore01b/gcdata
>>> end-volume
>>>
>>> volume intstore01c
>>>   type storage/posix
>>>   option directory /intstore/intstore01c/gcdata
>>> end-volume
>>>
>>> volume locksa
>>>   type features/posix-locks
>>>   option mandatory-locks on
>>>   subvolumes intstore01a
>>> end-volume
>>>
>>> volume locksb
>>>   type features/posix-locks
>>>   option mandatory-locks on
>>>   subvolumes intstore01b
>>> end-volume
>>>
>>> volume locksc
>>>   type features/posix-locks
>>>   option mandatory-locks on
>>>   subvolumes intstore01c
>>> end-volume
>>>
>>> volume brick1a
>>>   type performance/io-threads
>>>   option thread-count 32
>>>   subvolumes locksa
>>> end-volume
>>>
>>> volume brick1b
>>>   type performance/io-threads
>>>   option thread-count 32
>>>   subvolumes locksb
>>> end-volume
>>>
>>> volume brick1c
>>>   type performance/io-threads
>>>   option thread-count 32
>>>   subvolumes locksc
>>> end-volume
>>>
>>> volume server
>>>   type protocol/server
>>>   option transport-type tcp
>>>   option auth.addr.brick1a.allow 192.168.12.*
>>>   option auth.addr.brick1b.allow 192.168.12.*
>>>   option auth.addr.brick1c.allow 192.168.12.*
>>>   subvolumes brick1a brick1b brick1c
>>> end-volume
>>>
>>>
>>> On Wed, Apr 22, 2009 at 5:43 PM, Liam Slusser <lslusser at gmail.com>wrote:
>>>
>>>>
>>>> Avati,
>>>> Big thanks.  Looks like that did the trick.  I'll report back in the
>>>> morning if anything has changed, but it's looking MUCH better.  Thanks again!
>>>>
>>>> liam
>>>>
>>>> On Wed, Apr 22, 2009 at 2:32 PM, Anand Avati <avati at gluster.com> wrote:
>>>>
>>>>> Liam,
>>>>>  An fd leak and a lock structure leak have been fixed in the git
>>>>> repository, which explains the leak in the first subvolume's server.
>>>>> Please pull the latest patches and let us know if they do not fix
>>>>> your issues. Thanks!
>>>>>
>>>>> Avati
>>>>>
>>>>> On Tue, Apr 21, 2009 at 3:41 PM, Liam Slusser <lslusser at gmail.com>
>>>>> wrote:
>>>>> > There is still a memory leak with rc8 on my setup.  The first server
>>>>> > in a cluster of two servers starts out using 18M and just slowly
>>>>> > increases.  After 30 minutes it has doubled in size to over 30M and
>>>>> > just keeps growing - the more memory it uses, the worse the
>>>>> > performance.  Funnily enough, the second server in my cluster, using
>>>>> > the same configuration file, has no such memory problem.
>>>>> > My glusterfsd.vol has no performance translators, just 3
>>>>> > storage/posix -> 3 features/posix-locks -> protocol/server.
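>>>>> > For what it's worth, the growth is easy to track with something like
>>>>> > this (a sketch, assuming a procps ps):
>>>>> >
>>>>> > # Print the resident and virtual size of glusterfsd once a minute
>>>>> > while true; do ps -C glusterfsd -o rss=,vsz=; sleep 60; done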
>>>>> > thanks,
>>>>> > liam
>>>>> > On Mon, Apr 20, 2009 at 2:01 PM, Gordan Bobic <gordan at bobich.net>
>>>>> wrote:
>>>>> >>
>>>>> >> Gordan Bobic wrote:
>>>>> >>>
>>>>> >>> First-access failing bug still seems to be present.
>>>>> >>> But other than that, it seems to be distinctly better than rc4. :)
>>>>> >>> Good work! :)
>>>>> >>
>>>>> >> And that massive memory leak is gone, too! The process hasn't grown
>>>>> >> by a KB after a kernel compile! :D
>>>>> >>
>>>>> >> s/Good work/Awesome work/
>>>>> >>
>>>>> >> :)
>>>>> >>
>>>>> >>
>>>>> >> Gordan
>>>>> >>
>>>>> >>
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>>
>>>>
>>>>
>>>
>>
>
>
>