replicate/distribute oddities in 2.0.0 Was Re: [Gluster-devel] rc8
Liam Slusser
lslusser at gmail.com
Tue May 12 20:13:49 UTC 2009
I was under the impression that the namespace volume is not needed with 2.0
replication.
liam
On Tue, May 12, 2009 at 12:54 PM, Alexey Filin <alexey.filin at gmail.com> wrote:
> Hi,
>
> I'm not sure, but where is the namespace volume in these spec files?
>
> http://gluster.org/docs/index.php/User_Guide#Namespace
>
> "Namespace volume needed because:
>
> - persistent inode numbers.
> - file exists even when node is down."
>
> cheers,
>
> Alexey.
>
>
> On Tue, May 12, 2009 at 3:29 AM, Liam Slusser <lslusser at gmail.com> wrote:
>
>>
>> Even after manually fixing (adding or removing) the extended attributes, I
>> was never able to get Gluster to see the missing files. So I ended up
>> writing a quick program that walked each raw brick's filesystem, checked
>> that every file was also visible through the Gluster mount, and tagged any
>> that weren't. Once that job was done I shut down Gluster, moved all the
>> missing files off the raw bricks into temporary storage, restarted Gluster,
>> and copied the files back into their directories. That fixed the missing
>> file problems.
>>
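>> For anyone else who hits this, the repair step was roughly the following
>> (simplified to one brick; missing.list, BRICK, MOUNT and TMP are just
>> example names, with the paths from my earlier mails):
>>
>>   # given missing.list (one brick-relative path per line) from that scan,
>>   # and with glusterfsd stopped: move the orphans off this brick
>>   BRICK=/intstore/intstore01c/gcdata
>>   MOUNT=/intstore
>>   TMP=/root/missing-tmp
>>   while read -r f; do
>>       mkdir -p "$TMP/$(dirname "$f")"
>>       mv "$BRICK/$f" "$TMP/$f"
>>   done < missing.list
>>
>>   # then, with glusterfsd running again: copy them back in through the mount
>>   while read -r f; do
>>       cp "$TMP/$f" "$MOUNT/$f"
>>   done < missing.list
>>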
>> I'd still like to find out why Gluster ignores certain files that don't
>> have the correct attributes. Even removing all of a file's attributes
>> wouldn't fix the problem, and I also tried manually copying a file into a
>> brick, which it still wouldn't find. It would be nice to be able to
>> manually copy files into a brick and then set an extended attribute flag
>> that would cause Gluster to pick up the new file(s) and copy them to all
>> bricks after an ls -alR. Or, even better, have that happen automatically
>> when new files without attributes are found in a brick.
>>
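>> For the record, the manual attempt was nothing more than this (destination
>> paths are the ones from my earlier mails, the source path is just an
>> example), and the file never became visible on the client:
>>
>>   # on server1: drop the file straight onto one brick
>>   cp /tmp/049891002526_01_09.wma.sigKey01.k \
>>      /intstore/intstore01c/gcdata/data/tracks/tmg/2008_02_05/049891002526/
>>
>>   # on a client: poke the directory and hope gluster notices
>>   ls -alR /intstore/data/tracks/tmg/2008_02_05/049891002526 > /dev/null
>>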
>> thanks,
>> liam
>>
>>
>> On Wed, May 6, 2009 at 4:13 PM, Liam Slusser <lslusser at gmail.com> wrote:
>>
>>>
>>> To answer part of my own question: it looks like those files were copied
>>> with Gluster 1.3.12, which is why they have different extended attributes.
>>>
>>> Gluster 1.3.12:
>>>
>>> Attribute "glusterfs.createtime" has a 10 byte value for file
>>> Attribute "glusterfs.version" has a 1 byte value for file
>>> Attribute "glusterfs.dht" has a 16 byte value for file
>>>
>>> while Gluster 2.0.0 has:
>>>
>>> Attribute "afr.brick2b" has a 12 byte value for file
>>> Attribute "afr.brick1b" has a 12 byte value for file
>>>
>>> I've been unsuccessful at fixing the attributes; can anybody point me in
>>> the right direction?
>>>
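>>> Would stripping the old 1.3.12 attributes off the affected files, so that
>>> 2.0.0 can lay down its own, be a safe thing to try? Something along these
>>> lines is what I had in mind (untested; getfattr shows the fully namespaced
>>> names, which attr -l does not):
>>>
>>>   F=/intstore/intstore01c/gcdata/data/tracks/tmg/2008_02_05/049891002526/049891002526_01_09.wma.sigKey01.k
>>>
>>>   # see exactly what is on the file, names and values
>>>   getfattr --absolute-names -d -m . -e hex "$F"
>>>
>>>   # drop only the old glusterfs.* attributes, leave selinux etc. alone
>>>   for a in $(getfattr --absolute-names -m . "$F" | grep 'glusterfs\.'); do
>>>       setfattr -x "$a" "$F"
>>>   done
>>>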
>>> thanks,
>>> liam
>>>
>>> On Wed, May 6, 2009 at 12:48 PM, Liam Slusser <lslusser at gmail.com> wrote:
>>>
>>>> Big thanks to the devel group for fixing all the memory leak issues
>>>> with the earlier RC releases. 2.0.0 has been great so far without any
>>>> memory issues whatsoever.
>>>>
>>>> However, I am seeing some oddities with the replicate/distribute
>>>> translators. I have three partitions on each Gluster server, each exported
>>>> as a brick, and we have two servers. The clients replicate each brick
>>>> between the two servers, and a distribute translator sits on top of the
>>>> three replicated pairs - basically Gluster RAID 10 (vol files below).
>>>>
>>>> There are a handful of files that were copied into the Gluster volume but
>>>> have since disappeared, even though the physical files still exist on both
>>>> bricks.
>>>>
>>>> (from a client)
>>>>
>>>> [root at client1 049891002526]# pwd
>>>> /intstore/data/tracks/tmg/2008_02_05/049891002526
>>>> [root at client1 049891002526]# ls -al 049891002526_01_09.wma.sigKey01.k
>>>> ls: 049891002526_01_09.wma.sigKey01.k: No such file or directory
>>>> [root at client1 049891002526]# head 049891002526_01_09.wma.sigKey01.k
>>>> head: cannot open `049891002526_01_09.wma.sigKey01.k' for reading: No
>>>> such file or directory
>>>> [root at client1 049891002526]#
>>>>
>>>>
>>>> (from a server brick)
>>>>
>>>> [root at server1 049891002526]# pwd
>>>> /intstore/intstore01c/gcdata/data/tracks/tmg/2008_02_05/049891002526
>>>> [root at server1 049891002526]# ls -al 049891002526_01_09.wma.sigKey01.k
>>>> -rw-rw-rw- 1 10015 root 19377712 Feb 6 2008
>>>> 049891002526_01_09.wma.sigKey01.k
>>>> [root at server1 049891002526]# attr -l 049891002526_01_09.wma.sigKey01.k
>>>> Attribute "glusterfs.createtime" has a 10 byte value for
>>>> 049891002526_01_09.wma.sigKey01.k
>>>> Attribute "glusterfs.version" has a 1 byte value for
>>>> 049891002526_01_09.wma.sigKey01.k
>>>> Attribute "selinux" has a 24 byte value for
>>>> 049891002526_01_09.wma.sigKey01.k
>>>> [root at server1 049891002526]# attr -l .
>>>> Attribute "glusterfs.createtime" has a 10 byte value for .
>>>> Attribute "glusterfs.version" has a 1 byte value for .
>>>> Attribute "glusterfs.dht" has a 16 byte value for .
>>>> Attribute "selinux" has a 24 byte value for .
>>>>
>>>>
>>>> There is nothing in either the client or the server logs. I've tried all
>>>> the normal replication checks and self-heal triggers such as ls -alR. If I
>>>> copy the file back from one of the bricks into the volume it shows up
>>>> again, but distribute only gives it a 1/3 chance of landing on the file's
>>>> original replicated pair, so I can end up with two identical files on two
>>>> different bricks.
>>>>
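>>>> If it helps anyone, spotting those duplicates from one server is easy,
>>>> since every replicated pair has one brick on this box: the same relative
>>>> path under more than one brick directory means a duplicate (brick paths as
>>>> in the vol files below):
>>>>
>>>>   # list relative paths that exist under more than one brick on server1
>>>>   for b in intstore01a intstore01b intstore01c; do
>>>>       (cd /intstore/$b/gcdata && find . -type f)
>>>>   done | sort | uniq -d
>>>>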
>>>> This volume has over 40 million files and directories, so finding
>>>> anomalies is tedious. I wrote a quick perl script to check 1/25 of the
>>>> files in the volume for missing files and md5 checksum differences; as of
>>>> now it is about 15% done (138,500 files) and has found ~7000 missing files
>>>> and 0 md5 mismatches.
>>>>
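>>>> The guts of that script are nothing more than this kind of comparison
>>>> (shell sketch, not the actual perl; "every 25th file" stands in for the
>>>> real sampling, and it assumes a client mount is reachable from wherever it
>>>> runs):
>>>>
>>>>   BRICK=/intstore/intstore01c/gcdata
>>>>   MOUNT=/intstore
>>>>   cd "$BRICK" || exit 1
>>>>   find . -type f | awk 'NR % 25 == 0' | while read -r f; do
>>>>       c="$MOUNT/${f#./}"
>>>>       if [ ! -e "$c" ]; then
>>>>           echo "MISSING $f"
>>>>       elif [ "$(md5sum < "$f" | cut -d' ' -f1)" != \
>>>>              "$(md5sum < "$c" | cut -d' ' -f1)" ]; then
>>>>           echo "MD5 MISMATCH $f"
>>>>       fi
>>>>   done
>>>>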
>>>> How could I debug this? I'd imagine it has something to do with the
>>>> extended attributes on either the file or the parent directory... but as
>>>> far as I can tell they all look fine.
>>>>
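>>>> In case the raw attribute values matter, they can be dumped like this
>>>> (getfattr prints the values, which attr -l does not), for both the file
>>>> and its parent directory:
>>>>
>>>>   cd /intstore/intstore01c/gcdata/data/tracks/tmg/2008_02_05/049891002526
>>>>   getfattr --absolute-names -d -m . -e hex 049891002526_01_09.wma.sigKey01.k
>>>>   getfattr --absolute-names -d -m . -e hex .
>>>>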
>>>> thanks,
>>>> liam
>>>>
>>>> client glusterfs.vol:
>>>>
>>>> volume brick1a
>>>> type protocol/client
>>>> option transport-type tcp
>>>> option remote-host server1
>>>> option remote-subvolume brick1a
>>>> end-volume
>>>>
>>>> volume brick1b
>>>> type protocol/client
>>>> option transport-type tcp
>>>> option remote-host server1
>>>> option remote-subvolume brick1b
>>>> end-volume
>>>>
>>>> volume brick1c
>>>> type protocol/client
>>>> option transport-type tcp
>>>> option remote-host server1
>>>> option remote-subvolume brick1c
>>>> end-volume
>>>>
>>>> volume brick2a
>>>> type protocol/client
>>>> option transport-type tcp
>>>> option remote-host server2
>>>> option remote-subvolume brick2a
>>>> end-volume
>>>>
>>>> volume brick2b
>>>> type protocol/client
>>>> option transport-type tcp
>>>> option remote-host server2
>>>> option remote-subvolume brick2b
>>>> end-volume
>>>>
>>>> volume brick2c
>>>> type protocol/client
>>>> option transport-type tcp
>>>> option remote-host server2
>>>> option remote-subvolume brick2c
>>>> end-volume
>>>>
>>>> volume bricks1
>>>> type cluster/replicate
>>>> subvolumes brick1a brick2a
>>>> end-volume
>>>>
>>>> volume bricks2
>>>> type cluster/replicate
>>>> subvolumes brick1b brick2b
>>>> end-volume
>>>>
>>>> volume bricks3
>>>> type cluster/replicate
>>>> subvolumes brick1c brick2c
>>>> end-volume
>>>>
>>>> volume distribute
>>>> type cluster/distribute
>>>> subvolumes bricks1 bricks2 bricks3
>>>> end-volume
>>>>
>>>> volume writebehind
>>>> type performance/write-behind
>>>> option block-size 1MB
>>>> option cache-size 64MB
>>>> option flush-behind on
>>>> subvolumes distribute
>>>> end-volume
>>>>
>>>> volume cache
>>>> type performance/io-cache
>>>> option cache-size 2048MB
>>>> subvolumes writebehind
>>>> end-volume
>>>>
>>>> server glusterfsd.vol:
>>>>
>>>> volume intstore01a
>>>> type storage/posix
>>>> option directory /intstore/intstore01a/gcdata
>>>> end-volume
>>>>
>>>> volume intstore01b
>>>> type storage/posix
>>>> option directory /intstore/intstore01b/gcdata
>>>> end-volume
>>>>
>>>> volume intstore01c
>>>> type storage/posix
>>>> option directory /intstore/intstore01c/gcdata
>>>> end-volume
>>>>
>>>> volume locksa
>>>> type features/posix-locks
>>>> option mandatory-locks on
>>>> subvolumes intstore01a
>>>> end-volume
>>>>
>>>> volume locksb
>>>> type features/posix-locks
>>>> option mandatory-locks on
>>>> subvolumes intstore01b
>>>> end-volume
>>>>
>>>> volume locksc
>>>> type features/posix-locks
>>>> option mandatory-locks on
>>>> subvolumes intstore01c
>>>> end-volume
>>>>
>>>> volume brick1a
>>>> type performance/io-threads
>>>> option thread-count 32
>>>> subvolumes locksa
>>>> end-volume
>>>>
>>>> volume brick1b
>>>> type performance/io-threads
>>>> option thread-count 32
>>>> subvolumes locksb
>>>> end-volume
>>>>
>>>> volume brick1c
>>>> type performance/io-threads
>>>> option thread-count 32
>>>> subvolumes locksc
>>>> end-volume
>>>>
>>>> volume server
>>>> type protocol/server
>>>> option transport-type tcp
>>>> option auth.addr.brick1a.allow 192.168.12.*
>>>> option auth.addr.brick1b.allow 192.168.12.*
>>>> option auth.addr.brick1c.allow 192.168.12.*
>>>> subvolumes brick1a brick1b brick1c
>>>> end-volume
>>>>
>>>>
>>>> On Wed, Apr 22, 2009 at 5:43 PM, Liam Slusser <lslusser at gmail.com> wrote:
>>>>
>>>>>
>>>>> Avati,
>>>>> Big thanks - looks like that did the trick. I'll report back in the
>>>>> morning if anything has changed, but it's looking MUCH better. Thanks again!
>>>>>
>>>>> liam
>>>>>
>>>>> On Wed, Apr 22, 2009 at 2:32 PM, Anand Avati <avati at gluster.com> wrote:
>>>>>
>>>>>> Liam,
>>>>>> An fd leak and a lock structure leak have been fixed in the git
>>>>>> repository, which explains the leak on the first subvolume's server.
>>>>>> Please pull the latest patches and let us know if they do not fix
>>>>>> your issues. Thanks!
>>>>>>
>>>>>> Avati
>>>>>>
>>>>>> On Tue, Apr 21, 2009 at 3:41 PM, Liam Slusser <lslusser at gmail.com>
>>>>>> wrote:
>>>>>> > There is still a memory leak with rc8 on my setup. The first server in a
>>>>>> > cluster of two servers starts out using 18M and just slowly increases.
>>>>>> > After 30 minutes it has doubled in size to over 30M and keeps growing -
>>>>>> > and the more memory it uses, the worse the performance gets. Oddly, the
>>>>>> > second server in my cluster, using the same configuration file, has no
>>>>>> > such memory problem.
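>>>>>> >
>>>>>> > For anyone trying to reproduce it, the growth is easy to watch with a
>>>>>> > loop along these lines:
>>>>>> >
>>>>>> >   # log glusterfsd memory use once a minute
>>>>>> >   while true; do
>>>>>> >       date; ps -o rss=,vsz=,args= -C glusterfsd
>>>>>> >       sleep 60
>>>>>> >   done
>>>>>> >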
>>>>>> > My glusterfsd.vol has no performance translators, just 3 storage/posix ->
>>>>>> > 3 features/posix-locks -> protocol/server.
>>>>>> > thanks,
>>>>>> > liam
>>>>>> > On Mon, Apr 20, 2009 at 2:01 PM, Gordan Bobic <gordan at bobich.net>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> Gordan Bobic wrote:
>>>>>> >>>
>>>>>> >>> First-access failing bug still seems to be present.
>>>>>> >>> But other than that, it seems to be distinctly better than rc4. :)
>>>>>> >>> Good work! :)
>>>>>> >>
>>>>>> >> And that massive memory leak is gone, too! The process hasn't grown by
>>>>>> >> a KB after a kernel compile! :D
>>>>>> >>
>>>>>> >> s/Good work/Awesome work/
>>>>>> >>
>>>>>> >> :)
>>>>>> >>
>>>>>> >>
>>>>>> >> Gordan
>>>>>> >>
>>>>>> >>
>>>>>> >
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>