replicate/distribute oddities in 2.0.0 Was Re: [Gluster-devel] rc8
Liam Slusser
lslusser at gmail.com
Tue May 12 20:13:49 UTC 2009
I was under the impression that the namespace volume is not needed with 2.0
replication.
liam
On Tue, May 12, 2009 at 12:54 PM, Alexey Filin <alexey.filin at gmail.com> wrote:
> Hi,
>
> I'm not sure, but where is the namespace volume in these spec files?
>
> http://gluster.org/docs/index.php/User_Guide#Namespace
>
> "Namespace volume needed because:
>
> - persistent inode numbers.
> - file exists even when node is down."
>
> cheers,
>
> Alexey.
>
>
> On Tue, May 12, 2009 at 3:29 AM, Liam Slusser <lslusser at gmail.com> wrote:
>
>>
>> Even after manually fixing (adding or removing) the extended attributes, I
>> was never able to get Gluster to see the missing files. So I ended up
>> writing a quick program that walked each raw brick's filesystem, checked
>> that every file was also visible through the Gluster mount, and tagged any
>> that weren't. Once that job was done I shut down Gluster, moved all the
>> missing files off the raw bricks into temporary storage, restarted Gluster,
>> and copied the files back into their directories. That fixed the missing
>> file problems.
>>
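>> For anyone else who hits this, the repair step was roughly the following
>> (simplified to one brick; missing.list, BRICK, MOUNT and TMP are just
>> example names, with the paths from my earlier mails):
>>
>>   # given missing.list (one brick-relative path per line) from that scan,
>>   # and with glusterfsd stopped: move the orphans off this brick
>>   BRICK=/intstore/intstore01c/gcdata
>>   MOUNT=/intstore
>>   TMP=/root/missing-tmp
>>   while read -r f; do
>>       mkdir -p "$TMP/$(dirname "$f")"
>>       mv "$BRICK/$f" "$TMP/$f"
>>   done < missing.list
>>
>>   # then, with glusterfsd running again: copy them back in through the mount
>>   while read -r f; do
>>       cp "$TMP/$f" "$MOUNT/$f"
>>   done < missing.list
>>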
>> I'd still like to find out why Gluster ignores certain files that don't
>> have the correct attributes. Even removing all of a file's attributes
>> wouldn't fix the problem, and I also tried manually copying a file into a
>> brick, which it still wouldn't find. It would be nice to be able to
>> manually copy files into a brick and then set an extended attribute flag
>> that would cause Gluster to pick up the new file(s) and copy them to all
>> bricks after an ls -alR. Or, even better, have that happen automatically
>> when new files without attributes are found in a brick.
>>
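>> For the record, the manual attempt was nothing more than this (destination
>> paths are the ones from my earlier mails, the source path is just an
>> example), and the file never became visible on the client:
>>
>>   # on server1: drop the file straight onto one brick
>>   cp /tmp/049891002526_01_09.wma.sigKey01.k \
>>      /intstore/intstore01c/gcdata/data/tracks/tmg/2008_02_05/049891002526/
>>
>>   # on a client: poke the directory and hope gluster notices
>>   ls -alR /intstore/data/tracks/tmg/2008_02_05/049891002526 > /dev/null
>>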
>> thanks,
>> liam
>>
>>
>> On Wed, May 6, 2009 at 4:13 PM, Liam Slusser <lslusser at gmail.com> wrote:
>>
>>>
>>> To answer part of my own question: it looks like those files were copied
>>> with Gluster 1.3.12, which is why they have different extended attributes.
>>>
>>> Gluster 1.3.12:
>>>
>>> Attribute "glusterfs.createtime" has a 10 byte value for file
>>> Attribute "glusterfs.version" has a 1 byte value for file
>>> Attribute "glusterfs.dht" has a 16 byte value for file
>>>
>>> while Gluster 2.0.0 has:
>>>
>>> Attribute "afr.brick2b" has a 12 byte value for file
>>> Attribute "afr.brick1b" has a 12 byte value for file
>>>
>>> I've been unsuccessful at fixing the attributes; can anybody point me in
>>> the right direction?
>>>
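>>> Would stripping the old 1.3.12 attributes off the affected files, so that
>>> 2.0.0 can lay down its own, be a safe thing to try? Something along these
>>> lines is what I had in mind (untested; getfattr shows the fully namespaced
>>> names, which attr -l does not):
>>>
>>>   F=/intstore/intstore01c/gcdata/data/tracks/tmg/2008_02_05/049891002526/049891002526_01_09.wma.sigKey01.k
>>>
>>>   # see exactly what is on the file, names and values
>>>   getfattr --absolute-names -d -m . -e hex "$F"
>>>
>>>   # drop only the old glusterfs.* attributes, leave selinux etc. alone
>>>   for a in $(getfattr --absolute-names -m . "$F" | grep 'glusterfs\.'); do
>>>       setfattr -x "$a" "$F"
>>>   done
>>>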
>>> thanks,
>>> liam
>>>
>>> On Wed, May 6, 2009 at 12:48 PM, Liam Slusser <lslusser at gmail.com> wrote:
>>>
>>>> Big thanks to the devel group for fixing all the memory leak issues
>>>> with the earlier RC releases. 2.0.0 has been great so far without any
>>>> memory issues whatsoever.
>>>>
>>>> However, I am seeing some oddities with the replicate/distribute
>>>> translators. I have three partitions on each Gluster server, each exported
>>>> as a brick, and we have two servers. The clients replicate each brick
>>>> between the two servers, and a distribute translator sits on top of the
>>>> three replicated pairs - basically Gluster RAID 10 (vol files below).
>>>>
>>>> There are a handful of files that were copied into the Gluster volume but
>>>> have since disappeared, even though the physical files still exist on both
>>>> bricks.
>>>>
>>>> (from a client)
>>>>
>>>> [root at client1 049891002526]# pwd
>>>> /intstore/data/tracks/tmg/2008_02_05/049891002526
>>>> [root at client1 049891002526]# ls -al 049891002526_01_09.wma.sigKey01.k
>>>> ls: 049891002526_01_09.wma.sigKey01.k: No such file or directory
>>>> [root at client1 049891002526]# head 049891002526_01_09.wma.sigKey01.k
>>>> head: cannot open `049891002526_01_09.wma.sigKey01.k' for reading: No
>>>> such file or directory
>>>> [root at client1 049891002526]#
>>>>
>>>>
>>>> (from a server brick)
>>>>
>>>> [root at server1 049891002526]# pwd
>>>> /intstore/intstore01c/gcdata/data/tracks/tmg/2008_02_05/049891002526
>>>> [root at server1 049891002526]# ls -al 049891002526_01_09.wma.sigKey01.k
>>>> -rw-rw-rw- 1 10015 root 19377712 Feb 6 2008
>>>> 049891002526_01_09.wma.sigKey01.k
>>>> [root at server1 049891002526]# attr -l 049891002526_01_09.wma.sigKey01.k
>>>> Attribute "glusterfs.createtime" has a 10 byte value for
>>>> 049891002526_01_09.wma.sigKey01.k
>>>> Attribute "glusterfs.version" has a 1 byte value for
>>>> 049891002526_01_09.wma.sigKey01.k
>>>> Attribute "selinux" has a 24 byte value for
>>>> 049891002526_01_09.wma.sigKey01.k
>>>> [root at server1 049891002526]# attr -l .
>>>> Attribute "glusterfs.createtime" has a 10 byte value for .
>>>> Attribute "glusterfs.version" has a 1 byte value for .
>>>> Attribute "glusterfs.dht" has a 16 byte value for .
>>>> Attribute "selinux" has a 24 byte value for .
>>>>
>>>>
>>>> There is nothing in either the client or the server logs. I've tried all
>>>> the normal replication checks and self-heal triggers such as ls -alR. If I
>>>> copy the file back from one of the bricks into the volume it shows up
>>>> again, but distribute only gives it a 1/3 chance of landing on the file's
>>>> original replicated pair, so I can end up with two identical files on two
>>>> different bricks.
>>>>
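>>>> If it helps anyone, spotting those duplicates from one server is easy,
>>>> since every replicated pair has one brick on this box: the same relative
>>>> path under more than one brick directory means a duplicate (brick paths as
>>>> in the vol files below):
>>>>
>>>>   # list relative paths that exist under more than one brick on server1
>>>>   for b in intstore01a intstore01b intstore01c; do
>>>>       (cd /intstore/$b/gcdata && find . -type f)
>>>>   done | sort | uniq -d
>>>>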
>>>> This volume has over 40 million files and directories, so finding
>>>> anomalies is tedious. I wrote a quick perl script to check 1/25 of the
>>>> files in the volume for missing files and md5 checksum differences; as of
>>>> now it is about 15% done (138,500 files) and has found ~7000 missing files
>>>> and 0 md5 mismatches.
>>>>
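>>>> The guts of that script are nothing more than this kind of comparison
>>>> (shell sketch, not the actual perl; "every 25th file" stands in for the
>>>> real sampling, and it assumes a client mount is reachable from wherever it
>>>> runs):
>>>>
>>>>   BRICK=/intstore/intstore01c/gcdata
>>>>   MOUNT=/intstore
>>>>   cd "$BRICK" || exit 1
>>>>   find . -type f | awk 'NR % 25 == 0' | while read -r f; do
>>>>       c="$MOUNT/${f#./}"
>>>>       if [ ! -e "$c" ]; then
>>>>           echo "MISSING $f"
>>>>       elif [ "$(md5sum < "$f" | cut -d' ' -f1)" != \
>>>>              "$(md5sum < "$c" | cut -d' ' -f1)" ]; then
>>>>           echo "MD5 MISMATCH $f"
>>>>       fi
>>>>   done
>>>>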
>>>> How could I debug this? I'd imagine it has something to do with the
>>>> extended attributes on either the file or the parent directory... but as
>>>> far as I can tell they all look fine.
>>>>
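>>>> In case the raw attribute values matter, they can be dumped like this
>>>> (getfattr prints the values, which attr -l does not), for both the file
>>>> and its parent directory:
>>>>
>>>>   cd /intstore/intstore01c/gcdata/data/tracks/tmg/2008_02_05/049891002526
>>>>   getfattr --absolute-names -d -m . -e hex 049891002526_01_09.wma.sigKey01.k
>>>>   getfattr --absolute-names -d -m . -e hex .
>>>>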
>>>> thanks,
>>>> liam
>>>>
>>>> client glusterfs.vol:
>>>>
>>>> volume brick1a
>>>> type protocol/client
>>>> option transport-type tcp
>>>> option remote-host server1
>>>> option remote-subvolume brick1a
>>>> end-volume
>>>>
>>>> volume brick1b
>>>> type protocol/client
>>>> option transport-type tcp
>>>> option remote-host server1
>>>> option remote-subvolume brick1b
>>>> end-volume
>>>>
>>>> volume brick1c
>>>> type protocol/client
>>>> option transport-type tcp
>>>> option remote-host server1
>>>> option remote-subvolume brick1c
>>>> end-volume
>>>>
>>>> volume brick2a
>>>> type protocol/client
>>>> option transport-type tcp
>>>> option remote-host server2
>>>> option remote-subvolume brick2a
>>>> end-volume
>>>>
>>>> volume brick2b
>>>> type protocol/client
>>>> option transport-type tcp
>>>> option remote-host server2
>>>> option remote-subvolume brick2b
>>>> end-volume
>>>>
>>>> volume brick2c
>>>> type protocol/client
>>>> option transport-type tcp
>>>> option remote-host server2
>>>> option remote-subvolume brick2c
>>>> end-volume
>>>>
>>>> volume bricks1
>>>> type cluster/replicate
>>>> subvolumes brick1a brick2a
>>>> end-volume
>>>>
>>>> volume bricks2
>>>> type cluster/replicate
>>>> subvolumes brick1b brick2b
>>>> end-volume
>>>>
>>>> volume bricks3
>>>> type cluster/replicate
>>>> subvolumes brick1c brick2c
>>>> end-volume
>>>>
>>>> volume distribute
>>>> type cluster/distribute
>>>> subvolumes bricks1 bricks2 bricks3
>>>> end-volume
>>>>
>>>> volume writebehind
>>>> type performance/write-behind
>>>> option block-size 1MB
>>>> option cache-size 64MB
>>>> option flush-behind on
>>>> subvolumes distribute
>>>> end-volume
>>>>
>>>> volume cache
>>>> type performance/io-cache
>>>> option cache-size 2048MB
>>>> subvolumes writebehind
>>>> end-volume
>>>>
>>>> server glusterfsd.vol:
>>>>
>>>> volume intstore01a
>>>> type storage/posix
>>>> option directory /intstore/intstore01a/gcdata
>>>> end-volume
>>>>
>>>> volume intstore01b
>>>> type storage/posix
>>>> option directory /intstore/intstore01b/gcdata
>>>> end-volume
>>>>
>>>> volume intstore01c
>>>> type storage/posix
>>>> option directory /intstore/intstore01c/gcdata
>>>> end-volume
>>>>
>>>> volume locksa
>>>> type features/posix-locks
>>>> option mandatory-locks on
>>>> subvolumes intstore01a
>>>> end-volume
>>>>
>>>> volume locksb
>>>> type features/posix-locks
>>>> option mandatory-locks on
>>>> subvolumes intstore01b
>>>> end-volume
>>>>
>>>> volume locksc
>>>> type features/posix-locks
>>>> option mandatory-locks on
>>>> subvolumes intstore01c
>>>> end-volume
>>>>
>>>> volume brick1a
>>>> type performance/io-threads
>>>> option thread-count 32
>>>> subvolumes locksa
>>>> end-volume
>>>>
>>>> volume brick1b
>>>> type performance/io-threads
>>>> option thread-count 32
>>>> subvolumes locksb
>>>> end-volume
>>>>
>>>> volume brick1c
>>>> type performance/io-threads
>>>> option thread-count 32
>>>> subvolumes locksc
>>>> end-volume
>>>>
>>>> volume server
>>>> type protocol/server
>>>> option transport-type tcp
>>>> option auth.addr.brick1a.allow 192.168.12.*
>>>> option auth.addr.brick1b.allow 192.168.12.*
>>>> option auth.addr.brick1c.allow 192.168.12.*
>>>> subvolumes brick1a brick1b brick1c
>>>> end-volume
>>>>
>>>>
>>>> On Wed, Apr 22, 2009 at 5:43 PM, Liam Slusser <lslusser at gmail.com> wrote:
>>>>
>>>>>
>>>>> Avati,
>>>>> Big thanks - looks like that did the trick. I'll report back in the
>>>>> morning if anything has changed, but it's looking MUCH better. Thanks again!
>>>>>
>>>>> liam
>>>>>
>>>>> On Wed, Apr 22, 2009 at 2:32 PM, Anand Avati <avati at gluster.com> wrote:
>>>>>
>>>>>> Liam,
>>>>>> An fd leak and a lock structure leak have been fixed in the git
>>>>>> repository, which explains the leak on the first subvolume's server.
>>>>>> Please pull the latest patches and let us know if they do not fix
>>>>>> your issues. Thanks!
>>>>>>
>>>>>> Avati
>>>>>>
>>>>>> On Tue, Apr 21, 2009 at 3:41 PM, Liam Slusser <lslusser at gmail.com>
>>>>>> wrote:
>>>>>> > There is still a memory leak with rc8 on my setup. The first server in a
>>>>>> > cluster of two servers starts out using 18M and just slowly increases.
>>>>>> > After 30 minutes it has doubled in size to over 30M and keeps growing -
>>>>>> > and the more memory it uses, the worse the performance gets. Oddly, the
>>>>>> > second server in my cluster, using the same configuration file, has no
>>>>>> > such memory problem.
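>>>>>> >
>>>>>> > For anyone trying to reproduce it, the growth is easy to watch with a
>>>>>> > loop along these lines:
>>>>>> >
>>>>>> >   # log glusterfsd memory use once a minute
>>>>>> >   while true; do
>>>>>> >       date; ps -o rss=,vsz=,args= -C glusterfsd
>>>>>> >       sleep 60
>>>>>> >   done
>>>>>> >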
>>>>>> > My glusterfsd.vol has no performance translators, just 3 storage/posix ->
>>>>>> > 3 features/posix-locks -> protocol/server.
>>>>>> > thanks,
>>>>>> > liam
>>>>>> > On Mon, Apr 20, 2009 at 2:01 PM, Gordan Bobic <gordan at bobich.net>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> Gordan Bobic wrote:
>>>>>> >>>
>>>>>> >>> First-access failing bug still seems to be present.
>>>>>> >>> But other than that, it seems to be distinctly better than rc4. :)
>>>>>> >>> Good work! :)
>>>>>> >>
>>>>>> >> And that massive memory leak is gone, too! The process hasn't grown by
>>>>>> >> a KB after a kernel compile! :D
>>>>>> >>
>>>>>> >> s/Good work/Awesome work/
>>>>>> >>
>>>>>> >> :)
>>>>>> >>
>>>>>> >>
>>>>>> >> Gordan
>>>>>> >>
>>>>>> >>
>>>>>> >
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>