replicate/distribute oddities in 2.0.0 (was Re: [Gluster-devel] rc8)
Liam Slusser
lslusser at gmail.com
Wed May 6 19:48:26 UTC 2009
Big thanks to the devel group for fixing all the memory leak issues with the
earlier RC releases. 2.0.0 has been great so far without any memory issues
whatsoever.
However, I am seeing some oddities with the replicate/distribute translators.
We have two gluster servers, each exporting three partitions as three bricks.
The client replicates each brick between the two servers, and a distribute
translator then spans the three replicated pairs - basically a gluster raid10.
(The full client and server volfiles are pasted at the bottom of this mail.)
There are a handful of files which were copied into the gluster volume but
have since disappeared from the client mount, even though the physical files
still exist on both bricks.
(from a client)
[root@client1 049891002526]# pwd
/intstore/data/tracks/tmg/2008_02_05/049891002526
[root@client1 049891002526]# ls -al 049891002526_01_09.wma.sigKey01.k
ls: 049891002526_01_09.wma.sigKey01.k: No such file or directory
[root@client1 049891002526]# head 049891002526_01_09.wma.sigKey01.k
head: cannot open `049891002526_01_09.wma.sigKey01.k' for reading: No such file or directory
[root@client1 049891002526]#
(from a server brick)
[root@server1 049891002526]# pwd
/intstore/intstore01c/gcdata/data/tracks/tmg/2008_02_05/049891002526
[root@server1 049891002526]# ls -al 049891002526_01_09.wma.sigKey01.k
-rw-rw-rw- 1 10015 root 19377712 Feb 6 2008 049891002526_01_09.wma.sigKey01.k
[root@server1 049891002526]# attr -l 049891002526_01_09.wma.sigKey01.k
Attribute "glusterfs.createtime" has a 10 byte value for 049891002526_01_09.wma.sigKey01.k
Attribute "glusterfs.version" has a 1 byte value for 049891002526_01_09.wma.sigKey01.k
Attribute "selinux" has a 24 byte value for 049891002526_01_09.wma.sigKey01.k
[root@server1 049891002526]# attr -l .
Attribute "glusterfs.createtime" has a 10 byte value for .
Attribute "glusterfs.version" has a 1 byte value for .
Attribute "glusterfs.dht" has a 16 byte value for .
Attribute "selinux" has a 24 byte value for .
There is nothing in either the client or the server logs. I've tried all the
normal replication checks and self-heal triggers such as ls -alR. If I copy
the file back from one of the bricks into the volume it shows up again,
however it only has a 1/3 chance of being written to the file's original
location, so I can end up with two identical files on two different bricks.
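(By the normal self-heal pass I mean something along these lines, run against
the client mount - /intstore is our mount point, taken from the example above.
Stat'ing the directories and reading the first byte of every file is, as I
understand it, the usual way of making replicate revisit each entry:)

# recursive listing to touch every directory, then read one byte of each file
find /intstore/data/tracks -type d -exec ls -al {} \; > /dev/null
find /intstore/data/tracks -type f -exec head -c1 {} \; > /dev/null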
This volume has over 40 million files and directories, so it can be very
tedious to find anomalies. I wrote a quick perl script to check 1/25 of the
files in the volume for missing files and md5 checksum differences; as of now
it is about 15% complete (138,500 files) and has found ~7000 missing files and
0 md5 checksum differences.
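(For the curious, the kind of check it does is in the spirit of the shell
sketch below - not the actual perl. It assumes a box that can see both a brick
and a glusterfs mount of the volume; the brick path is server1's from above,
and /mnt/gluster is just a made-up mount point for the sake of the example:)

# sample roughly every 25th file on one brick and make sure it is visible,
# and byte-identical, through the glusterfs mount
cd /intstore/intstore01c/gcdata
find data/tracks -type f | awk 'NR % 25 == 0' | while IFS= read -r f; do
  if [ ! -e "/mnt/gluster/$f" ]; then
    echo "MISSING $f"
  elif [ "$(md5sum < "$f")" != "$(md5sum < "/mnt/gluster/$f")" ]; then
    echo "DIFF $f"
  fi
done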
How can I debug this? I'd imagine it has something to do with the extended
attributes on either the file or the parent directory... but as far as I can
tell those all look fine.
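(If the raw values would help, I can dump them in hex rather than just the
sizes that attr -l prints - something like the following on each server,
assuming getfattr is installed there:)

# dump every extended attribute and its value in hex for the parent directory
getfattr -d -m . -e hex /intstore/intstore01c/gcdata/data/tracks/tmg/2008_02_05/049891002526
# and for the file itself
getfattr -d -m . -e hex /intstore/intstore01c/gcdata/data/tracks/tmg/2008_02_05/049891002526/049891002526_01_09.wma.sigKey01.k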
thanks,
liam
client glusterfs.vol:
volume brick1a
  type protocol/client
  option transport-type tcp
  option remote-host server1
  option remote-subvolume brick1a
end-volume

volume brick1b
  type protocol/client
  option transport-type tcp
  option remote-host server1
  option remote-subvolume brick1b
end-volume

volume brick1c
  type protocol/client
  option transport-type tcp
  option remote-host server1
  option remote-subvolume brick1c
end-volume

volume brick2a
  type protocol/client
  option transport-type tcp
  option remote-host server2
  option remote-subvolume brick2a
end-volume

volume brick2b
  type protocol/client
  option transport-type tcp
  option remote-host server2
  option remote-subvolume brick2b
end-volume

volume brick2c
  type protocol/client
  option transport-type tcp
  option remote-host server2
  option remote-subvolume brick2c
end-volume

volume bricks1
  type cluster/replicate
  subvolumes brick1a brick2a
end-volume

volume bricks2
  type cluster/replicate
  subvolumes brick1b brick2b
end-volume

volume bricks3
  type cluster/replicate
  subvolumes brick1c brick2c
end-volume

volume distribute
  type cluster/distribute
  subvolumes bricks1 bricks2 bricks3
end-volume

volume writebehind
  type performance/write-behind
  option block-size 1MB
  option cache-size 64MB
  option flush-behind on
  subvolumes distribute
end-volume

volume cache
  type performance/io-cache
  option cache-size 2048MB
  subvolumes writebehind
end-volume
server glusterfsd.vol:
volume intstore01a
  type storage/posix
  option directory /intstore/intstore01a/gcdata
end-volume

volume intstore01b
  type storage/posix
  option directory /intstore/intstore01b/gcdata
end-volume

volume intstore01c
  type storage/posix
  option directory /intstore/intstore01c/gcdata
end-volume

volume locksa
  type features/posix-locks
  option mandatory-locks on
  subvolumes intstore01a
end-volume

volume locksb
  type features/posix-locks
  option mandatory-locks on
  subvolumes intstore01b
end-volume

volume locksc
  type features/posix-locks
  option mandatory-locks on
  subvolumes intstore01c
end-volume

volume brick1a
  type performance/io-threads
  option thread-count 32
  subvolumes locksa
end-volume

volume brick1b
  type performance/io-threads
  option thread-count 32
  subvolumes locksb
end-volume

volume brick1c
  type performance/io-threads
  option thread-count 32
  subvolumes locksc
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.brick1a.allow 192.168.12.*
  option auth.addr.brick1b.allow 192.168.12.*
  option auth.addr.brick1c.allow 192.168.12.*
  subvolumes brick1a brick1b brick1c
end-volume
On Wed, Apr 22, 2009 at 5:43 PM, Liam Slusser <lslusser at gmail.com> wrote:
>
> Avati,
> Big thanks. Looks like that did the trick. I'll report back in the
> morning if anything has changed, but it's looking MUCH better. Thanks again!
>
> liam
>
> On Wed, Apr 22, 2009 at 2:32 PM, Anand Avati <avati at gluster.com> wrote:
>
>> Liam,
>> An fd leak and a lock structure leak have been fixed in the git
>> repository, which explains a leak in the first subvolume's server.
>> Please pull the latest patches and let us know if it does not fix
>> your issues. Thanks!
>>
>> Avati
>>
>> On Tue, Apr 21, 2009 at 3:41 PM, Liam Slusser <lslusser at gmail.com> wrote:
>> > There is still a memory leak with rc8 on my setup. The first server in a
>> > cluster of two servers starts out using 18M and just slowly increases.
>> > After 30 mins it has doubled in size to over 30M and just keeps growing -
>> > the more memory it uses, the worse the performance. Funny that the second
>> > server in my cluster, using the same configuration file, has no such
>> > memory problem.
>> > My glusterfsd.vol has no performance translators, just 3 storage/posix ->
>> > 3 features/posix-locks -> protocol/server.
>> > thanks,
>> > liam
>> > On Mon, Apr 20, 2009 at 2:01 PM, Gordan Bobic <gordan at bobich.net> wrote:
>> >>
>> >> Gordan Bobic wrote:
>> >>>
>> >>> First-access failing bug still seems to be present.
>> >>> But other than that, it seems to be distinctly better than rc4. :)
>> >>> Good work! :)
>> >>
>> >> And that massive memory leak is gone, too! The process hasn't grown by a
>> >> KB after a kernel compile! :D
>> >>
>> >> s/Good work/Awesome work/
>> >>
>> >> :)
>> >>
>> >>
>> >> Gordan