[Gluster-users] help
michael at mjvitale.com
Wed Sep 30 20:56:22 UTC 2009
Please remove me from the list
Michael at mjvitale.com
-----Original Message-----
From: gluster-users-bounces at gluster.org
[mailto:gluster-users-bounces at gluster.org] On Behalf Of
gluster-users-request at gluster.org
Sent: Wednesday, September 30, 2009 3:00 PM
To: gluster-users at gluster.org
Subject: Gluster-users Digest, Vol 17, Issue 49
Send Gluster-users mailing list submissions to
gluster-users at gluster.org
To subscribe or unsubscribe via the World Wide Web, visit
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
or, via email, send a message with subject or body 'help' to
gluster-users-request at gluster.org
You can reach the person managing the list at
gluster-users-owner at gluster.org
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Gluster-users digest..."
Today's Topics:
1. speeding up getdents()/stat() and/or general perf tuning...
(mki-glusterfs at mozone.net)
2. Re: speeding up getdents()/stat() and/or general perf
tuning... (Mark Mielke)
3. Re: speeding up getdents()/stat() and/or general perf
tuning... (mki-glusterfs at mozone.net)
----------------------------------------------------------------------
Message: 1
Date: Tue, 29 Sep 2009 16:14:02 -0700
From: mki-glusterfs at mozone.net
Subject: [Gluster-users] speeding up getdents()/stat() and/or general
perf tuning...
To: gluster-users at gluster.org
Message-ID: <20090929231402.GH27294 at cyclonus.mozone.net>
Content-Type: text/plain; charset=us-ascii
Hi
I've been noticing some long delays when doing a simple `ls' in
directories that haven't been recently accessed on a test glusterfs
system we've put together. The system consists of a 4-node
DHT + AFR (x1) setup running 2.0.6, all with 10GbE connectivity
between the nodes (and no, there is no network bottleneck here:
iperf shows that the throughput between the machines is ~9Gbps).
stat("usr/bin", {st_mode=S_IFDIR|0755, st_size=28672, ...}) = 0 <1.659084>
open("usr/bin", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3 <0.001231>
fstat(3, {st_mode=S_IFDIR|0755, st_size=28672, ...}) = 0 <0.000007>
fcntl(3, F_GETFD) = 0x1 (flags FD_CLOEXEC) <0.000005>
getdents(3, /* 75 entries */, 4096) = 2256 <0.033347>
getdents(3, /* 75 entries */, 4096) = 2280 <0.001858>
getdents(3, /* 41 entries */, 4096) = 1256 <0.000961>
getdents(3, /* 73 entries */, 4096) = 2216 <0.034120>
getdents(3, /* 73 entries */, 4096) = 2296 <0.001393>
getdents(3, /* 11 entries */, 4096) = 344 <0.000481>
getdents(3, /* 72 entries */, 4096) = 2264 <0.028436>
mmap(NULL, 147456, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f061c9b5000 <0.000005>
getdents(3, /* 74 entries */, 4096) = 2304 <0.001711>
getdents(3, /* 42 entries */, 4096) = 1344 <0.001304>
getdents(3, /* 74 entries */, 4096) = 2200 <3.176572> <- SLOW OPERATION
getdents(3, /* 74 entries */, 4096) = 2288 <0.012123>
getdents(3, /* 10 entries */, 4096) = 280 <0.000376>
getdents(3, /* 0 entries */, 4096) = 0 <0.000243>
close(3) = 0 <0.000011>
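The trace above appears to be `strace -T` output, where each call ends with its elapsed time in angle brackets. As a rough sketch (not part of the original post), the slow calls can be picked out of such a trace with a few lines of Python. The function name `slow_calls` and the one-second threshold are assumptions made for illustration only.

```python
import re

# Matches "name(...) = result <seconds>", the shape of a timed strace line.
TIMED_CALL = re.compile(r'^(?P<name>\w+)\(.*<(?P<secs>\d+\.\d+)>')

def slow_calls(lines, threshold=1.0):
    """Return (syscall, seconds) pairs whose elapsed time meets the threshold."""
    hits = []
    for line in lines:
        match = TIMED_CALL.match(line.strip())
        if match and float(match.group("secs")) >= threshold:
            hits.append((match.group("name"), float(match.group("secs"))))
    return hits

# A few lines copied from the trace above.
sample = [
    'stat("usr/bin", {st_mode=S_IFDIR|0755, st_size=28672, ...}) = 0 <1.659084>',
    'getdents(3, /* 75 entries */, 4096) = 2256 <0.033347>',
    'getdents(3, /* 74 entries */, 4096) = 2200 <3.176572> <- SLOW OPERATION',
    'close(3) = 0 <0.000011>',
]

for name, secs in slow_calls(sample):
    print(f"{name}: {secs:.3f}s")
```

On the sample lines this flags the initial stat() and the one slow getdents(), the same two calls that stand out when reading the trace by eye.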
The client configs are simple (stripped down for simplicity's sake):
brick1 tcp 10.10.10.11
brick2 tcp 10.10.10.12
...
brick8 tcp 10.10.10.18
replicate1 brick1 brick2
replicate2 brick3 brick5
replicate3 brick4 brick7
replicate4 brick6 brick8
distribute replicate1 replicate2 replicate3 replicate4
io-threads thread-count 16
readahead page-count 16
io-cache cache-size 512MB
write-behind cache-size 1MB
The storage nodes are even simpler yet (trimmed):
posix background-unlink yes
posix-locks
io-threads thread-count 16
transport-type tcp
The client nodes are running an unpatched fuse 2.7.4 userland (Debian
lenny) and the default kernel 2.6.31.1 fuse module. I'm curious whether
the delay in the getdents/stat calls is caused by fuse or by something
else config-wise that I've managed to miss.
Anyone have any recommendations or thoughts as to how to improve the
performance?
Thanks.
Mohan
------------------------------
Message: 2
Date: Tue, 29 Sep 2009 20:53:44 -0400
From: Mark Mielke <mark at mark.mielke.cc>
Subject: Re: [Gluster-users] speeding up getdents()/stat() and/or
general perf tuning...
To: mki-glusterfs at mozone.net
Cc: gluster-users at gluster.org
Message-ID: <4AC2AC18.2040603 at mark.mielke.cc>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
On 09/29/2009 07:14 PM, mki-glusterfs at mozone.net wrote:
> Hi
>
> I've been noticing some long delays when doing a simple `ls' in
> directories that haven't been recently accessed on a test glusterfs
> system we've put together. The system consists of a 4-node
> DHT + AFR (x1) setup running 2.0.6, all with 10GbE connectivity
> between the nodes (and no, there is no network bottleneck here:
> iperf shows that the throughput between the machines is ~9Gbps).
>
I would suggest following the advice from another thread:
> > http://www.gluster.com/community/documentation/index.php/Translators/cluster/distribute
> >
> > It seems to say that 'lookup-unhashed' defaults to 'on'.
> >
> > Perhaps try turning it 'off'?
>
> Wei,
> There are two things we would like you to try. First is what Mark
> has just pointed out, the 'option lookup-unhashed off' in distribute. The
> second is 'option transport.socket.nodelay on' in each of your
> protocol/client and protocol/server volumes. Do let us know what
> influence these changes have on your performance.
>
> Avati
>
The TCP_NODELAY option seems particularly relevant to me if many small
requests are being issued in sequence, as /bin/ls is likely to do.
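For readers unfamiliar with TCP_NODELAY: it is a standard socket option that disables Nagle's algorithm, which otherwise may hold back small writes while earlier data is still unacknowledged. The sketch below only shows how the option is set on an ordinary TCP socket in Python. It is an illustration of the socket option itself, not GlusterFS code, and the helper name `nodelay_socket` is made up for this example.

```python
import socket

def nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled (TCP_NODELAY set)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock

sock = nodelay_socket()
# Read the option back; any non-zero value means it is enabled.
enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
print("TCP_NODELAY enabled:", enabled)
sock.close()
```

With the option set, each small request is sent as soon as it is written, which is the behaviour one would want for a long series of small, sequential lookups.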
The lookup-unhashed might be relevant to stat() calls issued as a part
of the /bin/ls process.
I've been hitting 'ls' problems myself on another system with NFS and
AutoFS where we have a directory with many symlinks in it. The 'ls' does
a stat() on each of the symlinks, and in the case of auto-mount - it can
take a while... :-)
There is a stat-prefetch module that I do not see documentation for. I
wish there were more comments. A quick skim of it suggests that it
*might* be designed to improve /bin/ls performance. That it's not
documented may mean it is for 2.1 or later?
Cheers,
mark
--
Mark Mielke <mark at mielke.cc>
------------------------------
Message: 3
Date: Wed, 30 Sep 2009 07:28:28 -0700
From: mki-glusterfs at mozone.net
Subject: Re: [Gluster-users] speeding up getdents()/stat() and/or
general perf tuning...
To: gluster-users at gluster.org
Message-ID: <20090930142828.GS27294 at cyclonus.mozone.net>
Content-Type: text/plain; charset=us-ascii
On Tue, Sep 29, 2009 at 08:53:44PM -0400, Mark Mielke wrote:
> The TCP_NODELAY option seems particularly relevant to me if many small
> requests are being issued in sequence, as /bin/ls is likely to do.
>
> The lookup-unhashed might be relevant to stat() calls issued as a part
> of the /bin/ls process.
Thanks Mark! Indeed this makes a significant difference when coupled with
the lookup-unhashed=off option (which I thought I had in place before
because I recall running into a scenario where replication wasn't working
correctly and I had to set that in order to fix it, but that was in
2.0.3 from what I remember.)
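To make the combination concrete, here is a small Python sketch that prints what the two relevant client-side volfile stanzas might look like with both options applied. The volume names and the address come from the trimmed config earlier in the thread. The `remote-subvolume` value and the helper `render_volume` are assumptions for illustration, and the real volfiles were not posted in full. Per Avati's note, the nodelay option would also be needed in the protocol/server volume on each storage node.

```python
def render_volume(name, vol_type, options, subvolumes=()):
    """Render one volume stanza in the GlusterFS volfile text format."""
    lines = [f"volume {name}", f"  type {vol_type}"]
    lines += [f"  option {key} {value}" for key, value in options.items()]
    if subvolumes:
        lines.append("  subvolumes " + " ".join(subvolumes))
    lines.append("end-volume")
    return "\n".join(lines)

client = render_volume(
    "brick1",
    "protocol/client",
    {
        "transport-type": "tcp",
        "remote-host": "10.10.10.11",
        "remote-subvolume": "brick",  # assumed name of the exported volume
        "transport.socket.nodelay": "on",
    },
)

distribute = render_volume(
    "distribute",
    "cluster/distribute",
    {"lookup-unhashed": "off"},
    subvolumes=["replicate1", "replicate2", "replicate3", "replicate4"],
)

print(client)
print()
print(distribute)
```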
> There is a stat-prefetch module that I do not see documentation for. I
> wish there were more comments. A quick skim of it suggests that it
> *might* be designed to improve /bin/ls performance. That it's not
> documented may mean it is for 2.1 or later?
Interesting. I'll have to poke at the codebase this weekend again as
it's been a while since I last looked at it.
Mohan
------------------------------
_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
End of Gluster-users Digest, Vol 17, Issue 49
*********************************************