[Gluster-devel] writebehind slowness
Sebastien LELIEVRE
slelievre at tbs-internet.com
Thu Aug 2 12:48:45 UTC 2007
Hi everyone,
Here is my turn to give you a summary of our latest test results.
First of all, a little reminder:
We're using FUSE 2.6.5,
our GlusterFS is at tla patch 403,
and our config basically works like this:
http://users.info.unicaen.fr/~slelievr/tbs-gluster.jpg
We are using a custom benchmark script like this (we call it with an
argument of 0 below, so dd writes zero data blocks and the run mostly
measures file creation and deletion):
#!/bin/bash
# create and delete $MAX files, each written by dd with block size
# $SIZE and block count $COUNT
MAX=1000
SIZE=1024
COUNT=10
# an optional first argument overrides the block count
[ -n "$1" ] && COUNT=$1
id=0
mkdir bench
while [ $id -lt $MAX ] ; do
    (( id+=1 ))
    echo "File id : $id"
    dd if=/boot/vmlinuz-2.6.16.31-17tbs-smp of=bench/$id bs=$SIZE count=$COUNT
    rm -f bench/$id
done
rmdir bench
Tests:
* op1 + or2 + or3:
~# time ls /mnt/gluster/home/http2
real 0m50.964s
user 0m0.000s
sys 0m0.004s
~# time /tmp/bench.sh 0
real 0m41.734s
user 0m0.736s
sys 0m1.496s
* or2 + or3:
~# time ls /mnt/gluster/home/http2
real 0m6.303s
user 0m0.004s
sys 0m0.008s
~# time /tmp/bench.sh 0
real 0m14.557s
user 0m0.684s
sys 0m1.332s
Avati told us to try without the namespace on op1; here is how it went:
* 1 replicated namespace on or2 and or3, and 1 afr between or2 and or3
~# time /tmp/bench.sh 0
real 0m14.557s
user 0m0.684s
sys 0m1.332s
* 1 namespace on or2 only and 1 afr between or2 and or3
~# time /tmp/bench.sh 0
real 0m10.264s
user 0m0.644s
sys 0m1.328s
* op1 comes back up (still 1 namespace on or2):
~# time /tmp/bench.sh 0
real 0m32.790s
user 0m0.704s
sys 0m1.292s
"--direct-io-mode=write-only" option applied on glusterfs mount doesn't
affect those numbers.
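For reference, this is how we pass the option (the spec file path is
ours and the file name is hypothetical; '-f' points glusterfs at the
client spec, if I recall the command line correctly):

~# glusterfs --direct-io-mode=write-only -f /etc/glusterfs/client.vol /mnt/gluster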
For the tests above, the op1 definition was left in the client spec
file; we just shut the op1 glusterfsd server down.
If we remove the op1 server from the client specification, performance
gets even better! Here it is:
* WITH op1 in client specs, op1 down:
* 1 namespace on or2 and 1 afr between or2 and or3
~# time /tmp/bench.sh 0
real 0m10.264s
user 0m0.644s
sys 0m1.328s
* WITHOUT op1 in specs:
~# time /tmp/bench.sh 0
real 0m5.743s
user 0m0.684s
sys 0m1.188s
Avati gave me some hints about improving performance for such an
architecture.
First, we're going to try the FUSE version pimped by the Gluster Team:
http://ftp.zresearch.com/pub/gluster/glusterfs/fuse/fuse-2.7.0-glfs1.tar.gz
It increases the fuse/glusterfs channel size, makes the VM read-ahead
more aggressive, and sets a default blocksize better suited to the
increased channel size.
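For the record, building it should be the usual autotools routine
(assuming the default prefix; adjust to taste):

~# wget http://ftp.zresearch.com/pub/gluster/glusterfs/fuse/fuse-2.7.0-glfs1.tar.gz
~# tar xzf fuse-2.7.0-glfs1.tar.gz
~# cd fuse-2.7.0-glfs1
~# ./configure && make && make install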
In our shell script, we are using bs=1024... that is VERY BAD (Avati
said :>) for a network file system. We're going to try something like
128KB or 1MB instead; see the sketch below.
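For example, a minimal tweak to the script above (the values are just
what we plan to try, nothing measured yet):

# hypothetical values: larger blocks mean far fewer write calls cross
# the fuse/glusterfs channel for the same amount of data
SIZE=128k
COUNT=10
dd if=/boot/vmlinuz-2.6.16.31-17tbs-smp of=bench/$id bs=$SIZE count=$COUNT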
Having a namespace across 2 distant datacenters is not recommended over
a "slow" connection like a 100Mbits one. "Keep the namespace near you,"
they said!
In AFR, we made sure the furthest subvolume is last in the order, so
that it gets the least preference when reading.
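A sketch of that ordering (our brick names; this assumes AFR favours
the first listed subvolume for reads, which is how we understand the
current behaviour):

volume afr0
  type cluster/afr
  # or2 is the near replica and is listed first, so it is preferred
  # for reads; or3 is the distant replica and comes last
  subvolumes or2 or3
end-volume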
Finally, we should try the new feature that allows putting hostnames in
the 'remote-host' option instead of IPs, using round-robin DNS.
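If we understand the feature right, the client spec would look
something like this (the hostname is hypothetical, a round-robin A
record resolving to or2 and or3):

volume remote1
  type protocol/client
  option transport-type tcp/client
  # a DNS name instead of a fixed IP address
  option remote-host gluster-pool.tbs-internet.com
  option remote-subvolume brick
end-volume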
Trying FUSE 2.7.0 right now, but Harris said on the chan that it didn't
change a thing in his configuration.
I'll keep you posted, but for now, do you have any advice?
Regards,
Sebastien.