[Gluster-users] slow writes using replicate with 2.0rc1

Ray Speth speth at MIT.EDU
Tue Feb 3 23:28:33 UTC 2009

I am wondering if anyone knows how to achieve good performance when 
writing to a volume using cluster/replicate. I am getting write speeds of 
about 4 MB/s over Gigabit Ethernet when transferring (single, large) 
files to such a volume. When I remove replication, I get transfer rates 
using "cp" of 21 MB/s. I can get much higher rates using "dd" with large 
block sizes (at least 128KB) -- 34 MB/s for the replicated volume and 
117 MB/s without replication. Read speeds are over 100 MB/s in all 
cases. The transfer rates for cp are comparable to those using dd with a 
block size between 4KB and 8KB.
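
For anyone who wants to reproduce the block-size dependence without dd,
the following is a minimal sketch in Python. It is not the tool used for
the numbers above (those came from cp and dd), and the function names
write_throughput and sweep are made up for illustration. It writes a
file with one write() call per block, syncs it, and reports the rate in
MB/s (10^6 bytes per second).

```python
import os
import time


def write_throughput(path, block_size, total_bytes):
    """Write total_bytes to path in block_size chunks and return MB/s.

    Illustrative helper only; the name and signature are assumptions.
    """
    block = b"\0" * block_size
    written = 0
    start = time.perf_counter()
    # buffering=0 gives a raw file object, so each write() below is
    # issued at the requested block size (similar to dd bs=...).
    with open(path, "wb", buffering=0) as f:
        while written < total_bytes:
            chunk = block[: total_bytes - written]
            written += f.write(chunk)  # raw write returns the byte count
        os.fsync(f.fileno())  # include the time to flush data out
    elapsed = max(time.perf_counter() - start, 1e-9)
    return written / elapsed / 1e6


def sweep(path, block_sizes, total_bytes):
    """Measure write throughput for each block size; return {size: MB/s}."""
    results = {}
    for size in block_sizes:
        results[size] = write_throughput(path, size, total_bytes)
        os.remove(path)  # start each run from a fresh file
    return results
```

A call such as sweep("/mnt/glusterfs/test.bin", [4096, 8192, 131072],
256 * 1024 * 1024) would cover the block sizes discussed above; the
mount point in that path is hypothetical.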

I am using Gluster 2.0.0rc1 on Ubuntu 8.10 (Intrepid) systems, running 
the stock 2.6.27 kernel. I have tried a range of things with the various 
performance translators, but have not seen much change in performance. I 
tried applying the FUSE patch, but it appears that the patch for FUSE 
2.7.3 modifies sections of the kernel module that have changed in the 
2.6.27 kernel.

Does anyone know how I can improve the write performance in this type of 
setup? Even the 21 MB/s that I get without replicate would be 
acceptable, although it's not really making use of all the available 
bandwidth.

My server and client configuration files are below. These include 
cluster/distribute as well, but I don't think that has any effect when 
copying a single file.


*** glusterfs-server.vol ***
volume brick0
   type storage/posix
   option directory /scratch/gluster-brick
end-volume

volume brick1
   type features/locks
   subvolumes brick0
end-volume

volume brick
   type performance/io-threads
   option thread-count 8
   subvolumes brick1
end-volume

volume server
   type protocol/server
   option transport-type tcp
   option auth.addr.brick.allow *
   subvolumes brick
end-volume

*** glusterfs-client.vol ***

volume node01
   type protocol/client
   option transport-type tcp
   option remote-host node01
   option remote-subvolume brick
end-volume

<snip: etc. for node02...node08>

volume rep1
   type cluster/replicate
   subvolumes node01 node02
end-volume

<snip: etc. for rep2, rep3, rep4 on the remaining nodes>

volume distribute
   type cluster/distribute
   subvolumes rep1 rep2 rep3 rep4
end-volume

volume writebehind
   type performance/write-behind
   option block-size 1MB
   option cache-size 4MB
   option flush-behind on
   subvolumes distribute
end-volume

