[Gluster-devel] Problem with scheduler load-balancing

Kevan Benson kbenson at a-1networks.com
Fri Nov 9 17:37:07 UTC 2007

drizzo201-cfs at yahoo.com wrote:
> Single client, single threaded, striped writes run at ~105MB/s.
> Single client, single threaded, non-striped writes run at ~85MB/s.
> When I run three "unstriped" client dd's concurrently and all IO goes
> to the same server, total throughput drops to ~50MB/s with each client
> getting a third of the total (lots of disk thrashing). The dd test is
> "dd if=/dev/zero of=/mnt/cluster/testX bs=64k count=64k". I just add
> the .img to get the file striped.

That sounds about right to me for a single disk (which is your 
bottleneck): 85MB/s for sequential writes to a SATA drive, and 50MB/s 
for non-sequential writes (the head has to seek between three different 
locations on the platter over and over as blocks for each write are 
committed).
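For reference, the concurrent test described above can be sketched as a 
small script. The path and count are shrunk here so it runs anywhere; 
the original test wrote to /mnt/cluster with count=64k (4 GiB per file):

```shell
#!/bin/sh
# Sketch of the three-way concurrent dd test from this thread.
# TARGET would be the glusterfs mount (/mnt/cluster in the original);
# defaulting to /tmp with a small count so this is runnable anywhere.
TARGET=${TARGET:-/tmp}
COUNT=${COUNT:-16}   # original test used count=64k

# Launch three writers in parallel, then wait for all of them.
for i in 1 2 3; do
  dd if=/dev/zero of="$TARGET/test$i" bs=64k count="$COUNT" &
done
wait
ls -l "$TARGET"/test1 "$TARGET"/test2 "$TARGET"/test3
```

On a single local disk the three writers interleave, which is exactly 
the seek-heavy pattern that drops throughput.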

I'm not entirely familiar with the ALU scheduler, but it sounds like the 
statistics are only being updated every 10 seconds in your config 
(alu.stat-refresh.interval 10sec).  Thus, whichever subvolume the 
scheduler currently considers the best match will remain the best match 
for 10 seconds, and all writes issued in that interval will go to the 
same unify member.
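For context, that option would sit in the unify volume definition in 
your volfile; a rough sketch (volume and subvolume names here are 
hypothetical, and the ALU scheduler takes further options not shown):

```
# Hypothetical unify volume -- names are illustrative only.
volume unify0
  type cluster/unify
  subvolumes brick1 brick2 brick3
  option namespace brick-ns
  option scheduler alu
  # Stats refresh only every 10 seconds, so the current "best"
  # subvolume keeps winning for the whole interval:
  option alu.stat-refresh.interval 10sec
end-volume
```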

For testing, "alu.stat-refresh.num-file-create 1" might give you what 
you are looking for (from what I understand, it should then behave 
almost like round-robin), although it could generate significant 
overhead if used in production.
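Concretely, that would mean adding something like the following line to 
the unify volume definition (a sketch, not tested here):

```
# Refresh the scheduler statistics after every file creation, so each
# new file can land on a different subvolume.  Testing only -- this
# likely adds too much overhead for production use.
option alu.stat-refresh.num-file-create 1
```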


-Kevan Benson
-A-1 Networks
