[Gluster-devel] Unfair scheduling in unify/AFR

Mon Nov 19 19:54:38 UTC 2007

Hi,

I use a configuration with 3 servers and one client, with client-side
AFR/unify.

It looks like the unify and AFR translators (with the new load-balancing
code) schedule concurrent threads unfairly.

I tried to copy two files with two concurrent (i.e. parallel) threads,
and one of the threads always gets much more bandwidth than the other.
When the threads start to run, only one of them actually gets served by
the GlusterFS client at a reasonable rate; the other (almost) starves.
Only when the first thread finishes does the other one get served.
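
A test of this kind can be sketched roughly as follows. This is only a
minimal Python sketch, not the exact procedure used above: the function
names, the 1 MiB chunk size and the paths in the usage note are all
hypothetical. Each thread copies one file and records how many bytes it
had copied at which point in time, so the per-thread progress can be
compared afterwards.

```python
import threading
import time

CHUNK = 1 << 20  # 1 MiB per read; an arbitrary choice for this sketch


def timed_copy(src, dst, stats, key, chunk=CHUNK):
    """Copy src to dst and record (elapsed_seconds, bytes_so_far) samples."""
    start = time.monotonic()
    copied = 0
    samples = []
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(chunk)
            if not buf:
                break
            fout.write(buf)
            copied += len(buf)
            samples.append((time.monotonic() - start, copied))
    stats[key] = {
        "bytes": copied,
        "seconds": time.monotonic() - start,
        "samples": samples,
    }


def run_concurrent_copies(jobs):
    """Run one copy thread per (src, dst) pair; return stats keyed by index."""
    stats = {}
    threads = [
        threading.Thread(target=timed_copy, args=(src, dst, stats, i))
        for i, (src, dst) in enumerate(jobs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats


def report(stats):
    """Print total time and average throughput for each thread."""
    for key in sorted(stats):
        s = stats[key]
        mb = s["bytes"] / 1e6
        rate = mb / max(s["seconds"], 1e-9)  # avoid division by zero
        print("thread %d: %.1f MB in %.2f s (%.1f MB/s)"
              % (key, mb, s["seconds"], rate))
```

With the GlusterFS volume mounted on a hypothetical /mnt/glusterfs, one
could call report(run_concurrent_copies([("/mnt/glusterfs/file1",
"/tmp/file1"), ("/mnt/glusterfs/file2", "/tmp/file2")])). If the
scheduling were fair, the two threads should finish at about the same
time for files of equal size; the "samples" lists show whether one
thread stalls while the other makes progress.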

The order of the threads seems constant over consecutive runs.

What is more, when a second thread is started while another one is
already running, the second one can steal bandwidth from the first.

The priority of a thread seems to be determined by the remote server. (I
mean that a thread served by a particular host always gets more
bandwidth than a thread served by another host. This is how a thread
started later can steal bandwidth from one started earlier.)

Doing the same thing with two GlusterFS clients (mounting the same
configuration on two different directories) gives absolutely fair
scheduling.

The trouble is that with this behavior one can't benefit from AFR
load-balancing. We would like to exceed the physical disk speed limit by
spreading the reads over multiple GlusterFS servers, but the reads are
not spread this way; only one server does the work at any given point in
time.

Do you have any idea what could be wrong and how to fix it?

Thanks,
--
Szabolcs