[Gluster-devel] Unfair scheduling in unify/AFR
Székelyi Szabolcs
cc at avaxio.hu
Tue Nov 20 12:50:35 UTC 2007
Krishna Srinivas wrote:
> In case you have not put io-threads, can you test it with that and see
> how it behaves?
I'm using io-threads on the client side. At the moment only one client
accesses a given storage brick at a time, so I thought io-threads
wouldn't help on the servers. But on the client side a read issued by
one thread shouldn't block the whole client (since there can be more
threads), so I loaded io-threads on the client.
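If you meant server-side io-threads, I guess it would be stacked under
the exported "data" brick roughly like this (just a sketch -- the export
directory is made up and the protocol/server volume is omitted, so this
is not our actual server spec):

volume data-posix
  type storage/posix
  option directory /export/data    # assumed export path
end-volume

volume data
  type performance/io-threads
  option thread-count 8
  subvolumes data-posix
end-volume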
> When you copy, are the source and destination files both on the glusterfs
> mount point?
No; we are testing pure read performance, and only GlusterFS
performance. So I copy sparse files over GlusterFS into /dev/null with
`dd bs=1M`.
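Concretely, the two-thread test looks roughly like this (a sketch; the
mount point and file names are made up):

# read two sparse files over the GlusterFS mount in parallel;
# each dd is one of the concurrent readers
dd if=/mnt/glusterfs/file1 of=/dev/null bs=1M &
dd if=/mnt/glusterfs/file2 of=/dev/null bs=1M &
wait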
Currently, a single thread running alone reads at about 540-580 MB/s.
What I would like to see is two threads reading two files from two
servers at a rate of at least 540 MB/s *each*.
> Can you mail the client spec file?
Sure, here it is.
### DATA
volume data-it27
  type protocol/client
  option transport-type ib-verbs/client
  option remote-host 10.40.40.1
  option remote-subvolume data
end-volume

volume data-it28
  type protocol/client
  option transport-type ib-verbs/client
  option remote-host 10.40.40.2
  option remote-subvolume data
end-volume

volume data-it29
  type protocol/client
  option transport-type ib-verbs/client
  option remote-host 10.40.40.3
  option remote-subvolume data
end-volume

### NAMESPACE
volume data-ns-it27
  type protocol/client
  option transport-type ib-verbs/client
  option remote-host 10.40.40.1
  option remote-subvolume data-ns
end-volume

volume data-ns-it28
  type protocol/client
  option transport-type ib-verbs/client
  option remote-host 10.40.40.2
  option remote-subvolume data-ns
end-volume

volume data-ns-it29
  type protocol/client
  option transport-type ib-verbs/client
  option remote-host 10.40.40.3
  option remote-subvolume data-ns
end-volume

### AFR
volume data-afr
  type cluster/afr
  subvolumes data-it29 data-it27 data-it28
end-volume

volume data-ns-afr
  type cluster/afr
  subvolumes data-ns-it27 data-ns-it28 data-ns-it29
end-volume

### UNIFY
volume data-unify
  type cluster/unify
  subvolumes data-afr
  option namespace data-ns-afr
  option scheduler rr
end-volume

volume ds
  type performance/io-threads
  option thread-count 8
  option cache-size 64MB
  subvolumes data-unify
end-volume

volume ds-ra
  type performance/read-ahead
  subvolumes ds
  option page-size 518kB
  option page-count 48
end-volume
Thanks,
--
Szabolcs
> On Nov 20, 2007 1:24 AM, Székelyi Szabolcs <cc at avaxio.hu> wrote:
>> Hi,
>>
>> I use a configuration with 3 servers and one client, with client-side
>> AFR/unify.
>>
>> It looks like the unify and AFR translators (with the new load-balancing
>> code) do unfair scheduling among concurrent threads.
>>
>> I tried to copy two files with two concurrent (i.e. parallel) threads,
>> and one of the threads always gets much more bandwidth than the other.
>> When the threads start to run, only one of them gets served by the
>> GlusterFS client at a reasonable rate; the other (almost) starves. Only
>> when the first thread finishes does the other one get served.
>>
>> The order of the threads seems constant over consecutive runs.
>>
>> What's more, if a thread is started while another one is already
>> running, the second one can steal bandwidth from the first.
>>
>> Which thread is preferred is determined by the remote server. (I mean
>> that a thread served by a particular host always gets more bandwidth
>> than one served by another host; this is how a thread started later can
>> steal bandwidth from the earlier one.)
>>
>> Doing the same thing with two GlusterFS clients (mounting the same
>> configuration on two different directories) gives absolutely fair
>> scheduling.
>>
>> The trouble is that this way one can't benefit from AFR load
>> balancing. We would like to exceed the speed of a single physical disk
>> by spreading reads over multiple GlusterFS servers, but the reads don't
>> get spread this way; only one server does the work at any given point
>> in time.
>>
>> Do you have any idea what could be wrong and how to fix it?
>>
>> Thanks,
>> --
>> Szabolcs