[Gluster-devel] Re: AFR load-balancing

Tue Nov 20 13:52:45 UTC 2007

On Mon, 19 Nov 2007, Krishna Srinivas wrote:

>> or even getting the client to choose another server than
>> the first I would get happy, this way I could scale the throughput
>> when adding more servers. (But I could just let the clients read from
>> themselves in the meantime.)
>
> This can be done with "option read-subvolume" in AFR, right? Am I
> missing something? You can load AFR on the client side and configure
> each client to read from a different subvol.

Yes, with "option read-subvolume a_local_disk_subvolume" this would work
fine. But for clients without local disks that are part of the
AFR translator's subvolumes I would need "option read-subvolume *" and a
scheduler that does not pick the same server for every read of a single
file (for high throughput when many clients read the same file).
Striping will probably work fine here as well, as would some of the
schedulers used with the union translator: round robin, ALU and maybe
even the random scheduler.
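
To illustrate what I mean by a scheduler that does not pick the same
server for every read, here is a minimal sketch in Python. It is not
GlusterFS code; the subvolume names and the function round_robin_reads
are made up for the example, and a real scheduler would of course live
inside the translator.

```python
from collections import Counter
from itertools import cycle


def round_robin_reads(subvolumes, num_reads):
    """Assign each read to the next subvolume in turn (hypothetical scheduler)."""
    picker = cycle(subvolumes)
    return [next(picker) for _ in range(num_reads)]


# Hypothetical subvolume names, not taken from any real volume spec.
subvolumes = ["server1", "server2", "server3", "server4"]

assignments = round_robin_reads(subvolumes, 8)
load = Counter(assignments)

# Every server ends up serving the same number of reads.
print(load)
```

With a fixed read-subvolume all eight reads would land on one server;
with round robin each of the four servers gets two.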

But for another favourite workload of mine (reading the same file from
many servers with random access, where each request is 128-256 kBytes) it
would be excellent to stripe among the servers, as you are planning to
introduce, so that parts of the data fit in the disk cache (or the
io-cache translator) on the servers.
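
A rough sketch of why striping helps the caches, assuming simple
round-robin placement of fixed-size stripe blocks (the 256 kByte stripe
size, the 4 servers and the function name are my assumptions, not how
the stripe translator is necessarily implemented):

```python
def servers_for_request(offset, length, stripe_size, num_servers):
    """Return the sorted server indices touched by a byte range.

    Assumes stripe block b is stored on server (b % num_servers).
    """
    first_block = offset // stripe_size
    last_block = (offset + length - 1) // stripe_size
    return sorted({block % num_servers
                   for block in range(first_block, last_block + 1)})


KB = 1024
STRIPE_SIZE = 256 * KB  # assumed stripe block size
NUM_SERVERS = 4         # assumed number of servers

# An aligned 128 kByte read stays on one server.
aligned = servers_for_request(0, 128 * KB, STRIPE_SIZE, NUM_SERVERS)

# An unaligned 256 kByte read spans two neighbouring servers.
unaligned = servers_for_request(200 * KB, 256 * KB, STRIPE_SIZE, NUM_SERVERS)

print(aligned, unaligned)
```

Under this placement each server only ever sees about a quarter of the
file, so a larger share of the hot data fits in its cache.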

>> Writes are slow (since a client needs to write to all servers), but
>> perhaps it is possible to stack the afrs on the server side and let the
>> servers do the replicating when writing... Hmmm...
>
> Are you using write-behind?

Yes. But it was something else I had in mind. I'll try to explain.

When reading from many nodes, an afr-translator on the client with, say, 4
subvolumes is fine, since the data may be read directly from a single
node (which may stripe across two or more internal drives) and would fill
up the 1 Gbit/s link to the client.

But a single client can only write at about 1000/8/4 = 31 MByte/s, since
all data is sent four times, once to each node.
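
The same arithmetic as a small Python sketch, assuming an ideal
1 Gbit/s link with no protocol overhead (the function name is only for
illustration):

```python
def client_write_rate_mbyte_s(link_mbit_s, replicas):
    """Usable write rate when the client sends every byte to each replica."""
    return link_mbit_s / 8 / replicas


# Client-side AFR with 4 subvolumes: the link is shared by four copies.
four_copies = client_write_rate_mbyte_s(1000, 4)

# For comparison, a client that sends each byte only once.
one_copy = client_write_rate_mbyte_s(1000, 1)

print(four_copies, one_copy)
```

So the client gets 31.25 MByte/s with four copies, against 125 MByte/s
if it only had to send the data once.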

So my thought was whether it is possible to combine the performance of a
server-based AFR translator for writes with that of a client-based AFR
translator for reads. I guess mounting the same files twice, once in a
file system with a server-based AFR translator (for writes) and once with
a client-based AFR translator (for reads), would give good throughput.

Regards,
Jerker Nyberg.