AFR load-balancing (was:Re: [Gluster-devel] Full bore.)

Krishna Srinivas krishna at
Fri Nov 16 18:38:31 UTC 2007

On Nov 16, 2007 11:48 PM, Székelyi Szabolcs <cc at> wrote:
> Krishna Srinivas wrote:
> > Load balancing is checked in with the latest patch, 562, but I have not
> > tested it for performance. You can set "option read-schedule on/off" in the
> > afr volume; by default it is on. If it is off, reads are done from the first
> > child (the way it was done before).
> IMHO the default should be off, so after upgrading, you get the same
> behavior with the same config file.
> > But we were thinking "option read-node" would be better than
> > "option read-schedule on/off"
> >
> > "option read-node <subvol>" will read from that particular subvol
> > (this will help when that subvol is local storage)
> > "option read-node *" will load balance from all children.
> > So option is either * or one of the subvols.
> > default is "option read-node *"
> > What do you think about it?
> What about explicitly listing the subvolumes to perform load-balancing
> between? For example, "option read-nodes child1 child3" could cause AFR
> to load-balance reads between child1 and child3, while writing to all
> its children (child[1-3], say). The default option could be to read from
> the first child. "option read-nodes *" could still mean to use all
> subvolumes.

Patch 563 now uses "option read-node" instead of "option read-schedule".
We thought about "option read-node child1,child3", but it probably won't
see much use, since users will generally want to read either from local
storage or from all the children. There might still be a case where one of
the subvols is a low-end machine that we don't want to schedule reads on.
We are still deciding whether this particular feature is worth implementing.
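
To make the option semantics concrete, here is a rough Python sketch of how
the "option read-node" value could map to the set of children eligible for
reads. It is only an illustration under assumptions, not the AFR code (which
is written in C): the function name read_candidates and the child names are
hypothetical.

```python
def read_candidates(read_node, children):
    """Return the children allowed to serve reads for an option value.

    read_node: "*" (load-balance across all children) or one subvolume name.
    children:  the AFR subvolume names, in config-file order.
    Hypothetical helper for illustration; not part of GlusterFS.
    """
    if read_node == "*":
        # "*" means every child may serve reads.
        return list(children)
    if read_node in children:
        # A single named subvol (e.g. local storage) serves all reads.
        return [read_node]
    raise ValueError("read-node must be '*' or one of: " + ", ".join(children))
```

For example, read_candidates("child2", ["child1", "child2", "child3"]) gives
just ["child2"], while "*" gives all three children.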

> What about defining schedulers to decide which child to read from,
> similar to unify file placement schedulers? Or even re-using the same
> schedulers here? ;-)

Actually, once a file has been read from a node, reads by any other
application that opens it should go to the same node, to take advantage
of server-side io-caching and server-side kernel caching. As of now
we just hash the inode number to decide on the read node, a plain and
simple mechanism.
(We can think about schedulers similar to the ones in unify. They can't
be re-used, but conceptually they are similar: unify would schedule based
on disk space and afr would schedule based on CPU utilization.
That would be for future versions, though. Suggestions welcome :) )
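
As a rough illustration of the inode-hash idea, a minimal Python sketch
could look like the following. The real code is in C and the exact hash is
not spelled out here, so the plain modulo and the name pick_read_child are
assumptions made only for the example.

```python
def pick_read_child(inode_number, candidates):
    """Pick one child for reads from an inode number.

    The same inode always maps to the same child, so repeated opens of a
    file hit the same server-side caches. The modulo "hash" is an assumed
    stand-in, not necessarily what AFR does.
    """
    if not candidates:
        raise ValueError("no children available for reads")
    return candidates[inode_number % len(candidates)]
```

Because the choice depends only on the inode number and the child list, two
applications opening the same file end up reading from the same node, while
different files spread across the children.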

It is better to enable scheduling by default.


> Cheers, and thanks for the feature, we were excited to have it.
> --
> cc
