[Gluster-devel] Feature requests of glusterfs
LI Daobing
lidaobing at gmail.com
Wed Jan 2 09:13:20 UTC 2008
Hello,
First of all, thanks for your great work on GlusterFS.
Here are some feature requests. First, several small ones:
1. Add a `local-volume-name' option to AFR, so that reads are served from
the local volume (if available).
2. Let the nufa scheduler support more than one `local-volume'. Sometimes
more than one child of unify is local. In that case it would be valuable
to set more than one local-volume and to use them randomly, in turn, or
by switching to the second once the first is full.
3. Reduce the effect of network latency in AFR. Currently AFR writes
the data to its children serially, so the speed is heavily affected
by network latency. How about adding a new kind of AFR that combines
AFR and io-threads? In the new xlator each child runs in a separate
thread, so the sends to the children proceed at the same time and the
network latency is paid only once (instead of once per child).
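
To make request 3 concrete, here is a minimal Python sketch of the idea; it is not GlusterFS code. `Child`, `write_serial` and `write_parallel` are hypothetical names, and the network is only simulated with a sleep.

```python
from concurrent.futures import ThreadPoolExecutor
import time


class Child:
    """Hypothetical stand-in for one AFR child (one remote volume)."""

    def __init__(self, name, latency):
        self.name = name
        self.latency = latency  # simulated network delay in seconds
        self.blocks = []        # data this child has received

    def write(self, data):
        time.sleep(self.latency)  # pretend to wait for the network
        self.blocks.append(data)
        return len(data)


def write_serial(children, data):
    # Behaviour described in the request: one child after another,
    # so the delays add up.
    return [child.write(data) for child in children]


def write_parallel(children, data):
    # Proposed behaviour: one thread per child, so the delays overlap
    # and the total wait is roughly that of the slowest child.
    with ThreadPoolExecutor(max_workers=len(children)) as pool:
        return list(pool.map(lambda child: child.write(data), children))
```

With three children of equal latency, `write_serial` waits about three times as long as `write_parallel`, while both leave the same data on every child.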
The last feature request is a little larger:
4. A new AFR model:
Currently, if an AFR has 3 child xlators and each xlator connects to
a distinct machine, the write speed of this AFR is only 33% of the
capacity of the network.
Consider a different model: the AFR sends the data to machine1, and
machine1 forwards the data to machine2 immediately and then writes it
to disk. Machine2 likewise forwards the data to machine3 immediately
and then writes it to disk. Under this model the write speed can be up
to 3 times that of the previous model (if your switch is good enough
and your network supports full duplex).
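
The arithmetic behind the two models can be sketched as follows. This is an idealized estimate only: the function names are hypothetical, the 100 MB/s usable link speed is an assumption, and disk speed and protocol overhead are ignored.

```python
def fanout_write_speed(link_mb_s, replicas):
    # Current model: the client sends one full copy per replica over its
    # single uplink, so the uplink is shared between the copies.
    return link_mb_s / replicas


def chain_write_speed(link_mb_s, replicas):
    # Chained model: the client sends one copy, and each machine forwards
    # it onward while receiving (full duplex), so every link carries the
    # data only once. In this ideal case the replica count has no effect.
    return link_mb_s


link = 100.0  # assumed usable MB/s per link
print(fanout_write_speed(link, 3))  # about a third of the link
print(chain_write_speed(link, 3))   # the full link
```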
In more detail, we need two new kinds of xlators. The first is a
combination of AFR and client-protocol (called *safr*). The second
is similar to `server-protocol' (called *sserver*).
Machine1, machine2 and machine3 are set in the options of safr, and safr
maintains an active-machine list. When safr receives a writev command (or
another command), it picks a machine from the active-machine list (for
example, machine1), then sends the data and a list "[machine2, machine3]"
to machine1. Machine1 forwards the data and the list "[machine3]" to
machine2 immediately, and machine2 forwards the data and an empty list to
machine3. Machine1, machine2 and machine3 each also write the data to
disk while sending it on.
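
The forwarding rule can be sketched in a few lines of Python. `Machine`, `handle_write` and `safr_write` are hypothetical names, the "network" is a plain function call, and the sketch forwards and writes one after the other instead of at the same time.

```python
class Machine:
    """Hypothetical storage node running the proposed sserver."""

    def __init__(self, name):
        self.name = name
        self.disk = []      # blocks "written to disk"
        self.received = []  # forward lists seen, kept for inspection

    def handle_write(self, data, forward_to, cluster):
        self.received.append(list(forward_to))
        if forward_to:
            # Pass the data on first, together with the remaining machines.
            next_name, rest = forward_to[0], forward_to[1:]
            cluster[next_name].handle_write(data, rest, cluster)
        self.disk.append(data)


def safr_write(data, active, cluster):
    # Send to the first active machine; the others become its forward list.
    if not active:
        raise RuntimeError("no active machine")
    head, rest = active[0], active[1:]
    cluster[head].handle_write(data, rest, cluster)


names = ["machine1", "machine2", "machine3"]
cluster = {name: Machine(name) for name in names}
safr_write(b"block", names, cluster)
```

After the call, machine1 has seen the list [machine2, machine3], machine2 has seen [machine3], machine3 has seen an empty list, and all three hold the block.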
If any machine goes down, safr just removes it from the active-machine
list, and adds it back when the machine is up again.
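
A possible shape for the active-machine list, as a rough Python sketch. `ActiveList` and its methods are hypothetical, and the sketch does not cover how a returning machine catches up on writes it missed.

```python
class ActiveList:
    """Hypothetical active-machine list kept by safr."""

    def __init__(self, configured):
        self.configured = list(configured)  # order taken from the options
        self.down = set()

    def mark_down(self, name):
        self.down.add(name)

    def mark_up(self, name):
        self.down.discard(name)

    def active(self):
        # Machines currently eligible to receive or forward writes.
        return [m for m in self.configured if m not in self.down]
```

Keeping the configured order and a separate "down" set means a machine returns to its original position in the chain once it is marked up again.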
This model is a little far from the current framework, but I think it's
a good idea to write at 100 MB/s instead of 30+ MB/s on a gigabit
network.
This model is similar to the one in the Google File System; see
figure 2 in the Google File System paper [1]. I put a copy of
this figure at [2].
[1] http://labs.google.com/papers/gfs-sosp2003.pdf
[2] http://picasaweb.google.com/lidaobing/Public/photo#5150803669289886370
PS: should I copy this feature request to the wiki, or is it OK to only
put it here?
Thanks,
LI Daobing