[Gluster-devel] Re: Load balancing ...

Mon Apr 28 18:37:01 UTC 2008

--- Krishna Srinivas <krishna at zresearch.com> wrote:

> Gareth:
>     Moving to the other end of the scale, AFR can't
> cope with large files either .. handling of sparse 
> files doesn't work properly and self-heal has no 
> concept of repairing part of a file .. so sticking a
> 20Gb file on a GlusterFS is just asking for trouble
> as every time you restart a gluster server (or every
> time one crashes) it'll crucify your network.
> 
> Krishna:
> We have plans to provide rsync-type sync to AFR sync
> in future. Giving it as option as Gordon mentioned.

May I suggest an alternate approach?  The rsync model
seems like a nice one when you have no idea what the
changes are, but with the glusterfs AFR it is possible
to keep track of the changes.  What about adding a
journaling volume option to the AFR translator?  A
separate journal volume would be associated with each
subvolume.  Any time a peer subvolume cannot be
written to, the journal subvolume would record the
differences, making it potentially easier to play back
those differences to the peer when it returns.
Something like this:

                       +-----+
                       | AFR |
                       +-----+
        ----------------/   \--------------
       /               /     \             \
+-----------+  +-------+   +-------+  +-----------+
| Journal A |  | Sub A |   | Sub B |  | Journal B |
+-----------+  +-------+   +-------+  +-----------+

So if changes cannot be written to Sub B, they would
be recorded in Journal A.  When B comes back up and
AFR notices a mismatch between a file on Sub A and Sub
B, it would normally query Sub A for the file
contents.  Instead, it could query Journal A first to
see if the changes to the file are stored there.  If
so, Journal A could reply with just the changes
instead of the whole file, and AFR could then apply
the changes to Sub B.
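
As a rough illustration of the replay idea above, here is a minimal
Python sketch.  It is not GlusterFS code: the Journal class, the
heal() function and the use of in-memory dicts of bytearrays to stand
in for Sub A and Sub B are all assumptions made for the example.

```python
from collections import defaultdict


class Journal:
    """Hypothetical journal: remembers writes the peer subvolume missed."""

    def __init__(self):
        self._changes = defaultdict(list)  # path -> [(offset, data), ...]

    def record(self, path, offset, data):
        # Called while the peer is unreachable.
        self._changes[path].append((offset, bytes(data)))

    def changes_for(self, path):
        # Returns the logged writes in order, or None if nothing is logged.
        return self._changes.get(path)

    def clear(self, path):
        self._changes.pop(path, None)


def apply_write(buf, offset, data):
    """Write data into a bytearray at offset, zero-filling any gap."""
    end = offset + len(data)
    if len(buf) < end:
        buf.extend(bytes(end - len(buf)))
    buf[offset:end] = data


def heal(path, sub_a, sub_b, journal_a):
    """Bring sub_b's copy of path in line with sub_a's copy."""
    changes = journal_a.changes_for(path)
    if changes is None:
        # Nothing logged: copy the whole file, as happens today.
        sub_b[path] = bytearray(sub_a[path])
        return "full"
    # Replay only the writes that Sub B missed.
    target = sub_b.setdefault(path, bytearray())
    for offset, data in changes:
        apply_write(target, offset, data)
    journal_a.clear(path)
    return "delta"
```

The point of the sketch is only that heal() touches the bytes that
changed when the journal has them, and copies the whole file otherwise.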

The journal volume would not actually be required, and
it would be space limited: it would simply drop changes
that it can no longer keep track of.  If the journal
does not have the change logged, everything would
proceed as it does today: the subvolume would be
queried for the whole file.  This would be a little
like the DRBD model, but more in line with the gluster
way of doing things.  It would be better than what
DRBD does since it would be more granular.  When space
for changes runs out, whole files might have to be
synced, but not necessarily the whole filesystem!
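
One possible shape for such a space-limited journal, again as a hedged
Python sketch rather than anything GlusterFS provides.  BoundedJournal,
max_bytes and the oldest-file-first eviction policy are assumptions.
The one detail that matters is that a file's log is dropped as a whole
and stays dropped, so a surviving log is never partial.

```python
from collections import OrderedDict


class BoundedJournal:
    """Hypothetical journal that drops whole-file logs when it is full."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self._changes = OrderedDict()  # path -> [(offset, data)], oldest first
        self._dropped = set()          # paths that now need a whole-file sync

    def record(self, path, offset, data):
        if path in self._dropped:
            # The log for this file is already incomplete; keep it dropped.
            return
        self._changes.setdefault(path, []).append((offset, bytes(data)))
        self.used += len(data)
        # Out of space: evict the oldest file's entire log, never part of it.
        while self.used > self.max_bytes and self._changes:
            old_path, old_log = self._changes.popitem(last=False)
            self.used -= sum(len(d) for _, d in old_log)
            self._dropped.add(old_path)

    def changes_for(self, path):
        # None means "not logged": the caller falls back to a whole-file sync.
        return self._changes.get(path)
```

With this policy, running out of space costs a whole-file sync only for
the files whose logs were evicted, not for every file on the volume.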

I realize that this is a major enhancement and would be
a lot of work, but then again, so probably would the
rsync model implementation, would it not?  The
advantage here is that consistency would be assured.
The tradeoff between the journal and the rsync model
is one of disk space for the journal versus CPU time
for the rsync model.  Certainly both could be
implemented: the journal could be queried first, and
if that fails, the rsync method could be used!
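
A small Python sketch of that "journal first, checksums second" order.
Everything named here is hypothetical, and block_sync() is a
simplified fixed-block checksum comparison, not the real rsync
rolling-checksum algorithm.  It only shows where the CPU cost of the
second path comes from: both copies have to be hashed in full.

```python
import hashlib

BLOCK = 4  # hypothetical block size, kept tiny for illustration


def block_sums(data, block=BLOCK):
    """Checksum every fixed-size block of a bytes-like object."""
    return [hashlib.sha256(data[i:i + block]).digest()
            for i in range(0, len(data), block)]


def block_sync(src, dst, block=BLOCK):
    """Copy only the blocks whose checksums differ; return blocks copied."""
    src_sums = block_sums(src, block)
    dst_sums = block_sums(dst, block)
    sent = 0
    for n, checksum in enumerate(src_sums):
        if n >= len(dst_sums) or dst_sums[n] != checksum:
            dst[n * block:(n + 1) * block] = src[n * block:(n + 1) * block]
            sent += 1
    del dst[len(src):]  # truncate if the stale copy was longer
    return sent


def heal(src, dst, logged_writes):
    """Use logged writes when available, else fall back to checksums."""
    if logged_writes is not None:
        for offset, data in logged_writes:
            end = offset + len(data)
            if len(dst) < end:
                dst.extend(bytes(end - len(dst)))
            dst[offset:end] = data
        return "journal"
    block_sync(src, dst)
    return "checksum"
```

Here src and dst are bytearrays standing in for the two copies of a
file, and logged_writes is either a list of (offset, data) pairs or
None when the journal has nothing for that file.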

Thoughts?

-Martin
