[Gluster-devel] Re: Load balancing ...

Martin Fick mogulguy at yahoo.com
Thu May 1 17:16:20 UTC 2008


--- gordan at bobich.net wrote:
> OK, you're right. I follow what you mean now. Not
> keeping all versions, just keeping the (version,
> start, finish) pointers in the log. That way a
> client can see what happened since its version and
> only sync the blocks listed, and if the log has
> rolled over, then (r)sync the whole file.

Yes
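
To make the rule above concrete, here is a minimal Python sketch of how
a client might decide what to fetch. It is only an illustration: the
function name blocks_to_sync is hypothetical, and it assumes that
versions increase by exactly one per logged write and that the log is
held as a list of (version, start, span) tuples, oldest first.

```python
def blocks_to_sync(log, client_version):
    """Return the (start, span) ranges a client must fetch, or None
    when only a full resync is safe.

    Assumptions (not from any real GlusterFS API): `log` is a list of
    (version, start, span) tuples in ascending version order, and each
    logged write bumps the version by exactly one.
    """
    # An empty log proves nothing about what changed, so be conservative.
    if not log:
        return None
    # If the oldest surviving entry is more than one version ahead of
    # the client, the entries in between have rolled out of the log.
    oldest_version = log[0][0]
    if oldest_version > client_version + 1:
        return None
    # Otherwise every change the client missed is still listed.
    return [(start, span) for version, start, span in log
            if version > client_version]
```

A None result means "(r)sync the whole file"; an empty list means the
client is already up to date.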

> Sounds like a good idea. The next question is where
> to keep the log. 1 log per file? 1 log per directory?
> How to store them? Shadow files? Separate
> shadow volume? A shadow volume might be a good idea
> because it keeps the main source mounted directory
> exactly the same as a normal directory.

I would start as simple as possible and adapt as
necessary if you run into a performance problem.  The
simplest design would probably be a shadow volume with
one log per file in a sparse mirrored directory
structure.  Log records could be 24(?) bytes each,
concatenated one after another, making appending easy
and reliable.  Or at a minor space cost (but with
potentially better portability/extensibility), each log
file could even be a colon-delimited, line-based ASCII
file (please don't anyone suggest an XML file!)

  version1:start1:span1
  version2:start2:span2
  ...
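
As a rough sketch of both encodings in Python: the 24-byte layout below
is an assumption (three little-endian 64-bit unsigned integers for
version, start and span), since the size above is only a guess, and all
of the names are made up for illustration.

```python
import struct

# Assumed layout: version, start, span as little-endian 64-bit unsigned
# integers, which gives 3 * 8 = 24 bytes per record.
RECORD = struct.Struct("<QQQ")


def pack_record(version, start, span):
    """Encode one fixed-width binary record, ready to append to a log."""
    return RECORD.pack(version, start, span)


def unpack_records(data):
    """Decode concatenated binary records into (version, start, span) tuples.

    A trailing partial record (for example from an interrupted append)
    is ignored rather than treated as an error.
    """
    usable = len(data) - len(data) % RECORD.size
    return [RECORD.unpack_from(data, offset)
            for offset in range(0, usable, RECORD.size)]


def format_line(version, start, span):
    """Encode one record in the colon-delimited ASCII form."""
    return f"{version}:{start}:{span}\n"


def parse_line(line):
    """Decode one colon-delimited ASCII line into (version, start, span)."""
    version, start, span = (int(field) for field in line.strip().split(":"))
    return version, start, span
```

Either way a writer only ever appends, and with fixed-width records a
reader can find record N at byte offset N * 24 without scanning.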

Having a separate log file for each real file also
makes it easy to code up some optimizations.  For
example, it would be easy to look up the size of the
log and the size of the real file.  As soon as the log
becomes bigger than the real file it is no longer
worth keeping as is!  It also makes it real easy to
just delete the log if the real file is deleted.
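
A minimal sketch of that size check in Python, assuming each log lives
at a known path in the shadow volume alongside a known path for the
real file (the function names and the path arguments are hypothetical):

```python
import os


def log_worth_keeping(log_path, real_path):
    """Return False once the log is no longer cheaper than the file itself."""
    # If the real file is gone, its log has nothing left to describe.
    if not os.path.exists(real_path):
        return False
    # A log bigger than the file costs more than simply resyncing the file.
    return os.path.getsize(log_path) <= os.path.getsize(real_path)


def prune_log(log_path, real_path):
    """Delete the log if it is not worth keeping; return True if deleted."""
    if log_worth_keeping(log_path, real_path):
        return False
    os.remove(log_path)
    return True
```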

Another nice optimizer could make intelligent
decisions about which log files to delete when the
shadow volume starts to fill up.  By simply examining
the size of each log versus the size of the real file
one can set an upper bound on how much transfer data
the log could be saving (a real estimate would require
adding all the spans together in the log file, taking
into account overlapping sections).  Finally, it would
allow an admin to manually prune from the shadow volume
whichever logs he chooses.  An ASCII file would make it
easy to script various pruners.
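
The "real estimate" could look something like the following Python
sketch, which merges overlapping spans before adding them up.  The
function names are hypothetical, and entries are assumed to be
(version, start, span) tuples with span measured in bytes.

```python
def bytes_covered(entries):
    """Total bytes touched by the log entries, counting overlaps once."""
    # Turn each entry into a half-open [start, end) interval and sort.
    intervals = sorted((start, start + span) for _version, start, span in entries)
    total = 0
    cur_start = cur_end = None
    for start, end in intervals:
        if cur_end is None or start > cur_end:
            # Gap before this interval: close out the previous merged run.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged run.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def estimated_savings(entries, file_size):
    """Bytes a partial sync avoids sending compared with a full resync."""
    return max(0, file_size - bytes_covered(entries))
```

A pruner could then rank logs by estimated_savings and delete the ones
that save the least first.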

It would be nice to design the shadow volume so that
it can be removed from the picture at any time without
corrupting anything.  It would also be nice to ensure
that the journal translator can handle an out of space
condition.  This way the servers are not required to
have the same size journal volume, or even to have one
at all.

> A (shadow volume) log should, ideally, also keep
> additional sanity check information such as file 
> metadata (timestamps, size) for cross-check of 
> whether something went weird and the file was
> changed underneath GlusterFS, and if it has, flush 
> out the log and force a full resync on the file.

Hmm, this seems like an additional layer that might be
nice (and perhaps an XML log would be appropriate
here), but I would put it in a separate inline
translator so that it is not required.  The nice part
is that if the protocol is extended to handle the
journal layer, adding another separate layer like this
would probably be easy!
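
For what it's worth, the cross-check itself could be as small as this
Python sketch.  It assumes the sanity layer records the modification
time and size next to the log; the names are hypothetical and nothing
here is an existing GlusterFS interface.

```python
import os


def snapshot(path):
    """Capture the metadata to cross-check later: (mtime in ns, size)."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def changed_underneath(path, recorded):
    """True if the file no longer matches the recorded metadata.

    A caller would respond by flushing the log and forcing a full resync.
    """
    return snapshot(path) != recorded
```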

Thanks again for your patience, I know it's not easy
listening to back seat designers :)

-Martin


