[Gluster-devel] Re; Load balancing ...

gordan at bobich.net gordan at bobich.net
Thu May 1 10:04:31 UTC 2008


On Wed, 30 Apr 2008, Martin Fick wrote:

> No, no need to keep the old data around.  We only need
> to remember the start and span of each changed section
> along with the file version of the change!  This is
> much easier/more space efficient than snapshots.  Excuse me
> for being ignorant of the actual sizes of these three
> parameters, but they can't be larger than 8 bytes
> each, can they?  8*3 = 24 bytes.  A 100MB journal
> filesystem could store almost 50 thousand different
> file changes!

OK, you're right. I follow what you mean now. Rather than keeping all
versions, we just keep (version, start, finish) pointers in the log. That
way a client can see what has changed since its version and sync only the
blocks listed; if the log has rolled over, it (r)syncs the whole file.
Sounds like a good idea.

The next question is where to keep the log. 1 log per file? 1 log per
directory? How to store them? Shadow files? A separate shadow volume? A
shadow volume might be a good idea because it keeps the main source
mounted directory exactly the same as a normal directory. A (shadow
volume) log should, ideally, also keep additional sanity check
information such as file metadata (timestamps, size), to cross-check
whether something went weird and the file was changed underneath
GlusterFS. If it has, flush the log and force a full resync of the file.
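
To make the idea concrete, here is a rough sketch of what such a per-file
log could look like. This is not GlusterFS code: the names ChangeLog,
Change, record and ranges_since are made up for illustration, and it
assumes the file version goes up by exactly 1 on every logged write and
that the log is a fixed-size ring which drops its oldest entry on overflow.

```python
import struct
from collections import deque
from dataclasses import dataclass

# On-disk entry layout from the discussion: three 8-byte unsigned fields
# (version, start, finish), i.e. 24 bytes per logged change.
ENTRY = struct.Struct("<QQQ")


@dataclass(frozen=True)
class Change:
    version: int  # file version produced by this write
    start: int    # first byte offset changed
    finish: int   # end offset of the changed section


class ChangeLog:
    """Hypothetical per-file log; assumes versions increase by 1 per write."""

    def __init__(self, capacity, size=0, mtime=0):
        self.entries = deque(maxlen=capacity)  # oldest entry drops on rollover
        self.version = 0
        self.size = size      # sanity-check metadata recorded with the log
        self.mtime = mtime

    def record(self, start, finish, size, mtime):
        """Log one write and remember the file metadata after it."""
        self.version += 1
        self.entries.append(Change(self.version, start, finish))
        self.size, self.mtime = size, mtime

    def ranges_since(self, client_version, size, mtime):
        """Return the (start, finish) ranges a client must sync.

        Returns None when a full resync of the file is needed.
        size and mtime are the values observed on the real file.
        """
        if (size, mtime) != (self.size, self.mtime):
            # File was changed underneath us: flush the log, bump the
            # version so every client is out of date, force a full resync.
            self.entries.clear()
            self.version += 1
            self.size, self.mtime = size, mtime
            return None
        if client_version == self.version:
            return []  # already up to date
        if client_version > self.version:
            return None  # client claims a version we never produced
        if not self.entries or self.entries[0].version > client_version + 1:
            return None  # log rolled over (or was flushed) past this client
        return [(c.start, c.finish)
                for c in self.entries if c.version > client_version]
```

A client at version N calls ranges_since(N, size, mtime) with the size and
mtime currently seen on the real file, and either syncs just the returned
ranges or falls back to a full (r)sync when it gets None. With 24-byte
entries, a 100MB log would hold about 4.3 million of them, so the figure
quoted above is on the conservative side.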

Gordan




