[Gluster-devel] Re; Load balancing ...
gordan at bobich.net
Thu May 1 10:04:31 UTC 2008
On Wed, 30 Apr 2008, Martin Fick wrote:
> No, no need to keep the old data around. We only need
> to remember the start and span of each changed section
> along with the file version of the change! This is
> much easier/space efficient than snapshots. Excuse me
> for being ignorant of the actual sizes of these three
> parameters, but they can't be larger than 8 bytes
> each, can they? 8*3 = 24 bytes. A 100MB journal
> filesystem could store roughly 4 million different
> file changes!
OK, you're right. I follow what you mean now. Rather than keeping all
versions, we just keep the (version,start,finish) pointers in the log. That
way a client can see what happened since its version and sync only the
blocks listed, and if the log has rolled over, then (r)sync the whole file.
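
To make the idea concrete, here is a rough sketch of such a log in Python. It is not GlusterFS code. The names (Change, ChangeLog, ranges_since) are made up for illustration, and it assumes every write bumps the file version by exactly one and that the log is a fixed-size ring that drops its oldest entry when full.

```python
from collections import deque
from typing import List, NamedTuple, Optional, Tuple


class Change(NamedTuple):
    version: int  # file version produced by this write (assumed +1 per write)
    start: int    # first byte offset that changed
    finish: int   # offset one past the last byte that changed


class ChangeLog:
    """Hypothetical bounded log of changed byte ranges for one file."""

    def __init__(self, capacity: int) -> None:
        # deque(maxlen=...) silently drops the oldest entry when full,
        # which models the log "rolling over".
        self.entries = deque(maxlen=capacity)
        self.version = 0

    def record(self, start: int, finish: int) -> int:
        """Log one write and return the new file version."""
        self.version += 1
        self.entries.append(Change(self.version, start, finish))
        return self.version

    def ranges_since(self, client_version: int) -> Optional[List[Tuple[int, int]]]:
        """Return the ranges a client must sync, or None for a full resync."""
        if client_version >= self.version:
            return []  # client is already up to date
        if not self.entries or self.entries[0].version > client_version + 1:
            return None  # log has rolled over past what the client needs
        return [(c.start, c.finish) for c in self.entries
                if c.version > client_version]
```

A client at version N calls ranges_since(N). A list means "sync just these blocks", and None means the log no longer reaches back far enough, so the whole file has to be (r)synced.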
Sounds like a good idea. The next question is where to keep the log. 1 log
per file? 1 log per directory? How to store them? Shadow files? Separate
shadow volume? A shadow volume might be a good idea because it keeps the
main source mounted directory exactly the same as a normal directory. A
(shadow volume) log should, ideally, also keep additional sanity check
information such as file metadata (timestamps, size) for cross-check of
whether something went weird and the file was changed underneath
GlusterFS, and if it has, flush out the log and force a full resync on
the file.
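
The cross-check could look something like the following sketch. Again this is only an illustration under assumptions: the helper names are hypothetical, and it uses size and mtime from os.stat as the recorded metadata, which will not catch a change that preserves both.

```python
import os
from typing import NamedTuple


class Fingerprint(NamedTuple):
    size: int      # file size in bytes when the log entry was written
    mtime_ns: int  # modification time in nanoseconds at that moment


def fingerprint(path: str) -> Fingerprint:
    """Capture the metadata to store alongside the log."""
    st = os.stat(path)
    return Fingerprint(st.st_size, st.st_mtime_ns)


def log_is_trustworthy(path: str, recorded: Fingerprint) -> bool:
    """True if the file still matches what the log last recorded.

    False means the file was changed behind our back, so the caller
    should flush the log and force a full resync of the file.
    """
    return fingerprint(path) == recorded
```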
Gordan