[Gluster-devel] AFR filesystem inconsistency?
mogulguy at yahoo.com
Sun Feb 17 18:40:10 UTC 2008
--- Krishna Srinivas <krishna at zresearch.com> wrote:
> Since selfheal is done only on demand this issue is
> seen. However you can get around this problem if
> after bringing up the downed server you do
> a "find . > /dev/null" from the root directory
> (which would call lookup() on every directory,
> lookup() code has the code to fix the issue you are
Thanks, that confirms what I have seen. The find is a
more complete solution than just ls, but it is a
client-side solution that requires the client to
know that a server has gone down and come back up. How
would a client know this? I guess a server could be
scripted to mount itself as a client when it comes up
and automatically run a find on its client view to
sync up. This would reduce the amount of time that
two servers were out of sync, but it still would not
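
The scripted find described above could be sketched roughly as
follows. This is only an illustration, not part of GlusterFS: the
function name touch_all and the mount path in the usage note are
made up, and it assumes that listing and stat-ing every entry through
the client mount is enough to trigger lookup(), as the quoted message
suggests for "find . > /dev/null".

```python
import os


def touch_all(mount_root):
    """Walk mount_root and lstat every entry, roughly like `find . > /dev/null`.

    Returns the number of entries that were stat-ed successfully.
    Assumption: each directory listing and lstat() reaches the client-side
    lookup() path that performs the on-demand self-heal.
    """
    visited = 0
    for dirpath, dirnames, filenames in os.walk(mount_root):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))
                visited += 1
            except OSError:
                # Entry vanished or is unreadable; keep walking.
                pass
    return visited
```

A server's startup script could call touch_all("/mnt/glusterfs")
(hypothetical mount point) once its own client mount is up.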
Inconsistencies can still occur if server A goes
down, server B gets written to, and then B goes down.
When A comes up it will not have B's latest changes if
B is still down. Is preventing this in the works? If
so, I am curious as to what type of mechanism will be
used? Is there a term for this feature? It does not
seem to be implied by the term 'self-healing', which
seems slated for 1.4. If this would be a more advanced
feature, what would it be called and when is it
planned?
In the meantime, would it be possible to script
something that would ensure this? Would there be a
hook somewhere to find out when data hasn't been
written to all the nodes because one of them is down?
If so it seems like it would be possible to script
things so that such a node (one without the latest
data) would not join a cluster unless other nodes
which have the latest data are alive also.
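
To make the idea concrete, here is a minimal sketch of such a gate.
Everything in it is hypothetical: I am not aware of an existing hook
for this, so record_peers_up, may_join and the state file are invented
names. The assumption is that each node persists the set of peers it
believed were alive, updated whenever membership changes. A node that
restarts then refuses to serve until every peer that was still up when
it went down is reachable again, since any of those peers may have
taken writes it missed.

```python
import json
import os


def record_peers_up(state_path, peers_up):
    """Persist the set of peers this node currently believes are alive.

    Hypothetical hook: meant to be called on every membership change.
    The file is written atomically so a crash leaves the old or new state.
    """
    tmp_path = state_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(sorted(peers_up), f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, state_path)


def may_join(state_path, reachable_peers):
    """Return True if this node may start serving.

    It may serve only if every peer that was up when it last recorded
    state is reachable now. An empty recorded set means this node was the
    last one running, so it holds the latest data.
    """
    try:
        with open(state_path) as f:
            peers_up_at_shutdown = set(json.load(f))
    except FileNotFoundError:
        # Assumption: a node with no history is new and allowed to join.
        return True
    return peers_up_at_shutdown <= set(reachable_peers)
```

In the A/B case above, A went down while B was up, so A's record
names B and A waits for B; B went down last with an empty record, so
B may come up alone. This is a conservative rule, not a proof of
consistency for larger clusters.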