[Gluster-devel] AFR write completion? AFR read redundancy?
avati at zresearch.com
Tue Mar 4 04:43:35 UTC 2008
> > > Will self healing prevent an inconsistent cluster
> > > from happening? I.E. Two node cluster, A+B.
> > >
> > > 1) Node A goes down
> > > 2) Write occurs on Node B
> > > 3) Node B goes down (cluster is down)
> > > 4) Node A comes up -> cluster is inconsistent since B
> > > is not yet available. Cluster should still be "down".
> > This is not assured to work. The intersection of the
> > two subsets of subvolumes before and after a group
> > (subset) of nodes are added or removed should not be
> > empty.
> Hmm, that is what I feared! Are there any plans to
> ensure that this condition is met? Without this, how
> do people currently trust AFR? Do they simply assume
> that their cluster never cold boots?
> Since it does not sound like self healing will ensure
> cluster consistency, is there another planned
> task/feature that will? If not, is it because it is
> viewed as impossible/difficult? It seems like in the
> extreme case it would be at least simple enough to
> track and prevent, wouldn't it?
> Once a cluster is up and running, any remaining
> running nodes should be consistent? So it seems like
> the tricky part is dealing with cold boots (when no
> running consistent cluster exists.)
Cold boot is not a problem. Let me explain with an example:
1. Node A and Node B are UP
2. Node B goes down
3. Node A gets changes
4. Node A goes down
5a. Node A and B come back together - no problem
5b. Node A alone comes back - no problem
5c. Node B alone comes back - potential problem if the same files or
directories changed in step 3 are accessed.
5d. Node B alone comes back and before data is accessed Node A comes back
too - no problem.
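As a rough illustration of the cases above, here is a minimal sketch in Python. It is not GlusterFS code: the per-file version counter, the node names and the function stale_read_possible are all assumptions made up for this example, and AFR's real bookkeeping may differ.

```python
# Hypothetical model, not GlusterFS code: each replica keeps a version
# counter per file, bumped on every write that replica receives.

def stale_read_possible(versions, up_nodes, path):
    """Return True if none of the nodes in up_nodes holds the newest copy of path."""
    newest = max(node_files.get(path, 0) for node_files in versions.values())
    best_up = max(versions[node].get(path, 0) for node in up_nodes)
    return best_up < newest

# Steps 1-2: A and B are in sync, then B goes down.
versions = {
    "A": {"file.txt": 1, "other.txt": 1},
    "B": {"file.txt": 1, "other.txt": 1},
}
# Step 3: only A receives the change to file.txt. Step 4: A goes down.
versions["A"]["file.txt"] += 1

# Which nodes are up at the moment the data is accessed.
# 5d looks like 5a here, because A is back before any access happens.
cases = {"5a": {"A", "B"}, "5b": {"A"}, "5c": {"B"}, "5d": {"A", "B"}}
for name, up in cases.items():
    print(name, stale_read_possible(versions, up, "file.txt"))
# prints: 5a False, 5b False, 5c True, 5d False (one per line)

# In case 5c, a file that was not changed in step 3 is still safe to read.
print(stale_read_possible(versions, {"B"}, "other.txt"))  # prints False
```

In this toy model only 5c can return stale data, and only for the file changed in step 3, which matches the list above.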
Supporting 5c requires quite a bit of new framework code, which is currently
not among our highest priorities. Are the above restrictions unacceptable in your