[Gluster-devel] recovery

Krishna Srinivas krishna at zresearch.com
Tue Mar 6 03:57:14 UTC 2007


Hi Christopher,

On 3/6/07, Christopher Hawkins <chawkins at veracitynetworks.com> wrote:
> The last fellow to post mentioned recovery... I have a question also: If I
> had several storage servers and a number of clients accessing them, and I
> were to lose a storage server, how best to bring it back online? I would be
> using AFR to keep multiple copies of all files, so I know the cluster will
> not lose data. But when the node goes down, does the AFR translator figure
> out by itself that instead of the 3x copies I specified, there are now only
> 2x because I lost a storage node? Or does it only evaluate that at file
> creation time?

AFR is nothing but an implementation of the open, read, write, getattr, etc.
calls. It calls these functions on each of its children. If a child is down,
the function (from protocol/client) returns ENOTCONN to AFR, which AFR ignores.
So AFR does not care whether a child is up or down; it is up to the child
translator to pass these calls on to the servers if they are up.
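
To make the behaviour above concrete, here is a rough sketch in Python. It is
not GlusterFS code: the Child and Replicator classes and their method names
are made up for illustration, and what gets returned when every child is down
is my assumption, not something stated above. It only shows the idea that the
call is sent to every child and an ENOTCONN result is skipped.

```python
import errno


class Child:
    """Hypothetical stand-in for a protocol/client translator."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.files = {}  # path -> data, simulating the remote server's disk

    def write(self, path, data):
        # A child whose server is unreachable reports ENOTCONN.
        if not self.up:
            return -errno.ENOTCONN
        self.files[path] = data
        return len(data)


class Replicator:
    """Hypothetical AFR-like translator: forwards each call to all children."""

    def __init__(self, children):
        self.children = children

    def write(self, path, data):
        results = []
        for child in self.children:
            ret = child.write(path, data)
            if ret == -errno.ENOTCONN:
                # The child is down; ignore it and carry on with the rest.
                continue
            results.append(ret)
        # Assumption: report ENOTCONN only if no child accepted the write.
        return results[0] if results else -errno.ENOTCONN


children = [Child("s1"), Child("s2", up=False), Child("s3")]
afr = Replicator(children)
ret = afr.write("/a.txt", b"hello")
```

In this sketch the write succeeds on s1 and s3, while s2 never sees the file.
Nothing in the replicator keeps track of how many copies were actually made.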

> And when I bring the storage node back, say it takes me two
> days to fix it, I assume I should probably wipe the drives so as not to
> introduce old copies of files that are now out of date (or does AFR update
> them)? And the ALU scheduler will start using the blank space more heavily
> for new writes, because it is preferred as "less used" and the storage use
> will eventually even out again?

As of now we do not have a tool to bring the returning machine up to date with
the other AFR servers. It is on our task list.

>
> Thanks for any answers!
> Chris
>
>
>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at nongnu.org
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>




