[Gluster-devel] recovery

Krishna Srinivas krishna at zresearch.com
Tue Mar 6 19:45:29 UTC 2007


Hi Mic,

> If you are using a failover pair, is it possible to just copy all files
> from its mirror and bring a unit back online? I realize any file
> changes will result in errors but is this reasonable?
> Is there an ETA on the resync tool?

You can use rsync for now. We will definitely provide a tool for
this job; its progress can be tracked on the roadmap wiki page.
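
As a rough illustration of the rsync approach, here is a minimal Python
sketch that only builds the command line for pulling the mirror's export
directory onto the recovered unit. The host name "server2", the path
"/export/brick" and the helper name build_resync_command are hypothetical
placeholders, not anything GlusterFS defines. It defaults to rsync's
--dry-run so nothing is copied until you have reviewed the output.

```python
import shlex


def build_resync_command(mirror_host, mirror_dir, local_dir,
                         delete=False, dry_run=True):
    """Build an rsync command that pulls the mirror's export into local_dir.

    All names and paths passed in are illustrative assumptions.
    """
    cmd = ["rsync", "-aH"]  # -a: archive mode, -H: preserve hard links
    if delete:
        # Remove local files that no longer exist on the mirror.
        cmd.append("--delete")
    if dry_run:
        # Report what would be transferred without changing anything.
        cmd.append("--dry-run")
    # A trailing slash on the source copies the directory's contents
    # rather than the directory itself.
    cmd.append(f"{mirror_host}:{mirror_dir.rstrip('/')}/")
    cmd.append(f"{local_dir.rstrip('/')}/")
    return cmd


if __name__ == "__main__":
    # Hypothetical host and export path; adjust to your own setup.
    print(shlex.join(build_resync_command("server2", "/export/brick",
                                          "/export/brick")))
```

Run the printed command on the recovered unit while it is still out of
service; files changed on the mirror during the copy may need a second pass.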

>
> Another failover-related question we had was : are there any plans to
> add a parity unit to the Stripe Translator (a RAID 3 scheme)? It seemed
> the next logical progression.

We do not have any plans as of now.

>
> Thanks for all the great work!
> -Mic
>
>
> Krishna Srinivas wrote:
>> Hi Christopher,
>>
>> On 3/6/07, Christopher Hawkins <chawkins at veracitynetworks.com> wrote:
>>> The last fellow to post mentioned recovery... I have a question also:
>>> If I had several storage servers and a number of clients accessing
>>> them, and I were to lose a storage server, how best to bring it back
>>> online? I would be using AFR to keep multiple copies of all files, so
>>> I know the cluster will not lose data. But when the node goes down,
>>> does the AFR translator figure out by itself that instead of the 3x
>>> copies I specified, there are now only 2x because I lost a storage
>>> node? Or does it only evaluate that at file creation time?
>>
>> AFR is nothing but an implementation of the open, read, write, getattr,
>> etc. calls. It calls these functions on its children; if a child is
>> down, the function (from protocol/client) returns ENOTCONN to AFR,
>> which ignores it. So AFR does not care whether a child is down or up;
>> it is up to the child translator to pass these calls on to the servers
>> if they are up.
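
The behaviour described above can be sketched in a few lines of Python.
This is a toy model only, not GlusterFS code: the Child and Replicate
classes are hypothetical stand-ins for a protocol/client child and for
AFR, and raising an error when every child is down is an assumption made
for the sketch.

```python
import errno


class Child:
    """Hypothetical stand-in for a protocol/client child translator."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.files = {}

    def write(self, path, data):
        if not self.up:
            # A disconnected child reports ENOTCONN to its parent.
            raise OSError(errno.ENOTCONN, "server not connected")
        self.files[path] = data
        return len(data)


class Replicate:
    """Toy model of AFR: forward each call to every child."""

    def __init__(self, children):
        self.children = children

    def write(self, path, data):
        results = []
        for child in self.children:
            try:
                results.append(child.write(path, data))
            except OSError as exc:
                if exc.errno != errno.ENOTCONN:
                    raise
                # ENOTCONN is ignored: the down child simply misses the write.
        if not results:
            # Assumption for this sketch: fail if no child was reachable.
            raise OSError(errno.ENOTCONN, "no child reachable")
        return results[0]


if __name__ == "__main__":
    a, b, c = Child("a"), Child("b", up=False), Child("c")
    afr = Replicate([a, b, c])
    afr.write("/f", b"hello")
    print(sorted(ch.name for ch in (a, b, c) if "/f" in ch.files))
```

In this model the child that was down never receives the file, which is
why a returning server has to be brought up to date by other means.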
>>
>>> And when I bring the storage node back, say it takes me two days to
>>> fix it, I assume I should probably wipe the drives so as not to
>>> introduce old copies of files that are now out of date (or does AFR
>>> update them)? And the ALU scheduler will start using the blank space
>>> more heavily for new writes, because it is preferred as "less used"
>>> and the storage use will eventually even out again?
>>
>> As of now we do not have any tool to bring the new machine up to date
>> with the other AFR servers. It is on our task list.
>>
>>>
>>> Thanks for any answers!
>>> Chris
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Gluster-devel mailing list
>>> Gluster-devel at nongnu.org
>>> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>>>
>>
>
>
>
