[Gluster-devel] Re: How to repair a 1TB disk in 30 mins

Wed May 9 09:29:34 UTC 2012

Thank you very much!

I have some more questions.
1. What is the capacity of the largest cluster currently online? How many nodes does it have, and what is it used for?
2. When we execute 'ls' in a directory, it is very slow if the cluster has too many bricks and nodes. Can this be made faster?

-----Original Message-----
From: Anand Babu Periasamy [mailto:abperiasamy at gmail.com]
Sent: May 9, 2012 15:55
To: renqiang
Cc: gluster-devel at nongnu.org
Subject: Re: [Gluster-devel] How to repair a 1TB disk in 30 mins

On Tue, May 8, 2012 at 9:46 PM, 任强 <renqiang at 360buy.com> wrote:
> Dear All:
>
>   I have a question. Suppose I have a large cluster, maybe more than 10PB
> of data. If each file has 3 copies and each disk has 1TB capacity, we need
> about 30,000 disks. All disks are very cheap and easily damaged. We must
> repair a 1TB disk in 30 mins. As far as I know, in the gluster architecture,
> all data on the damaged disk will be repaired onto the new disk that
> replaces it. Because of the disk's write speed, repairing a 1TB disk in
> gluster takes more than 5 hours. Can we do it in 30
> mins?

5 hours is based on a 1TB SATA disk copying at ~50MB/s across small and
large files + folders. This means you literally attached the disk to
the system and manually transferred the data. I can't think of any
faster way to transfer data on 1TB 7200RPM SATA/SAS disks
without bending space-time ;). Larger disks and RAID arrays only
make this worse. This is exactly why we implemented passive self-heal
in the first place. GlusterFS heals files on demand (as they are
accessed), so applications see the least downtime or disruption. There
is plenty of time to heal the cold data in the background. All we should
care about is minimal downtime.
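
To make the arithmetic behind these figures concrete, here is a rough
back-of-the-envelope sketch in Python. It assumes decimal units
(1TB = 10^12 bytes, 1MB = 10^6 bytes) and a sustained copy rate, which
real disks with many small files will not hold; the function names are
illustrative only.

```python
# Back-of-the-envelope numbers for the figures quoted in this thread.
# Assumption: decimal units and a perfectly sustained transfer rate.
MB = 10**6
TB = 10**12
PB = 10**15


def disks_needed(data_bytes, replicas, disk_bytes):
    """Number of disks needed to hold `replicas` copies of the data."""
    # Ceiling division without floating point.
    return -(-data_bytes * replicas // disk_bytes)


def copy_hours(disk_bytes, rate_bytes_per_s):
    """Hours needed to copy a full disk at a sustained rate."""
    return disk_bytes / rate_bytes_per_s / 3600


def required_rate_mb_s(disk_bytes, seconds):
    """Sustained rate (MB/s) needed to copy a full disk in `seconds`."""
    return disk_bytes / seconds / MB


if __name__ == "__main__":
    print(disks_needed(10 * PB, 3, TB))               # 10PB, 3 copies, 1TB disks
    print(round(copy_hours(TB, 50 * MB), 2))          # 1TB at ~50MB/s
    print(round(required_rate_mb_s(TB, 30 * 60), 1))  # 1TB in 30 minutes
```

Under these assumptions a full 1TB copy at ~50MB/s takes about 5.6
hours, and finishing in 30 minutes would need roughly 556MB/s of
sustained writes to a single disk.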

Self-heal in 3.3 has some major improvements. It got significantly
faster, because healing is performed entirely on the server side
(server to server). It can perform granular healing on large files
(previously, checksum operations used to pause or time out the VMs).
It also adds active healing: Replicate now remembers pending files and
heals them when the failed node comes back (previously you had to
perform a namespace-wide recursive directory listing). Most importantly,
self-healing is no longer a black box. heal-info can show pending and
currently-healing files.
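
As a small illustration of checking heal status from a script, the
sketch below wraps the 3.3 CLI form `gluster volume heal <VOLNAME> info`
in Python. The volume name "myvol" and the helper names are
placeholders, and the output format is not parsed here since it varies
between releases.

```python
import subprocess


def heal_info_command(volume):
    """Build the argv for `gluster volume heal <VOLNAME> info`."""
    return ["gluster", "volume", "heal", volume, "info"]


def heal_info(volume):
    """Run the heal-info command and return its raw text output.

    Assumes the gluster CLI is installed and the caller has the
    privileges to query the volume; raises CalledProcessError otherwise.
    """
    result = subprocess.run(
        heal_info_command(volume),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # "myvol" is a hypothetical volume name; only the command is printed here.
    print(" ".join(heal_info_command("myvol")))
```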

--
Anand Babu Periasamy
Blog [http://www.unlocksmith.org]

Imagination is more important than knowledge --Albert Einstein