[Gluster-devel] Large cluster data pool from 150 non-redundant disks?

Mon Mar 15 14:50:06 UTC 2010

Hi,

After having had a look into GlusterFS quite a while ago, I'm now planning 
to use 2.0.9 to implement a large data pool on our cluster.

I've got 150 nodes, each with a spare disk.
With GlusterFS' self-healing features, I think triple redundancy would
be more than enough, giving me a "RAID-0" over 50 "RAID-1"s consisting
of 3 disks (bricks) each.
Since an n-fold "RAID-1" (replicate) would mean that a machine writing to
the FS would have to split its bandwidth by n, and the total disk capacity
is divided by n as well, 3 appears to be a good compromise.
Read access would be spread over all disks evenly, and I'm not worried about
that right now (it can only get better, compared to 100+ nodes trying to
access files from a single server).
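
The arithmetic behind this layout can be sketched roughly as follows
(Python). The function names are made up for illustration, and the 1 TB
disk size and 1000 Mbit/s client link are assumed placeholder values, not
figures for this cluster. It also assumes the writing client sends every
replica copy itself, as described above.

```python
def plan_layout(num_disks, replicas):
    """Group disk indices into consecutive replica sets ("RAID-1"s)."""
    if replicas <= 0 or num_disks % replicas != 0:
        raise ValueError("num_disks must be a positive multiple of replicas")
    return [tuple(range(start, start + replicas))
            for start in range(0, num_disks, replicas)]


def summarize(num_disks, replicas, disk_tb=1.0, link_mbit=1000.0):
    """Capacity and write-bandwidth estimate for distribute-over-replicate.

    disk_tb and link_mbit are assumed placeholder values.
    """
    sets = plan_layout(num_disks, replicas)
    return {
        # number of "RAID-1" groups that the "RAID-0" layer spans
        "replica_sets": len(sets),
        # total raw capacity is divided by the replica count
        "usable_tb": num_disks * disk_tb / replicas,
        # the writer's link is shared by all copies of each write
        "write_mbit_per_copy": link_mbit / replicas,
    }


summary = summarize(150, 3)
print(summary["replica_sets"], "replica sets,",
      summary["usable_tb"], "TB usable")
```

Calling summarize() with replicas set to 2 or 5 shows the trade-off: fewer
copies leave more usable capacity and write bandwidth per copy, more copies
leave less of both.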

Is there a catch somewhere that I don't see?

Any suggestions as to which translators to use (besides posix-locks on the server
side)? What's the difference between "afr" and "replicate"?

Regards,
 Steffen

-- 
Steffen Grunewald * MPI Grav.Phys.(AEI) * Am Mühlenberg 1, D-14476 Potsdam
Cluster Admin * http://pandora.aei.mpg.de/merlin/ * http://www.aei.mpg.de/
* e-mail: steffen.grunewald(*)aei.mpg.de * +49-331-567-{fon:7233,fax:7298}
No Word/PPT mails - http://www.gnu.org/philosophy/no-word-attachments.html