[Gluster-devel] Looking for recommendations, large setup
Steffen Grunewald
steffen.grunewald at aei.mpg.de
Fri May 30 09:47:52 UTC 2008
Hi,
after a long time of hesitation, I have decided to give glusterfs a try.
Perhaps not immediately, but in the next few weeks.
What I have as a starting point is:
- some nodes with a spare disk (single partition, xfs formatted, 750GB)
- Debian Etch
- the ability to build a few packages (lmello seems to be silent)
Now I'm looking for recommendations.
At the beginning, I'd like to put some emphasis on data availability, even
at the expense of disk space, with the option to reduce the redundancy later.
Note that I'm talking about quite a few TB, and that I need to be able to find
out which files have been harmed should there be a major hardware problem.
(There can't be a full backup, but at least I'd be able to fill in missing
pieces; I just have to know which ones they are.)
I've been thinking about the following:
- have sets of n machines, repeated m times
- make "mirrors" from the corresponding machines in each set
that is, AFR over machines 1, (n+1), (2*n+1), ..., ((m-1)*n+1) etc.
- giving me a kind of RAID-1 redundancy
- unify all these AFRs
- resulting in a RAID-10 setup
- instead of (block) striping, I'd favour round-robin scheduling, so
that each whole file is written to the next AFR in turn
- if this "single-file rr" could be limited to a filename pattern, that
would be a nice feature (a rough sketch of what I have in mind follows below)
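Roughly, this is the kind of client-side volume spec I have in mind (just a
sketch in 1.3.x-style syntax for a toy case of n=2, m=2; host names and volume
names are made up, and the mirrored namespace volume ns-afr is only sketched
under question (2) below):

  # node1..node4 stand for four storage nodes, each exporting a posix
  # volume called "brick" via protocol/server on the server side
  volume node1
    type protocol/client
    option transport-type tcp/client
    option remote-host node1
    option remote-subvolume brick
  end-volume

  # node2, node3, node4 defined the same way

  volume afr-a                  # mirror of machine 1 and machine n+1
    type cluster/afr
    subvolumes node1 node3
  end-volume

  volume afr-b                  # mirror of machine 2 and machine n+2
    type cluster/afr
    subvolumes node2 node4
  end-volume

  volume unify0
    type cluster/unify
    option namespace ns-afr     # mirrored namespace, see question (2)
    option scheduler rr         # whole files go to the next AFR in turn
    subvolumes afr-a afr-b
  end-volume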
My questions:
(1) I have installed the "attr" package, and in a writable directory, I can
do the "setfattr" test described in "Best Practices" in the wiki.
May I safely assume that extended attributes won't be an issue then,
and that I don't need a special mount option / mkfs.xfs option?
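For reference, this is roughly the check I ran (attribute name picked
arbitrarily; if I understand correctly, glusterfs itself stores its metadata
in trusted.* attributes, so repeating the test as root with trusted.test may
be the more relevant variant):

  cd /path/to/xfs/export             # some writable directory on the xfs disk
  touch xattr-test
  setfattr -n user.test -v working xattr-test
  getfattr -n user.test xattr-test   # should print user.test="working"
  rm xattr-test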
(2) What about the namespace? The topic shows up on the list from time to time,
and if necessary, I'd like to have it on special servers (with hardware RAID
etc.), but still in an AFR setup.
(2a) How many ns volumes can I AFR without harming performance?
(2b) What are the disk space requirements for a ns volume? Are there rules of
thumb to derive them from file counts etc.?
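To illustrate (2), something like this is what I mean by an AFR'd namespace
(again just a sketch; nsserv1/nsserv2 and the volume names are made up):

  volume ns1
    type protocol/client
    option transport-type tcp/client
    option remote-host nsserv1         # "special" server with hardware RAID
    option remote-subvolume brick-ns
  end-volume

  # ns2 analogous, pointing at nsserv2

  volume ns-afr
    type cluster/afr
    subvolumes ns1 ns2
  end-volume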
(3) How does the setup outlined above scale to large values of n*m?
(I'm thinking along the lines of n=200, m=3; with the option to
drop to m=2, n=300.)
(3a) Are there setups in the wild with more than 100 storage/posix volumes,
and what's your experience with such a large farm?
(4) What about fuse? Will the fuse module that comes with the latest kernel
(2.6.25.4) do for a start?
(4a) Would it be possible to place the patched fuse kernel module under
module-assistant's control (so that I don't have to build a new package
for the package repository each time the kernel gets updated)?
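What I'm hoping for is the usual module-assistant workflow, assuming the
patched fuse tree were available as a fuse-source-style package (purely
hypothetical at this point):

  m-a update
  m-a prepare              # pulls in headers for the running kernel
  m-a auto-install fuse    # rebuilds and installs fuse.ko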
And, somewhat unrelated:
(5) I can imagine converting a couple of (RAID-6) storage servers to glusterfs
as well. These are already "mirrored" (by hand), and it should be
easy to combine them into AFR pairs, then unify the AFRs, and export
the whole volume as read-only (note: the files are owned by a special user).
Are there detailed instructions on how to achieve this without data loss?
Some time ago I found some hints on how to use "cpio" to duplicate
the directory structure among the individual storage volumes; is
this still necessary? (see the sketch below)
(5a) How do I add a ns volume in this case?
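For (5)/(5a), this is the sort of recipe I had found for seeding a second
brick or a namespace volume with the existing directory tree (paths made up;
please tell me if self-heal makes this obsolete):

  cd /data/raid6-export
  # replicate only the directory skeleton onto the (empty) target volume
  find . -type d | cpio -pdum /data/export-ns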
Oh well, that's quite a lot of questions. I'm not in a hurry yet :) so feel
free to answer (part of) them when your time allows.
Thanks in advance,
Steffen
--
Steffen Grunewald * MPI Grav.Phys.(AEI) * Am Mühlenberg 1, D-14476 Potsdam
Cluster Admin * http://pandora.aei.mpg.de/merlin/ * http://www.aei.mpg.de/
* e-mail: steffen.grunewald(*)aei.mpg.de * +49-331-567-{fon:7233,fax:7298}
No Word/PPT mails - http://www.gnu.org/philosophy/no-word-attachments.html