[Gluster-users] Questions from an Ignoramus
Doug Schouten
dschoute at sfu.ca
Thu Feb 10 01:36:48 UTC 2011
Hi all,
I am a grad student setting up a new cluster in our research group. We
already have five nodes, each with 5 x 1 TB disks in a RAID-5 array.
Currently we just export these arrays over NFS (/cluster/data0[1-5]).
This is already somewhat bothersome, because one needs to remember which
of the five NFS mounts contains a dataset of interest.
Now we are getting four new nodes with faster disks (a 12 x 600 GB array
each, at 15K RPM), and would like to merge at least these into a single
global filesystem, possibly adding the existing disks as well.
GlusterFS looks very promising, especially because it doesn't need to
take over the filesystem, and the configuration looks relatively simple
(compared to GPFS or Lustre).
However, I am having trouble tracking down a detailed explanation of how
it works, so that I can see where the weak points are. The installation
guide on the Wiki was a good starting point for a very basic
understanding, but I have not found a detailed explanation of the
configuration options, etc.
Does some sort of manual exist?
Also, how robust is GlusterFS? We probably want to stripe the data to
improve performance, but if a server dies, does the file catalogue go
with it, resulting in total data loss? Or is the metadata replicated
somehow, so that one can at least recover the partial files?
Any help, including pointers to existing configurations that I can
learn from, would be much appreciated.
Kind regards,
Doug Schouten
P.S. To describe our needs more fully: our datasets consist of many
files on the order of 100-200 MB in size. Typically we write files
once (retrieved from a central collaboration server) and read them many
times as we tune an analysis, so read speed is much more important than
write performance. Redundancy is not a huge concern, since most of our
data is replicated at remote sites anyway, although stability is still a
consideration because re-fetching the data takes O(days). The machines
are connected by dual-bonded 1 Gb Ethernet. Latency is probably not an
issue, since they are all connected to an internal switch in the same rack.
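For a rough sense of scale, the figures above can be put into a short
Python sketch. It is only back-of-the-envelope arithmetic under stated
assumptions: RAID-5 gives up one disk per array to parity, the RAID level
of the new arrays is not yet decided (so only raw capacity is shown), and
1 Gb/s is treated as 125 MB/s with no protocol overhead. Whether a single
read can actually use both bonded links depends on the bonding mode, so
the two-link number is an optimistic bound. The function names are
illustrative only.

```python
# Back-of-the-envelope numbers for the hardware described in this post.
# All figures are estimates under the assumptions stated above, not
# measurements.

def raid5_usable_gb(disks: int, disk_gb: int) -> int:
    """Usable capacity of one RAID-5 array: one disk's worth goes to parity."""
    return (disks - 1) * disk_gb

def transfer_seconds(size_mb: float, links: int,
                     link_mb_per_s: float = 125.0) -> float:
    """Ideal time to move size_mb over `links` 1 Gb/s links (no overhead)."""
    return size_mb / (links * link_mb_per_s)

# Existing nodes: 5 nodes, each 5 x 1 TB in RAID-5.
old_usable_gb = 5 * raid5_usable_gb(5, 1000)

# New nodes: 4 nodes, each 12 x 600 GB; RAID level unknown, so raw only.
new_raw_gb = 4 * 12 * 600

# Reading one 200 MB file over one link vs. both bonded links.
one_link_s = transfer_seconds(200, links=1)
two_links_s = transfer_seconds(200, links=2)

if __name__ == "__main__":
    print(f"existing usable capacity: {old_usable_gb / 1000:.1f} TB")
    print(f"new raw capacity:         {new_raw_gb / 1000:.1f} TB")
    print(f"200 MB file, 1 link:      {one_link_s:.1f} s")
    print(f"200 MB file, 2 links:     {two_links_s:.1f} s")
```

On these assumptions a single client is limited by its network links
rather than by the 15K RPM arrays, which is part of why the layout of
data across servers matters for our read-heavy workload.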