[Gluster-devel] Some FAQs ...

Anand Avati avati at zresearch.com
Wed Apr 25 15:17:18 UTC 2007


Hi Steffen,
  answers inline.
  
On Wed, Apr 25, 2007 at 04:15:00PM +0200, Steffen Grunewald wrote:
> Hi,
> 
> I'm in the process of evaluating parallel file systems for a cluster made
> of 15 storage servers and about 600 compute nodes, and came across GlusterFS.
> Having read most of the documentation, I've got some more FAQs I couldn't
> find in the Wiki. I'd appreciate any answer...
> 
> - The two example configs are a bit confusing. In particular, I suppose I
> 	don't have to assign different names to all 15 volumes? Different
> 	ports are only used to address a certain sub-server?

Are you referring to the different protocol/client volume names in the
client spec file? If so, yes: each volume for a server should have a
separate name. There can be only one volume with a given name in a
graph (read: spec file).
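
For example (a rough sketch only; the hostnames and the exported volume
name "brick" are placeholders, not taken from your setup), the client
spec would define one protocol/client volume per server, each with its
own name:

  # one protocol/client volume per storage server, each uniquely named
  volume client1
    type protocol/client
    option transport-type tcp/client
    option remote-host server1          # first storage server
    option remote-subvolume brick       # volume name exported by that server
  end-volume

  volume client2
    type protocol/client
    option transport-type tcp/client
    option remote-host server2
    option remote-subvolume brick
  end-volume

...and so on for the remaining servers.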

> - This would mean I could use the same glusterfs-server.vol for all 
> 	storage bricks?


Yes, the same glusterfs-server.vol can be used for all the servers.
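
Something along these lines (again only a sketch, with a placeholder
export directory) works unchanged on every server, since nothing in it
is host-specific:

  volume brick
    type storage/posix
    option directory /export/glusterfs  # local directory to export (placeholder)
  end-volume

  volume server
    type protocol/server
    option transport-type tcp/server
    subvolumes brick
    option auth.ip.brick.allow *        # open to all clients; restrict as needed
  end-volume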


> - The "all-in-one" configuration suggests that servers can be clients at the
> 	same time? (meaning, there's no real need to separately build
> 	server and client)


The same machine can run the glusterfs server and the client, so there
is no real need to build them separately.


> - The instructions to add a new brick (reproduce the directory tree with 
> 	cpio) suggest that it would be possible to form a GluFS from 
> 	already existing separate file servers, each holding part of the
> 	"greater truth", by building a unified directory tree (only
> 	partly populated) on each of them, then unifying them using
> 	GluFS. Am I right?


You are right!
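
You would point one protocol/client volume at each existing file server
(as in the client spec sketched above) and stack a unify volume on top,
roughly:

  volume unify0
    type cluster/unify
    subvolumes client1 client2          # one subvolume per existing file server
    option scheduler rr                 # round-robin placement for new files
  end-volume

The scheduler only decides where newly created files go; existing files
stay on whichever server already holds them.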


> - Would it still be possible to access the underlying filesystems, using
> 	NFS with read-only export?

Yes, that will be possible.

> - What would happen if files are added to the underlying filesystem on one
> 	of the bricks? Since there's no synchronization mechanism this should
> 	look the same as if the file entered through GluFS?


It would look as if the file entered through glusterfs. But the unify
translator expects a file to reside on only one server, so it would
work only if you are careful to avoid race conditions (where another
glusterfs client creates the same file at the same time on another
server). It may not work once the name-space cache translator (coming
in the next release) is in use.


> - What's the recommended way to backup such a file system? Snapshots?

The snapshot translator is on the roadmap as well. For now, the
recommended way to back up is to take the filesystem offline (unmount
all clients) and rsync all the servers.

> - Is there a Debian/GNU version already available, or someone working on it?

I recently saw a post about someone working on it -

http://people.debian.org/~terpstra/message/20070418.192436.787e9c06.en.html


> - Are there plans to implement "relaxed" RAID-1 by writing identical copies
> 	of the same file (the same way AFR does) to different servers?

I do not quite understand what difference you are asking for compared
to the current AFR. Do you mean relaxed as in making the copy after the
file is closed? Please explain in more detail.
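
For reference, the current AFR is configured along these lines (a
sketch, assuming the cluster/afr volume syntax; the "replicate"
pattern option may differ between releases):

  volume afr0
    type cluster/afr
    subvolumes client1 client2          # each file is written to both subvolumes
    option replicate *:2                # keep 2 copies of every file
  end-volume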

> _ I couldn't find any indication of metadata being kept somewhere - how do
> 	I find out which files were affected if a brick fails and cannot
> 	be repaired? (How does AFR handle such situations?) I suppose there
> 	are no tools to re-establish redundancy when slipping in a fresh
> 	brick - what's the roadmap for this feature?

In the current release there is no way to know. But the distributed
name-space cache translator will not only keep the namespace alive, it
will also provide a way to find out which entries are missing from the
namespace (and hence which files went down with the dead server).

The AFR in the 1.4 release will have a way to replay, to a server that
went down, the changes made while it was down.

> - In several places, the FAQ refers to "the next release" for certain 
> 	features - it would make sense to put the release number there.

Right, ok :)

> - The benchmark GluFS vs. Lustre looks almost too good - what was the
> 	underlying filesystem on the bricks? Don't the results reflect
> 	the big (6GB) buffer cache instead of the real FS performance?

The underlying filesystem was ext3. The 'creamy' performance of any
filesystem is, for all practical purposes, its performance with buffer
caching; for unbuffered I/O the disk speed clearly becomes the
bottleneck. Since the hardware was the same for both glusterfs and
lustre, both had an equal buffer cache and disk to exploit.

> More to come...

Awaiting :)


avati

> Cheers,
>  Steffen
> 
> -- 
> Steffen Grunewald * MPI Grav.Phys.(AEI) * Am Mühlenberg 1, D-14476 Potsdam
> Cluster Admin * http://pandora.aei.mpg.de/merlin/ * http://www.aei.mpg.de/
> * e-mail: steffen.grunewald(*)aei.mpg.de * +49-331-567-{fon:7233,fax:7298}
> No Word/PPT mails - http://www.gnu.org/philosophy/no-word-attachments.html
> 
> 
> 
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at nongnu.org
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
> 

-- 
ultimate_answer_t
deep_thought (void)
{ 
  sleep (years2secs (7500000)); 
  return 42;
}




