[Gluster-users] I'm new to Gluster, and have some questions
Horacio Sanson
hsanson at gmail.com
Tue Oct 26 08:52:28 UTC 2010
On Friday 22 October 2010 23:37:32 Daniel Mons wrote:
> On Fri, Oct 22, 2010 at 10:55 AM, Horacio Sanson <hsanson at gmail.com> wrote:
> > Distributed volume: Aggregates the storage of several directories
> > (bricks in gluster terms) among several computers. The benefit is that
> > you can grow/shrink the volume as you please. The bad part is that
> > this offers no performance/reliability guarantees as files are stored
> > randomly among the disks in the volume.
> >
> > Replicated volume: Requires a minimum of 2 bricks on separate servers.
> > All files are replicated among the bricks. The number of replicas is
> > configured at volume creation. Has all the benefits of a Distributed
> > volume plus failure resilience.
> >
> > Stripe volume: Requires a minimum of 2 bricks on separate servers. All
> > files are split into stripes and these stripes are distributed among
> > the bricks of the volume. The number of stripes and their size are
> > configured at volume creation. Has all the benefits of Replicated volume
> > plus reliability and can improve read performance for large files as the
> > read is distributed among several machines.
>
> 2 comments:
>
> 1) Stripe by itself offers no redundancy. You mention that it has
> "all the benefits of replication" - it actually doesn't. If you use
> only stripe and lose a brick, your data is corrupt (say you have 4
> nodes and 1 is lost, you only have 3/4 of every file stored, which is
> pretty useless to you). Consider this something akin to RAID0.
>
Please correct me if I am wrong (surely I am), but somewhere in the
GlusterFS (3.0.x) documentation it says that the minimum number of servers
required for stripe storage is four. I assumed the reason for this
requirement was that the stripes are stored using some sort of coding
(network coding?) that allows one stripe to be reconstructed from the
other three if one brick fails.
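
To make concrete the kind of coding I had assumed: with three data stripes
and one XOR parity stripe, any single lost stripe can be rebuilt from the
other three. This is only a sketch of my assumption, not something the
stripe translator does (Dan's comment above says stripe by itself offers no
redundancy). The function names are my own and purely illustrative.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def make_stripes(data, n_data=3):
    """Split data into n_data equal-size stripes plus one XOR parity stripe."""
    size = -(-len(data) // n_data)             # ceiling division
    padded = data.ljust(size * n_data, b"\0")  # zero-pad the last stripe
    stripes = [padded[i * size:(i + 1) * size] for i in range(n_data)]
    return stripes + [xor_blocks(stripes)]

def rebuild(stripes, lost):
    """Recompute stripe number `lost` from all the surviving stripes."""
    return xor_blocks([s for i, s in enumerate(stripes) if i != lost])

# Hypothetical example: 3 data stripes + 1 parity stripe on 4 bricks.
stripes = make_stripes(b"hello gluster stripe")
for lost in range(len(stripes)):
    print(lost, rebuild(stripes, lost) == stripes[lost])
```

The cost of such a scheme would be one extra brick of capacity per group of
three, which is what made me read "minimum of four servers" that way.
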

BTW: the docs for Gluster 3.1 are scarce compared with the 3.0.x version.
There are a lot of translators and optimization options that I don't know
how to configure in 3.1.

> 2) You can, however, mix and match these translators to your
> convenience. I'm designing a site at the moment where pairs of nodes
> are set up in replicate, and then overall all data is striped over
> each replicate pair. This is somewhat like the concept of RAID10.
>
> To answer the original poster's question of "how does the data spread
> itself?", well that's up to you. My design is to have replicate
> pairs, and stripe across many of these. You could instead do the
> reverse, and have striped pairs which all data would replicate over.
> If you think about it, the latter ends up with less usable storage and
> no real speed gain. The former ensures that as new storage bricks are
> added, data is striped across more pairs, and the overall speed
> benefit is greater.
>
> One thing to consider also is that striping means your data is broken
> into chunks and spread around the cluster. Should something go awry
> (either physically or logically), then your data could potentially be
> lost. The "distribute" translator is slightly safer in this regard.
> If worst comes to worst and you suffer either a logical or physical
> error destroying part of your data, it's a simple task to just
> manually mount up the underlying file system and recover at least some
> of your data (as bricks store only whole files).
>
> With that in mind, the "stripe" translator is best suited to sites
> where very large files are accessed frequently by many clients. I'm
> planning it for a site where a few 1TB files need to be read in by 30
> clients quasi-simultaneously. Starting each client off at slightly
> different times (even a few seconds apart) means they should
> theoretically be reading different chunks from different bricks, and
> the overall bandwidth of the cluster will not bottleneck at any one
> point. Compare this to a single NFS server with all 30 clients
> smashing it for the same file, and GlusterFS with stripe is clearly a
> better option.
>
> If your site has many clients accessing relatively small files (even
> up to a few hundred MB each) in an ad-hoc fashion, then "distribute"
> is a much safer bet. You'd most likely end up with as good
> performance as "stripe" site-wide, and have the added benefit of being
> able to manually recover files from a brick should something go wrong.
>
> "Distribute" is certainly my pick for your average business that has
> lots of unstructured data in the form of documents, images and the
> like. Ditto for large file stores for things like web farms and
> whatnot. As above, I'd only consider stripe where VERY large files
> are accessed by many clients at the same time, and speed is of the
> essence.
>
> -Dan
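
To check that I now understand the difference between "distribute" and
"stripe", here is a toy model in Python. It is not how GlusterFS places
data internally: the CRC32 hash of the file name, the round-robin chunking,
the tiny chunk size and all the names below are my own simplifications. It
only shows the consequence Dan describes: losing one brick of a distributed
volume loses some whole files, while losing one brick of a striped volume
damages every file that spans all the bricks.

```python
import zlib

BRICKS = 4
CHUNK = 4  # bytes per stripe chunk; tiny, for illustration only

def distribute(files):
    """Place each whole file on one brick chosen by a hash of its name."""
    bricks = [dict() for _ in range(BRICKS)]
    for name, data in files.items():
        bricks[zlib.crc32(name.encode()) % BRICKS][name] = data
    return bricks

def stripe(files):
    """Cut each file into chunks and deal them round-robin over the bricks."""
    bricks = [dict() for _ in range(BRICKS)]
    for name, data in files.items():
        for n, off in enumerate(range(0, len(data), CHUNK)):
            bricks[n % BRICKS].setdefault(name, []).append(data[off:off + CHUNK])
    return bricks

def survivors(files, bricks, lost):
    """Files that had no data at all on the lost brick, so are still whole."""
    return sorted(name for name in files if name not in bricks[lost])

# Eight hypothetical 16-byte files: each one stripes into 4 chunks.
files = {"file%d.dat" % i: bytes([65 + i]) * 16 for i in range(8)}
d = distribute(files)
s = stripe(files)
for lost in range(BRICKS):
    print("lost brick", lost,
          "| distribute keeps", len(survivors(files, d, lost)),
          "| stripe keeps", len(survivors(files, s, lost)))
```

In this model the striped volume keeps no intact file after any single
brick loss, which matches the RAID0 comparison above.
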
--
regards,
Horacio Sanson