[Gluster-users] Design/HW for cost-efficient NL archive >= 0.5PB?

Justin Dossey jbd at podomatic.com
Tue Dec 31 21:27:16 UTC 2013


Yes, RAID-6 is better than RAID-5 in most cases.  I agonized over the
decision to deploy RAID-5 for my Gluster cluster, and the reason I went
with it is that the number of drives per brick was (IMO) acceptably low.
I use RAID-6 for my 16-drive arrays, which means I have to lose 3 disks
out of the 16 to lose my data.  With 2x 8-drive RAID-5 arrays I can
likewise survive two failed disks as long as they land in different
arrays, and even if two failures do hit the same array, I only lose 50%
of the data on that server, and since all these bricks are
distribute-replicate anyway, I wouldn't actually lose any data at all.
That consideration, paired with the fact that I keep spares on hand and
replace failed drives within a day or two, means that I'm okay with
running 2x RAID-5 instead of 1x RAID-6.  (2x RAID-6 would put me below my
storage target, forcing additional hardware purchases.)
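
For concreteness, the capacity side of that trade-off works out roughly
like this (a quick sketch assuming 16x 2 TB drives per server; the drive
size is an assumption, adjust for your own hardware):

    # Usable capacity of the three layouts discussed above,
    # assuming sixteen 2 TB drives per server (illustrative only).
    drive_tb = 2.0
    layouts = {
        "1x RAID-6, 16 drives":     (16 - 2) * drive_tb,
        "2x RAID-5, 8 drives each": 2 * (8 - 1) * drive_tb,
        "2x RAID-6, 8 drives each": 2 * (8 - 2) * drive_tb,
    }
    for name, usable in layouts.items():
        print(f"{name}: {usable:.0f} TB usable")
    # 1x RAID-6, 16 drives:     28 TB usable
    # 2x RAID-5, 8 drives each: 28 TB usable
    # 2x RAID-6, 8 drives each: 24 TB usable  (below the storage target)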

I suppose the short answer is "evaluate your storage needs carefully."


On Tue, Dec 31, 2013 at 11:19 AM, James <purpleidea at gmail.com> wrote:

> On Tue, Dec 31, 2013 at 11:33 AM, Justin Dossey <jbd at podomatic.com> wrote:
> >
> > Yes, I'd recommend sticking with RAID in addition to GlusterFS.  The
> > cluster I'm mid-build on (it's a live migration) is 18x RAID-5 bricks
> > on 9 servers.  Each RAID-5 brick is 8x 2 TB drives, so about 13T usable.
> > It's better to deal with a RAID rebuild when a disk fails than to have
> > to pull and replace the brick, and I believe Red Hat's official
> > recommendation is still to minimize the number of bricks per server
> > (which makes me a rebel for having two, I suppose).  9 (slow-ish, SATA
> > RAID) servers easily saturate 1Gbit on a busy day.
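
As an aside, the "about 13T" per brick falls out of TB-vs-TiB accounting;
a rough sketch (the replica count used for the cluster-wide figure is an
assumption, it is not stated in the thread):

    # Why an 8x 2 TB RAID-5 brick shows up as "about 13T":
    # 7 data drives x 2 TB = 14 TB (decimal), reported by the OS in TiB.
    brick_raw_tb = (8 - 1) * 2.0                  # 14.0 TB
    brick_tib = brick_raw_tb * 1e12 / 2**40       # ~12.7 TiB, i.e. "13T"

    # Cluster-wide usable space across 18 such bricks, assuming replica 2:
    usable_tib = 18 * brick_tib / 2               # ~115 TiB
    print(f"per brick: {brick_tib:.1f} TiB, cluster: {usable_tib:.0f} TiB usable")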
>
>
> I think Red Hat also recommends RAID-6 instead of RAID-5.  In any case,
> I sure do, at least.
>
> James
>
>
>
> On Mon, Dec 30, 2013 at 5:54 AM, bernhard glomm
> <bernhard.glomm at ecologic.eu> wrote:
> >
> > Some years ago I had a similar task.  What I did:
> > - We had disk arrays with 24 slots, with up to 4 optional JBODs (each
> > 24 slots) stacked on top, and dual 4 Gb fibre (LWL) controllers (costs ;-)
> > - I created RAID-6 sets with no more than 7 disks each
> > - as far as I remember, one hot spare per 4 RAID sets
> > - I connected as many of these RAID bricks together with striped
> > glusterfs as needed
> > - as for replication, I was planning an offsite duplicate of this
> > architecture, and because losing data was REALLY not an option, also
> > writing everything off to LTFS tapes at a second offsite location.
> > As the original LTFS library edition was far too expensive for us, I
> > found an alternative solution that does the same thing at a much more
> > reasonable price.  LTFS is still a big thing in digital archiving.
> > Drop me a note if you would like more details on that.
> >
> > - This way I could fsck all the (not too big) RAIDs in parallel (which
> > sped things up)
> > - proper robustness against disk failure
> > - space that could grow without limit (add more and bigger disks) and
> > keep up with access speed (add more servers) at a pretty foreseeable
> > price
> > - LTFS in the vault provided the finishing touch: data stays accessible
> > even if two out of three sites are down, at a reasonable price (for
> > instance, no heat problem at the tape location)
> > Nowadays I would go for the same approach, except with ZFS raidz3
> > bricks instead of (small) hardware RAID bricks (at least do a thorough
> > test of it first).
> > For simplicity and robustness I wouldn't want to end up with several
> > hundred glusterfs bricks, each on one individual disk, but would rather
> > leave disk-failure protection to hardware RAID or ZFS and use Gluster
> > to connect these bricks into the filesystem size I need (and to mirror
> > the whole thing to a second site if needed)
> > hth
> > Bernhard
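
(A rough illustration of the raidz3-vs-small-RAID-6 suggestion above; the
drive size and vdev width are assumed values, not figures from the thread:)

    # Usable capacity and fault tolerance per brick for the two layouts
    # mentioned above, assuming 4 TB drives (illustrative only).
    drive_tb = 4.0
    raid6_7disk = (7 - 2) * drive_tb     # 20 TB usable, survives any 2 failed disks
    raidz3_11disk = (11 - 3) * drive_tb  # 32 TB usable, survives any 3 failed disks
    print(f"7-disk RAID-6: {raid6_7disk:.0f} TB, 11-disk raidz3: {raidz3_11disk:.0f} TB")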
> >
> >
> >
> > Bernhard Glomm
> > IT Administration
> >
> > Phone: +49 (30) 86880 134
> > Fax: +49 (30) 86880 100
> > Skype: bernhard.glomm.ecologic
> > Ecologic Institut gemeinnützige GmbH | Pfalzburger Str. 43/44 | 10717
> Berlin | Germany
> > GF: R. Andreas Kraemer | AG: Charlottenburg HRB 57947 | USt/VAT-IdNr.:
> DE811963464
> > Ecologic™ is a Trade Mark (TM) of Ecologic Institut gemeinnützige GmbH
> > ________________________________
> >
> > On Dec 25, 2013, at 8:47 PM, Fredrik Häll <hall.fredrik at gmail.com>
> wrote:
> >
> > I am new to Gluster, but so far it seems very attractive for my needs.
> > I am trying to assess its suitability for a cost-efficient storage
> > problem I am tackling.  Hopefully someone can help me figure out how
> > best to solve it.
> >
> > Capacity:
> > Start with around 0.5PB usable
> >
> > Redundancy:
> > 2 replicas on non-RAID disks is not sufficient.  Either 3 replicas on
> > non-RAID disks, or some combination of 2 replicas and RAID?
> >
> > File types:
> > Large files, around 400-1500MB each.
> >
> > Usage pattern:
> > Archive (not sure whether this qualifies as nearline or not) with files
> > being added at around 200-300 GB/day (300-400 files/day).  Very few
> > reads, on the order of 10 file accesses per day.  Concurrent reads are
> > highly unlikely.
> >
> > The two main factors for me are cost and redundancy.  Losing data is
> > not an option, since this is an archive solution.  Cost per usable TB
> > is the other key factor, as we see growth estimates of 100-500 TB/year.
> >
> > Looking just at $/TB, a RAID-based approach sounds more efficient to
> > me.  But RAID rebuild times with large arrays of large-capacity drives
> > sound really scary.  Perhaps something smart can be done, since we will
> > still have a replica left during the rebuild?
> >
> > So, any suggestions for possible, cost-efficient solutions?
> >
> > - Any experience on dense servers, what is advisable? 24/36/50/60 slots?
> > - SAS expanders/storage pods?
> > - RAID vs non-RAID?
> > - Number of replicas etc?
> >
> > Best,
> >
> > Fredrik
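
To put rough numbers on the 0.5 PB target and the replica-vs-RAID question
above, here is an illustrative sketch; the drive size and RAID-6 width are
assumptions, not figures from the thread:

    # Raw drives needed for ~0.5 PB usable under two layouts.
    # Drive size and RAID-6 width below are assumptions for illustration.
    usable_target_tb = 500.0
    drive_tb = 4.0

    # Option A: 3 replicas on plain (non-RAID) disks
    drives_a = usable_target_tb * 3 / drive_tb               # 375 drives

    # Option B: 2 replicas on 12-disk RAID-6 bricks (10 data + 2 parity)
    raid6_efficiency = 10 / 12
    drives_b = usable_target_tb * 2 / raid6_efficiency / drive_tb  # 300 drives

    # Growth headroom at ~300 GB/day of new data (before replication)
    days_per_100tb = 100e3 / 300                             # ~333 days per 100 TB

    print(f"replica 3, no RAID : {drives_a:.0f} drives")
    print(f"replica 2, RAID-6  : {drives_b:.0f} drives")
    print(f"~{days_per_100tb:.0f} days to grow by 100 TB at 300 GB/day")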
>
>
>
>
> > --
> > Justin Dossey
> > CTO, PodOmatic
> >
> >
>



-- 
Justin Dossey
CTO, PodOmatic