[Gluster-users] Gluster-users Digest, Vol 20, Issue 22

phil cryer phil at cryer.us
Tue Jan 5 16:21:12 UTC 2010

This is *very* helpful, thanks for taking the time, Larry!  Looking
forward to giving feedback once we have the cluster up.


On Thu, Dec 17, 2009 at 11:23 AM, Tejas N. Bhise <tejas at gluster.com> wrote:
> Thanks, Larry, for the comprehensive information.
> Phil, I hope that answers a lot of your questions. Feel free to ask more, we have a great community here.
> Regards,
> Tejas.
> ----- Original Message -----
> From: "Larry Bates" <larry.bates at vitalesafe.com>
> To: gluster-users at gluster.org, phil at cryer.us
> Sent: Thursday, December 17, 2009 9:47:30 PM GMT +05:30 Chennai, Kolkata, Mumbai, New Delhi
> Subject: Re: [Gluster-users] Gluster-users Digest, Vol 20, Issue 22
> Phil,
> I think the real question you need to ask has to do with why we are
> using GlusterFS at all and what happens when something fails.  Normally
> GlusterFS is used to provide scalability, redundancy/recovery, and
> performance.  For many applications performance will be the least of the
> worries so we concentrate on scalability and redundancy/recovery.
> Scalability can be achieved no matter which way you configure your
> servers.  Using the distribute translator (DHT) you can unify all the
> servers into a single virtual storage space.  The problem comes when you
> look at what happens when you have machine/drive failures and need the
> redundancy/recovery capabilities of GlusterFS.  By putting 36Tb of
> storage on a single server and exposing it as a single volume (using
> either hardware or software RAID), you will have to replicate that to a
> replacement server after a failure.  Replicating 36Tb will take a lot of
> time and CPU cycles.  If you keep things simple (JBOD) and use AFR to
> replicate drives between servers and use DHT to unify everything
> together, now you only have to move 1.5Tb/2Tb when a drive fails.  You
> will also note that you get to use 100% of your disk storage this way
> instead of wasting 1 drive per array with RAID5 or two drives with
> RAID6.  Normally with RAID5/6 it is also imperative that you have a hot
> spare per array, which means you waste an additional drive per array.
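A rough sketch of the arithmetic behind the comparison above, using the drive counts from the thread. The 100 MB/s transfer rate is an assumption for illustration only (roughly what one Gigabit Ethernet link might sustain), and the helper names are hypothetical.

```python
# Figures quoted in the thread: each server holds 24 x 1.5 TB drives.
DRIVES_PER_SERVER = 24
DRIVE_TB = 1.5

# ASSUMPTION (not from the thread): about 100 MB/s of sustained
# throughput over one Gigabit Ethernet link while re-replicating.
ASSUMED_MB_PER_S = 100


def raid_usable_tb(drives, drive_tb, parity_drives, hot_spares=1):
    """Capacity left in one array after parity drives and hot spares."""
    return (drives - parity_drives - hot_spares) * drive_tb


def resync_hours(terabytes, mb_per_s=ASSUMED_MB_PER_S):
    """Hours needed to copy `terabytes` at a steady rate (decimal units)."""
    seconds = terabytes * 1_000_000 / mb_per_s
    return seconds / 3600


raw_tb = DRIVES_PER_SERVER * DRIVE_TB  # all 24 drives of one server
raid5_tb = raid_usable_tb(DRIVES_PER_SERVER, DRIVE_TB, parity_drives=1)
raid6_tb = raid_usable_tb(DRIVES_PER_SERVER, DRIVE_TB, parity_drives=2)

print(f"raw capacity per server:      {raw_tb} TB")
print(f"RAID5 + hot spare, one array: {raid5_tb} TB usable")
print(f"RAID6 + hot spare, one array: {raid6_tb} TB usable")
print(f"re-replicate a whole server:  {resync_hours(raw_tb):.1f} h")
print(f"re-replicate a single drive:  {resync_hours(DRIVE_TB):.1f} h")
```

Keep in mind that AFR mirroring stores each file on two drives, so the "100%" refers to no capacity being lost to parity or hot spares on top of the mirror.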
> To make RAID5/6 work with no single point of failure you have to do
> something like RAID50/60 across two controllers which gets expensive and
> much more difficult to manage and to grow.  Implementing GlusterFS using
> more modest hardware makes all those "issues" go away.  Just use
> GlusterFS to provide the RAID-like capabilities (via AFR and DHT).
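As a minimal sketch of what "AFR and DHT" can look like in practice, the script below generates a client-side volfile that mirrors bricks in pairs (cluster/replicate) and then unifies the mirrors (cluster/distribute). The host names, brick names, and the function itself are hypothetical; a real setup would also need matching server-side volfiles and most likely further tuning.

```python
def client_volfile(bricks, replica=2):
    """Build client volfile text for mirrored, then distributed, bricks.

    bricks: list of (host, exported_subvolume) tuples, ordered so that
    the members of each mirror group sit on different servers.
    """
    if len(bricks) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")

    lines = []
    clients = []
    # One protocol/client volume per remote brick.
    for host, subvol in bricks:
        name = f"{host}-{subvol}"
        clients.append(name)
        lines += [
            f"volume {name}",
            "  type protocol/client",
            "  option transport-type tcp",
            f"  option remote-host {host}",
            f"  option remote-subvolume {subvol}",
            "end-volume",
            "",
        ]

    # AFR: group consecutive bricks into mirrors.
    mirrors = []
    for i in range(0, len(clients), replica):
        name = f"mirror-{i // replica}"
        mirrors.append(name)
        lines += [
            f"volume {name}",
            "  type cluster/replicate",
            f"  subvolumes {' '.join(clients[i:i + replica])}",
            "end-volume",
            "",
        ]

    # DHT: present all mirrors as one namespace.
    lines += [
        "volume distribute",
        "  type cluster/distribute",
        f"  subvolumes {' '.join(mirrors)}",
        "end-volume",
    ]
    return "\n".join(lines)


# Hypothetical example: two servers, two JBOD drives each.
example_bricks = [
    ("server1", "brick1"), ("server2", "brick1"),
    ("server1", "brick2"), ("server2", "brick2"),
]
volfile_text = client_volfile(example_bricks)
print(volfile_text)
```

With this layout, losing one drive only requires re-syncing its mirror partner, which is the property the paragraph above relies on.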
> Personally I doubt that I would set up my storage the way you describe.
> I probably would (and have) set it up with more, smaller servers.
> Something like three times as many 2U servers with 8x2Tb drives each (or
> even 6 times as many 1U servers with 4x2Tb drives each) and forget the
> expensive RAID SATA controllers; they aren't necessary and are just a
> single point of failure that you can eliminate.  In addition you will
> enjoy significant performance improvements because you have:
> 1) Many parallel paths to storage (36x1U or 18x2U vs 6x5U servers).
> Gigabit Ethernet is fast, but still will limit bandwidth to a single
> machine.
> 2) Write performance on RAID5/6 is never going to be as fast as JBOD.
> 3) You should have much more memory caching available (36x8Gb = 288Gb
> memory or 18x8Gb = 144Gb memory vs maybe 6x16Gb = 96Gb)
> 4) Management of the storage is done in one place: GlusterFS.  No messy
> RAID controller setups to document/remember.
> 5) You can expand in the future in a much more granular and controlled
> fashion.  Add 2 machines (1 for replication) and you get 8Tb (using 2Tb
> drives) of storage.  When you want to replace a machine, just set up a
> new one, fail the old one, and let GlusterFS build the new one for you (AFR
> will do the heavy lifting).  CPUs will get faster, hard drives will get
> faster and bigger in the future, so make it easy to upgrade.  A small
> number of BIG machines makes it a lot harder to do upgrades as new
> hardware becomes available.
> 6) Machine failures (motherboard, power supply, etc.) will affect much
> less of your storage network.  Having a spare 1U machine around as a hot
> spare doesn't cost much (maybe $1200).  Having a spare 5U monster around
> does (probably close to $6000).
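Multiplying out the per-server figures from points 1 and 3 gives the aggregate totals for each layout. This is only a sketch: the single Gigabit Ethernet port per server is an assumption, and real throughput depends on the switch fabric and workload.

```python
# (number of servers, GB of RAM per server), as quoted in the list above.
LAYOUTS = {
    "36 x 1U": (36, 8),
    "18 x 2U": (18, 8),
    "6 x 5U": (6, 16),
}

# ASSUMPTION: one Gigabit Ethernet port in use per server.
GBIT_PER_SERVER = 1


def aggregate(servers, ram_gb):
    """Return (total RAM in GB, total network bandwidth in Gbit/s)."""
    return servers * ram_gb, servers * GBIT_PER_SERVER


totals = {name: aggregate(*spec) for name, spec in LAYOUTS.items()}

for name, (ram_gb, gbit) in totals.items():
    print(f"{name:8s}  RAM: {ram_gb:3d} GB   network: {gbit:2d} Gbit/s")
```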
> IMHO 36 x 1U or 18 x 2U servers shouldn't cost any more (and maybe less)
> than the big boxes you are looking to buy.  They are commodity items.
> If you go the 1U route you don't need anything but a machine, with
> memory and 4 hard drives (all server motherboards come with at least 4
> SATA ports).  By using 2Tb drives, I think you would find that the cost
> would actually be less.  By NOT using hardware RAID you can also NOT use
> RAID-class hard drives which cost about $100 each more than non-RAID
> hard drives.  Just that change alone could save you 6 x 24 = 144 drives
> x $100 = $14,400!  JBOD just doesn't need RAID-class hard drives because you
> don't need the sophisticated firmware that the RAID-class hard drives
> provide.  You still will want quality hard drives, but failures will
> have such a low impact that it is much less of a problem.
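The drive counts and the savings figure above can be checked with a few lines. The $100 premium and the layouts come from the thread; the variable and function names are illustrative.

```python
# Per-drive price premium for RAID-class drives, as quoted in the thread.
RAID_CLASS_PREMIUM_USD = 100

# (servers, drives per server, TB per drive) for each layout discussed.
DRIVE_LAYOUTS = {
    "6 x 5U": (6, 24, 1.5),
    "18 x 2U": (18, 8, 2.0),
    "36 x 1U": (36, 4, 2.0),
}


def summarize(servers, drives_each, drive_tb):
    """Return (drive count, raw TB, premium avoided by non-RAID drives)."""
    drives = servers * drives_each
    return drives, drives * drive_tb, drives * RAID_CLASS_PREMIUM_USD


summary = {name: summarize(*spec) for name, spec in DRIVE_LAYOUTS.items()}

for name, (drives, raw_tb, premium) in summary.items():
    print(f"{name:8s} {drives} drives, {raw_tb} TB raw, "
          f"${premium:,} premium avoided")
```

All three layouts come to 144 drives, so the number of drives to buy stays the same while the 2Tb options provide more raw capacity.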
> By using more, smaller machines you also eliminate the need for redundant
> power supplies (which would be a requirement in your large boxes because
> it would be a single point of failure on a large percentage of your
> storage system).
> Hope the information helps.
> Regards,
> Larry Bates
> ------------------------------
>> Message: 6
>> Date: Thu, 17 Dec 2009 00:18:54 -0600
>> From: phil cryer <phil at cryer.us>
>> Subject: [Gluster-users] Recommended GlusterFS configuration for 6 node cluster
>> To: "gluster-users at gluster.org" <gluster-users at gluster.org>
>> Message-ID:
>>       <3a3bc55a0912162218i4e3f326cr9956dd37132bfc19 at mail.gmail.com>
>> Content-Type: text/plain; charset=UTF-8
>> We're setting up 6 servers, each with 24 x 1.5TB drives, the systems
>> will run Debian testing and Gluster 3.x.  The SATA RAID card offers
>> RAID5 and RAID6, we're wondering what the optimum setup would be for
>> this configuration.  Do we RAID5 the disks, and have GlusterFS use
>> them that way, or do we keep them all 'raw' and have GlusterFS handle
>> the replication (though not 2x as we would have with the RAID
>> options)?  Obviously a lot of ways to do this, just wondering what
>> GlusterFS devs and other experienced users would recommend.
>> Thanks
>> P
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

