[Gluster-devel] GlusterFS Spare Bricks?

Anand Babu Periasamy ab at gluster.com
Wed Apr 11 06:13:11 UTC 2012


On Tue, Apr 10, 2012 at 1:39 AM, 7220022 <7220022 at gmail.com> wrote:
>
> Are there plans to add provisioning of spare bricks in a replicated (or distributed-replicated) configuration? E.g., when a brick in a mirror set dies, the system rebuilds it automatically on a spare, similar to how it's done by RAID controllers.
>
>
>
> Not only would it improve practical reliability, especially of large clusters, but it would also make it possible to build better-performing clusters from less expensive components. For example, instead of having slow RAID5 bricks on expensive RAID controllers, one uses cheap HBAs and stripes a few disks per brick in RAID0 - that's faster for writes than RAID 5/6 by an order of magnitude (and, by the way, should improve the rebuild times in Gluster that many are complaining about). A failure of one such striped brick is not catastrophic in a mirrored Gluster - but it's better to have spare bricks standing by, strewn across cluster heads.
>
>
>
> A more advanced setup at the hardware level involves creating "hybrid disks," whereby HDD vdisks are cached by enterprise-class SSDs. It works beautifully and makes HDDs amazingly fast for random transactions. The technology has become widely available on many $500 COTS controllers. However, it is not widely known that the results with HDDs in RAID0 under SSD cache are 10 to 20 (!!) times better than with RAID 5 or 6.
>
>
>
> There is no way to use RAID0 in commercial storage, the main reason being the absence of hot spares. If, on the other hand, the spares are handled by Gluster in the form of pre-fabricated (cached hardware-RAID0) bricks, both very good performance and reasonably sufficient redundancy should be easily achieved.

Why not use the "gluster volume replace-brick ..." command? You can use
external monitoring/management tools (e.g. freeipmi) to detect node
failures and trigger replace-brick through a script. GlusterFS has the
mechanism for hot spares, but the policy should be external. A node may
come back online in 5 minutes; GlusterFS should not make such decisions
automatically. I am wondering whether it makes sense to add hot spares
as a standard feature, since GlusterFS already detects failures.
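As a rough illustration of that external-policy idea, here is a minimal dry-run sketch of a script a monitoring tool could invoke on node failure. Only the "gluster volume replace-brick" command comes from the discussion above; the volume name, brick paths, and the exact argument form ("commit force", available in newer GlusterFS releases) are assumptions and should be checked against your installed version:

```shell
#!/bin/sh
# Hypothetical hot-spare failover hook, meant to be triggered by an
# external monitor (e.g. an IPMI-based check) once a node is declared dead.
# All names below are placeholders, not part of the original post.
VOLUME=${VOLUME:-myvol}
FAILED_BRICK=${FAILED_BRICK:-server1:/bricks/b1}
SPARE_BRICK=${SPARE_BRICK:-server3:/bricks/spare1}
DRY_RUN=${DRY_RUN:-1}   # default to printing the command instead of running it

replace_failed_brick() {
    # Assumed one-shot syntax; older releases used a start/commit sequence.
    cmd="gluster volume replace-brick $VOLUME $FAILED_BRICK $SPARE_BRICK commit force"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$cmd"      # show what would be executed
    else
        $cmd
    fi
}

replace_failed_brick
```

The point is that the decision logic (how long to wait before declaring a node dead, which spare to pick) lives entirely in the external script, while GlusterFS only executes the replace.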
--
Anand Babu Periasamy
Blog [ http://www.unlocksmith.org ]
Twitter [ http://twitter.com/abperiasamy ]

Imagination is more important than knowledge --Albert Einstein



