[Gluster-users] Advice for setup: SW RAID 6 vs JBOD

Eduardo Mayoral emayoral at arsys.es
Thu Jun 6 17:20:51 UTC 2019


Yes to the 10 GbE NICs (they are already in the servers).

Nice idea with the SSDs, but I do not have a HW RAID card in these
servers, nor the possibility of getting / installing one.

What I do have is an extra SSD per server, which I plan to use as an LVM
cache for the bricks (maybe just 1 disk, maybe 2 in SW RAID 1). I still
need to test how LVM / gluster will handle the failure of the cache disk.
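
In case it helps to picture it, this is roughly what I have in mind for the
cache layer (untested sketch; the device, VG and LV names below are made up):

    # Optionally mirror the two SSDs first, so the cache itself survives a
    # single SSD failure
    mdadm --create /dev/md/ssdcache --level=1 --raid-devices=2 /dev/sdm /dev/sdn

    # Add the (mirrored) SSD to the volume group that holds the brick LV
    vgextend vg_bricks /dev/md/ssdcache

    # Create a cache pool on the SSD and attach it to the brick LV
    lvcreate --type cache-pool -l 100%FREE -n brick_cache vg_bricks /dev/md/ssdcache
    lvconvert --type cache --cachepool vg_bricks/brick_cache \
              --cachemode writethrough vg_bricks/brick1

    # Detach the cache again, e.g. before replacing a failing SSD
    lvconvert --uncache vg_bricks/brick1

My understanding is that with writethrough the origin LV always holds a
consistent copy of the data, so losing the cache SSD should only cost
performance, not the brick - but that is exactly what I still want to verify.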

Thanks!

On 6/6/19 19:07, Vincent Royer wrote:
> What if you have two fast 2 TB SSDs per server in hardware RAID 1, and 3
> hosts in replica 3 with dual 10 GbE enterprise NICs?  This would end up
> being a single 2 TB volume, correct?  Seems like that would offer great
> speed and pretty decent survivability.
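
(For what it is worth: yes, with replica 3 the usable capacity is that of a
single brick, so ~2 TB in that example. Something along these lines - host
names and brick paths are just placeholders:

    gluster volume create gv0 replica 3 \
        host1:/bricks/ssd/gv0 host2:/bricks/ssd/gv0 host3:/bricks/ssd/gv0
    gluster volume start gv0

would give one 2 TB volume that stays writable with one host down, on top of
the RAID 1 protecting each brick.)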
>
> On Wed, Jun 5, 2019 at 11:54 PM Hu Bert <revirii at googlemail.com> wrote:
>
>     Good morning,
>
>     my comment won't help you directly, but I thought I'd send it
>     anyway...
>
>     Our first glusterfs setup had 3 servers with 4 disks (= bricks; 10 TB,
>     JBOD) each. It was running fine in the beginning, but then 1 disk
>     failed. The following heal took ~1 month, with bad performance (quite
>     high IO). Shortly after the heal had finished, another disk failed ->
>     same problems again. Not funny.
>
>     For our new system we decided to use 3 servers with 10 disks (10 TB)
>     each, but now with the 10 disks in SW RAID 10 (well, we split the 10
>     disks into 2 SW RAID 10 arrays; each of them is a brick, so we have 2
>     gluster volumes). A lot of disk space is "wasted" with this type of SW
>     RAID and a replica 3 setup, but we wanted to avoid the "healing takes
>     a long time with bad performance" problem. Now mdadm takes care of
>     replicating the data, and glusterfs should always see "good" bricks.
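
(If I read that right, the layout is roughly the following - device and
volume names invented for illustration:

    # on each of the 3 servers: two SW RAID 10 arrays of 5 disks each
    mdadm --create /dev/md0 --level=10 --raid-devices=5 /dev/sd[b-f]
    mdadm --create /dev/md1 --level=10 --raid-devices=5 /dev/sd[g-k]
    mkfs.xfs /dev/md0 && mount /dev/md0 /bricks/md0   # likewise for md1

    # one replica 3 volume per array, one brick per server
    gluster volume create vol0 replica 3 srv{1..3}:/bricks/md0/brick
    gluster volume create vol1 replica 3 srv{1..3}:/bricks/md1/brick

so a failed disk is rebuilt locally by mdadm and gluster never has to heal a
whole brick.)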
>
>     And the decision may depend on what kind of data you have. Many small
>     files, like tens of millions? Or not that many, but bigger files? I
>     once watched a video (I think it was this one:
>     https://www.youtube.com/watch?v=61HDVwttNYI). The recommendation there:
>     RAID 6 or 10 for small files; for big files... well, it is already 2
>     years "old" ;-)
>
>     As I said, this won't help you directly. You have to identify what's
>     most important for your scenario; as you said, high performance is not
>     an issue - if that still holds even when you have slight performance
>     issues after a disk fails, then OK. My experience so far: the bigger
>     and slower the disks are and the more data you have -> healing will
>     hurt -> try to avoid this. If the disks are small and fast (SSDs),
>     healing will be faster -> JBOD is an option.
>
>
>     hth,
>     Hubert
>
>     On Wed, Jun 5, 2019 at 11:33, Eduardo Mayoral
>     <emayoral at arsys.es> wrote:
>     >
>     > Hi,
>     >
>     >     I am looking into a new gluster deployment to replace an
>     > ancient one.
>     >
>     >     For this deployment I will be using some repurposed servers I
>     > already have in stock. The disk specs are 12 * 3 TB SATA disks. No
>     > HW RAID controller. They also have some SSDs which would be nice to
>     > leverage as cache or similar to improve performance, since they are
>     > already there. Advice on how to leverage the SSDs would be greatly
>     > appreciated.
>     >
>     >     One of the design choices I have to make is using 3 nodes for a
>     > replica-3 with JBOD, or using 2 nodes with a replica-2 and SW RAID 6
>     > for the disks, maybe adding a 3rd node with a smaller amount of disk
>     > as a metadata node for the replica set. I would love to hear advice
>     > on the pros and cons of each setup from the gluster experts.
>     >
>     >     The data will be accessed from 4 to 6 systems with native
>     > gluster; not sure if that makes any difference.
>     >
>     >     The amount of data I have to store there is currently 20 TB,
>     > with moderate growth. IOPS are quite low, so high performance is not
>     > an issue. The data will fit in either of the two setups.
>     >
>     >     Thanks in advance for your advice!
>     >
>     > --
>     > Eduardo Mayoral Jimeno
>     > Systems engineer, platform department. Arsys Internet.
>     > emayoral at arsys.es - +34 941 620 105 - ext 2153
>     >
>
-- 
Eduardo Mayoral Jimeno
Systems engineer, platform department. Arsys Internet.
emayoral at arsys.es - +34 941 620 105 - ext 2153
