[Gluster-users] to RAID or not?

Russell Purinton russell.purinton at gmail.com
Mon Jul 4 18:44:33 UTC 2016


Agreed… It took me almost 2 years of tweaking and testing to get the performance I wanted.   

Different workloads require different configurations. Test several and find what works best for you!

> On Jul 4, 2016, at 2:15 PM, tom at encoding.com wrote:
> 
> I would highly stress: whatever solution you choose, make sure you test actual workload performance before going all-in.
> 
> In my testing, performance (especially IOPS and latency) decreased as I added bricks and additional nodes.  Since you have many spindles now, I would encourage you to test your workload up to and including the total brick count you ultimately expect.  RAID level, and whether it’s md, ZFS, or hardware, isn’t likely to make as significant a performance impact as Gluster and its various clients will.  Test failure scenarios and performance characteristics during impairment events thoroughly.  Make sure heals happen as you expect, including the final contents of files modified during an impairment.  If you have many small files or directories that will be accessed concurrently, make sure to stress that behavior in your testing.
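> 
> As a concrete starting point for those checks (the volume name gv0 and mount point /mnt/gv0 are placeholders, and fio is assumed to be installed), something like this can verify heal state after an impairment test and stress concurrent small-file access:
> 
>     # files still pending heal, per brick
>     gluster volume heal gv0 info
> 
>     # entries that ended up in split-brain after the impairment
>     gluster volume heal gv0 info split-brain
> 
>     # rough concurrent small-file workload against the mounted volume
>     fio --name=smallfile --directory=/mnt/gv0 --rw=randrw --bs=4k \
>         --size=64m --nrfiles=1024 --numjobs=4 --ioengine=libaio \
>         --iodepth=8 --group_reporting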
> 
> Gluster can be great for targeting availability and distribution at low software cost, and I would say as of today at the expense of performance, but as with any scale-out NAS there are limitations and some surprises along the path.
> 
> Good hunting,
> -t
> 
>> On Jul 4, 2016, at 10:44 AM, Gandalf Corvotempesta <gandalf.corvotempesta at gmail.com> wrote:
>> 
>> 2016-07-04 19:35 GMT+02:00 Russell Purinton <russell.purinton at gmail.com>:
>>> For 3 servers with 12 disks each, I would do hardware RAID0 (or mdadm if you don’t have a RAID card) of 3 disks.  So four 3-disk RAID0s per server.
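>>> 
>>> As a rough sketch of building one such brick (device names /dev/sdb through /dev/sdd and the mount point are placeholders; the XFS inode-size option is the one commonly suggested for Gluster bricks):
>>> 
>>>     # assemble a 3-disk software RAID0 with mdadm
>>>     mdadm --create /dev/md0 --level=0 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd
>>> 
>>>     # format and mount it as a brick
>>>     mkfs.xfs -i size=512 /dev/md0
>>>     mkdir -p /bricks/brickA
>>>     mount /dev/md0 /bricks/brickA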
>> 
>> 3 servers is just the start. We plan to grow to 5 servers in the short
>> term and up to 15 in production.
>> 
>>> I would set them up as Replica 3 Arbiter 1 (see the create-command sketch after this layout):
>>> 
>>> server1:/brickA server2:/brickC server3:/brickA
>>> server1:/brickB server2:/brickD server3:/brickB
>>> server2:/brickA server3:/brickC server1:/brickA
>>> server2:/brickB server3:/brickD server1:/brickB
>>> server3:/brickA server1:/brickC server2:/brickA
>>> server3:/brickB server1:/brickD server2:/brickB
>>> 
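>>> A minimal sketch of the matching create command (the volume name gv0 is a placeholder; note that Gluster won’t accept the same brick path twice, so the third, arbiter, entry of each set is given its own /arb path here rather than repeating a data brick path):
>>> 
>>>     gluster volume create gv0 replica 3 arbiter 1 \
>>>         server1:/bricks/A server2:/bricks/C server3:/arb/A \
>>>         server1:/bricks/B server2:/bricks/D server3:/arb/B \
>>>         server2:/bricks/A server3:/bricks/C server1:/arb/A \
>>>         server2:/bricks/B server3:/bricks/D server1:/arb/B \
>>>         server3:/bricks/A server1:/bricks/C server2:/arb/A \
>>>         server3:/bricks/B server1:/bricks/D server2:/arb/B
>>>     gluster volume start gv0
>>> 
>>> With replica 3 arbiter 1, every third brick in the list becomes the arbiter, which stores only metadata, so usable capacity matches plain replica 2.
>>> 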
>>> The benefit of this is that you can lose an entire server node (12 disks) and all of your data is still accessible.   And you get the same space as if they were all in a RAID10.
>>> 
>>> If you lose any disk, the entire 3-disk brick will need to be healed from the replica.  I have 20GbE on each server so it doesn’t take long.  It copied 20TB in about 18 hours once.
>> 
>> So, any disk failure would mean at least 6TB to be recovered over the
>> network. That means high network utilization, and as long as Gluster
>> doesn't have a dedicated network for replication, this can slow down
>> client access.
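>> 
>> As a back-of-envelope check against the numbers above: 20TB in 18 hours
>> works out to roughly 20e12 B / 64800 s ≈ 310 MB/s, so healing a 6TB
>> brick at the same rate would mean around 5.4 hours of sustained
>> replication traffic competing with client I/O.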
> 


