[Gluster-users] Need help to design a data storage

Ashish Pandey aspandey at redhat.com
Tue Aug 9 17:20:26 UTC 2016


Yes Gandalf, I think you are missing a point about the way we configure EC.
To explain it, I will use a smaller number of disks. Let's say you have 6 disks of 1TB each on 6 different nodes.

1- Replica 2 using gluster
There will be 3 replica sub volumes (afr-1, afr-2, afr-3), each with a pair of disks.
A file named file.txt will be saved on both disks of exactly one sub volume. That means you are cutting the storage space in half: 3TB
Also, at any point in time you can afford to lose only 1 brick of any given sub volume.

2- Replica 3 using gluster
There will be 2 replica sub volumes (afr-1, afr-2), each with 3 disks.
A file named file.txt will be saved on all 3 disks of one sub volume, let's say afr-1. That means you are cutting the storage space to 1/3rd: 2TB
Also, at any point in time you can afford to lose at most 2 bricks of afr-1.

3 - EC with redundancy 2, that is 4+2
The overall storage space you get is 4TB, and any 2 bricks can be down at any point in time. So it is as good as replica 3 but provides more space.
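
The arithmetic in the three cases above can be checked with a small sketch. This is only an illustration, not anything gluster itself provides: the function name `layout` and its parameters are made up for this example, and replica N is modelled as a group of N bricks of which N-1 are redundant copies.

```python
def layout(total_bricks, brick_tb, group_size, redundancy):
    """Return (sub_volumes, usable_tb, bricks_that_may_fail_per_sub_volume).

    Hypothetical helper for illustration only.
    group_size: bricks per sub volume (N for replica N, data+redundancy for EC).
    redundancy: bricks per sub volume that hold no extra usable capacity.
    """
    if total_bricks % group_size != 0:
        raise ValueError("brick count must be a multiple of the group size")
    sub_volumes = total_bricks // group_size
    usable_tb = sub_volumes * (group_size - redundancy) * brick_tb
    return sub_volumes, usable_tb, redundancy


# 6 bricks of 1TB each, as in the three cases above.
replica2 = layout(6, 1, group_size=2, redundancy=1)
replica3 = layout(6, 1, group_size=3, redundancy=2)
ec_4_2 = layout(6, 1, group_size=6, redundancy=2)

for name, result in [("replica 2", replica2), ("replica 3", replica3), ("EC 4+2", ec_4_2)]:
    sub_volumes, usable_tb, can_fail = result
    print(f"{name}: {sub_volumes} sub volume(s), {usable_tb}TB usable, "
          f"{can_fail} brick(s) may fail per sub volume")
```

Replica 3 and EC 4+2 both tolerate 2 failed bricks per sub volume, but EC 4+2 leaves 4TB usable instead of 2TB.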

Now, about your example of 108 disks: you should not have a 106+2 EC configuration as you were describing. That is a really poor setup, and you are right about its redundancy.
But when you say that you will create replica 3, that means you will have 108/3 = 36 sub volumes, each with 1TB of storage space. So you will get 36TB of storage if each brick is 1TB.

So I would say that you should create EC with a 4+2 configuration. There will be 108/6 = 18 sub volumes, each with 4TB of capacity. The total space you get is 18 x 4 = 72TB.

In both of the above cases, even if you kill 2 bricks from the same sub volume, data will still be served. If you kill a 3rd brick from the same sub volume, you will lose data in the replica as well as in the EC sub volume.
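
The point that failures only matter per sub volume can be sketched as follows. The helper `data_survives` is hypothetical, and it assumes bricks are numbered from 0 and grouped into sub volumes consecutively (bricks 0-5 form the first 4+2 sub volume, 6-11 the second, and so on); a real deployment would need to map its own brick order onto this.

```python
from collections import Counter


def data_survives(failed_bricks, group_size, redundancy):
    """True if no sub volume has lost more than `redundancy` bricks.

    Illustrative only: assumes brick i belongs to sub volume i // group_size.
    """
    failures_per_sub_volume = Counter(b // group_size for b in failed_bricks)
    return all(n <= redundancy for n in failures_per_sub_volume.values())


# EC 4+2 on 108 bricks: 18 sub volumes of 6 bricks each.
print(data_survives({0, 1}, group_size=6, redundancy=2))        # 2 bricks, same sub volume
print(data_survives({0, 1, 6, 7}, group_size=6, redundancy=2))  # 4 bricks, two sub volumes
print(data_survives({0, 1, 2}, group_size=6, redundancy=2))     # 3 bricks, same sub volume

# Replica 3 on 108 bricks: 36 sub volumes of 3 bricks each.
print(data_survives({0, 1}, group_size=3, redundancy=2))
print(data_survives({0, 1, 2}, group_size=3, redundancy=2))
```

In this model, losing 3 bricks out of 108 only loses data when all 3 fall in the same sub volume, and that holds for replica 3 exactly as it does for EC 4+2.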

---- 
Ashish 





----- Original Message -----

From: "Gandalf Corvotempesta" <gandalf.corvotempesta at gmail.com> 
To: "Ashish Pandey" <aspandey at redhat.com> 
Cc: gluster-users at gluster.org 
Sent: Tuesday, August 9, 2016 8:33:31 PM 
Subject: Re: [Gluster-users] Need help to design a data storage 



On 09 Aug 2016 10:06 AM, "Ashish Pandey" < aspandey at redhat.com > wrote:
> If your main concern is data redundancy, I would suggest you go for the erasure coded volume provided by gluster.

Anyway, EC volumes have a lower redundancy level than standard replicated volumes.

Let's assume a 9-node cluster with 12 disks on each node, redundancy set to 2

You have 9*12 = 108 disks/bricks
With redundancy 2 you can lose up to 2 bricks/disks at the same time before losing data. Using cheap SATA disks (gluster is made to run on commodity hardware), losing 3 disks out of 108 in a very short time could happen frequently, and this frequency grows as the cluster grows.

With a standard replicated volume, with replica 3, you can lose up to 3 servers (not bricks) because each brick in a replica set must be on a different server.

I think EC is something like raid6 (with more "parity") and standard replication is like raid10 but with 3 disks for each mirror. 

Raid10 is safer as you can lose as many disks as you want, if they are in different replica sets, while raid 6 can lose up to 2 disks in the whole cluster.
The higher the number of disks, the higher the probability of data loss with raid6/EC.

Am I missing something?
_______________________________________________ 
Gluster-users mailing list 
Gluster-users at gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-users 
