[Gluster-users] Need some clarifications about the disperse feature

Ayelet Shemesh shemesh.ayelet at gmail.com
Wed Nov 26 08:35:03 UTC 2014


Thank you Xavi, it's very helpful (also to Atin).

Do you have any benchmarks of the performance penalty I should expect for
read-intensive workloads with this feature? Naturally I will test in my
specific environment; I just want to know if there are any benchmarks I can
look at for now.

Ayelet




On Tue, Nov 25, 2014 at 5:19 PM, Xavier Hernandez <xhernandez at datalab.es>
wrote:

> Hi Ayelet,
>
> On 11/25/2014 02:41 PM, Ayelet Shemesh wrote:
>
>> Hello Gluster experts,
>>
>> I have been using gluster for a small cluster for a few years now, and I
>> have a question regarding the new disperse feature, which is a much
>> anticipated addition for me.
>>
>> *Suppose* I create a volume with a disperse set of 3, redundancy 1
>> (let's call them A1, A2, A3) and then I add 3 more bricks to that volume
>> (we'll call them B1, B2, B3).
>>
>> *First question* - which of the bricks will be the one carrying the
>> redundancy data?
>>
>
> In the current implementation, there's no difference between data and
> redundancy. All bricks behave exactly the same and no brick is more
> important than another. In a configuration with 3 bricks and redundancy 1,
> you can lose any brick and everything will continue working normally.
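>
> For illustration only (server and brick names here are hypothetical), such
> a volume could be created with something like:
>
>     gluster volume create test-volume disperse 3 redundancy 1 \
>         server1:/bricks/a1 server2:/bricks/a2 server3:/bricks/a3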
>
>
>> *Second question* - If I have machines with faster disks - should I
>> assign them to the data or the redundancy bricks? What should I expect
>> the load to be on the redundancy machine in heavy read scenarios and in
>> heavy write scenarios?
>>
>
> As I said, there isn't a dedicated redundancy brick, so there's no benefit
> in assigning the faster disks to any specific brick.
>
> Read requests only need to be processed by N - R bricks (N = total number
> of bricks, R = redundancy). This means that in your configuration, each
> read will be sent to 2 bricks. If all bricks are alive and healthy, the
> disperse translator balances these reads among all bricks, so each brick
> receives 2/3 of the read load.
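>
> In general, with all bricks healthy, each brick receives (N - R) / N of
> the read load: 2/3 here, and it would also be 4/6 = 2/3 for a 6-brick set
> with redundancy 2.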
>
> Write requests are processed by all bricks, so the load is the same on all
> of them.
>
>
>> *Third question* - _does this require reading the entire data_ of A1, A2
>> and A3 by initiating a heal or another operation?
>>
>>
> Healing operations work on a per-file basis. If only some files on A3 have
> been damaged, healing will only read the data for those files from A1 and
> A2, not the entire contents of A1 and A2. To heal a file, however, that
> file's entire contents are read.
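>
> If your GlusterFS version supports the self-heal commands for dispersed
> volumes, you can inspect and trigger this per-file healing with (volume
> name hypothetical):
>
>     gluster volume heal test-volume info
>     gluster volume heal test-volume
>
> The first command lists the files pending heal; the second starts healing
> them.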
>
>> *4th question* (and most important for me) - I saw on the list that it
>> is now a Distributed-Dispersed volume. I understand I can now lose, for
>> example, bricks A1 and B1 and still have my entire data intact.
>>
>
> Correct.
>
>> Is this also correct for bricks from the same set, for example A1 and A2?
>>
>
> No, each disperse set is independent and has the same redundancy. It's
> equivalent to a distributed-replicated volume: if you lose both bricks of
> the same replica set, you lose access to the data stored in that replica
> set.
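>
> Schematically, using the names from your example:
>
>     disperse set 1: A1 A2 A3   (survives the loss of any 1 of these)
>     disperse set 2: B1 B2 B3   (survives the loss of any 1 of these)
>
> Losing A1 and B1 is fine (one brick per set), but losing A1 and A2 makes
> the data stored on set 1 inaccessible until one of them comes back.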
>
>> Or to put it in a more generic way - _does this create the exact same
>> dispersed volume as if I had created it originally with A1, A2, A3, B1,
>> B2, B3 and a redundancy of 2?_
>>
>
> No. These are two different configurations. Both have the same effective
> capacity, but the probability of failure in the second case is several
> times lower than in the first one (you can lose *any* two bricks without
> losing access to the data). However, it's more expensive to grow the
> volume, because you will need to add 6 new bricks at the same time, while
> in the first case you only need to add 3.
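>
> For example, growing the distributed-dispersed configuration means adding
> one new disperse set of 3 bricks at a time, something like this
> (hypothetical server and brick names again):
>
>     gluster volume add-brick test-volume \
>         server4:/bricks/c1 server5:/bricks/c2 server6:/bricks/c3
>
> whereas the single 6-brick set with redundancy 2 can only grow by 6 bricks
> at once.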
>
> Xavi
>

