[Gluster-devel] Why vandermonde matrix is used in EC?

Xavier Hernandez xhernandez at datalab.es
Mon Nov 28 07:51:29 UTC 2016


On 11/27/2016 09:58 PM, 한우형 wrote:
> Hi,
>
> Thank you so much for the speedy reply, but I have some more questions.
>
> 1) I understand that non-systematic encoding/decoding doesn't alter
> performance when one or more bricks are down, but why does the
> systematic approach cause service degradation?
> I think that when a parity part is down there's no performance
> degradation, and when a non-parity part is down the data needs to be
> decoded. But that is the same as in the non-systematic case.

If a systematic implementation does increase performance in a
perceptible way, then the failure of one brick will reduce the
performance that users see. Even if the degraded performance is the same
as what we currently have, it will look worse from the users' perspective.

Note that there's no distinction between "data" bricks and "parity" 
bricks. Each file will use a different brick for its parity, so a 
failure of a brick will always cause trouble to some files. This would 
also allow a distribution of the read load among all available bricks.

Anyway, as I said in the other email, it's not so clear that a
systematic implementation would really improve performance
significantly.

>
> 2) In the systematic approach, what kind of metadata needs to be checked?
> Can't we just try to read the non-parity part?

If a brick is down, it's clear that we'll need to read from parity, but
when the brick comes up again it can contain stale data (the file may
have been modified while the brick was down), so we cannot simply read
from that brick. We need to verify in some way that the other bricks do
not contain updated data.
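
A minimal Python sketch of that kind of check, assuming each brick keeps
a per-file version counter that is incremented on every write. The
`BrickState` type and the function names are hypothetical, not the
actual ec translator interface.

```python
from dataclasses import dataclass

@dataclass
class BrickState:
    name: str
    version: int  # hypothetical per-file counter, bumped on every write

def up_to_date(bricks):
    """Return the bricks whose version matches the highest one seen."""
    latest = max(b.version for b in bricks)
    return [b for b in bricks if b.version == latest]

def can_read_directly(data_bricks_needed, bricks):
    """A direct (no-decode) read is safe only if every needed brick is current."""
    good = {b.name for b in up_to_date(bricks)}
    return all(name in good for name in data_bricks_needed)

if __name__ == "__main__":
    # b2 was down during the last write, so its counter lags behind.
    states = [BrickState("b0", 7), BrickState("b1", 7),
              BrickState("b2", 6), BrickState("b3", 7)]
    print(can_read_directly(["b0", "b1"], states))
    print(can_read_directly(["b0", "b2"], states))
```

Note that the highest version is only known after asking every brick, so
even a read that ends up touching only the data fragments still has to
contact all of them.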

Best regards,

Xavi

>
>
> Best regards,
> Han
>
> 2016-11-24 17:26 GMT+09:00 Xavier Hernandez <xhernandez at datalab.es
> <mailto:xhernandez at datalab.es>>:
>
>     Hi Han,
>
>     On 11/24/2016 04:25 AM, 한우형 wrote:
>
>         Hi,
>
>         I'm working on dispersed volumes (ec) and I found that the ec
>         encode/decode algorithm uses a non-systematic Vandermonde matrix.
>
>         My question is this: why is a non-systematic algorithm used?
>
>
>     Non-systematic encoding/decoding doesn't alter performance when one
>     or more bricks are down. This means that you won't have service
>     degradation when you are having trouble with one brick or you are
>     doing maintenance.
>
>     From the implementation perspective, a systematic approach would
>     need to talk to all bricks anyway to check for critical metadata
>     (gluster doesn't have a centralized metadata server). This means
>     that the theoretical benefit of a systematic decoding for reads
>     would be masked by the overhead needed for metadata operations
>     (involving additional network round-trips).
>
>     That said, it's true that a systematic approach would have some
>     benefits, like a little less CPU overhead. Not sure if the
>     performance would benefit significantly though.
>
>         If we use
>         a systematic algorithm (not systematic Vandermonde, which is not MDS)
>
>
>     A non-systematic Vandermonde matrix *IS* MDS. In fact, pure
>     Vandermonde matrices are non-systematic by definition. Some
>     alterations need to be done to make them systematic, and these
>     transformations can lead to a non MDS matrix if not made with care.
>
>         we can
>         boost read performance (no decode step is needed on reads).
>
>
>     Though it would probably have some benefits, I'm not so sure that
>     performance would improve significantly.
>
>     Current implementation of ec decoding can process 1GB/s of data per
>     CPU core on low end processors (Intel Atoms with SSE2) using block
>     sizes of 128KB and a 4+2 configuration. Currently this is much
>     faster than what a pure distributed volume on same hardware can read
>     for a single client/single thread.
>
>     So, for now, the non-systematic approach doesn't seem to be a
>     bottleneck for gluster. Anyway, there are plans to provide a
>     systematic version, but it's not a priority as of now.
>
>     Best regards,
>
>     Xavi
>
>
>         Best regards,
>         Han
>
>
>
>
>
>
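
The MDS property mentioned in the quoted reply (any k of the n fragments
are enough to recover the data) can be checked on a toy example. The
Python sketch below builds a non-systematic 4+2 Vandermonde code over
the prime field GF(257). The field is an assumption chosen to keep the
arithmetic short and is not necessarily the one used by the ec
translator; all function names are illustrative.

```python
from itertools import combinations

P = 257  # small prime; all arithmetic is modulo P (assumption for simplicity)

def vandermonde(n, k):
    """Row i is [x^0, x^1, ..., x^(k-1)] with the distinct value x = i + 1."""
    return [[pow(i + 1, j, P) for j in range(k)] for i in range(n)]

def encode(matrix, data):
    """Each fragment is one row of the matrix multiplied by the data vector."""
    return [sum(a * d for a, d in zip(row, data)) % P for row in matrix]

def solve(rows, values):
    """Recover the data from k fragments by Gauss-Jordan elimination mod P."""
    k = len(rows)
    aug = [list(r) + [v] for r, v in zip(rows, values)]
    for col in range(k):
        pivot = next(r for r in range(col, k) if aug[r][col] % P)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)  # modular inverse (Python 3.8+)
        aug[col] = [x * inv % P for x in aug[col]]
        for r in range(k):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(x - f * y) % P for x, y in zip(aug[r], aug[col])]
    return [row[k] for row in aug]

if __name__ == "__main__":
    n, k = 6, 4
    matrix = vandermonde(n, k)
    data = [10, 20, 30, 40]
    fragments = encode(matrix, data)
    for subset in combinations(range(n), k):
        rows = [matrix[i] for i in subset]
        vals = [fragments[i] for i in subset]
        assert solve(rows, vals) == data
    print("every choice of", k, "fragments out of", n, "recovers the data")
```

Every fragment here is a mix of all data values, so decoding costs the
same no matter which bricks answer, which is the behaviour described in
the quoted reply.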


