[Gluster-devel] BitRot notes

Fri Nov 7 15:20:35 UTC 2014

Thanks Xavi.

----- Original Message -----
From: "Xavier Hernandez" <xhernandez at datalab.es>
To: "Alok Srivastava" <asrivast at redhat.com>, "Venky Shankar" <vshankar at redhat.com>
Cc: gluster-devel at gluster.org, "Ric Wheeler" <rwheeler at redhat.com>
Sent: Thursday, November 6, 2014 9:15:52 PM
Subject: Re: [Gluster-devel] BitRot notes

Hi Alok,

On 11/06/2014 02:53 PM, Alok Srivastava wrote:
> Thanks Venky for sharing the details.
> I have included Pranith and Atin for a specific question:
>
> With the current implementation of erasure coding, do we have the capability of detecting and correcting bit rot?

The current implementation of the erasure coding xlator does not detect
bit rot. It can repair data only if it already knows that the data is
damaged. I'm considering a change in the implementation that will allow
bit rot detection and optimize self-healing by not rewriting the entire
file when a local error is detected.
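
As a rough illustration of why local detection helps self-healing, here is a minimal sketch that keeps one checksum per fixed-size block, so that only the blocks whose checksum no longer matches need to be rewritten. Everything in it is an assumption made for illustration (BLOCK_SIZE, block_checksums, damaged_blocks, the use of SHA-256); it is not the ec xlator's design or code.

```python
import hashlib
from typing import List

# Hypothetical block size, kept tiny so the demo is easy to follow.
BLOCK_SIZE = 4


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Return one SHA-256 hex digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def damaged_blocks(data: bytes, expected: List[str],
                   block_size: int = BLOCK_SIZE) -> List[int]:
    """Return the indices of blocks whose checksum differs from the stored one."""
    current = block_checksums(data, block_size)
    return [i for i, (now, then) in enumerate(zip(current, expected))
            if now != then]


original = b"abcdefghijkl"
stored = block_checksums(original)

# Simulate bit rot: flip one bit in byte 5, which lives in block 1.
corrupted = bytearray(original)
corrupted[5] ^= 0x01

print(damaged_blocks(bytes(corrupted), stored))  # [1]
```

With such a map, a heal would only need to regenerate block 1 instead of the whole file.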

>
> Let's say we have fragments of a file on multiple bricks and a read request is sent to the bricks. One brick doesn't respond, as it may have disk(s) damaged by bit rot or some other failure. Will EC detect and correct bit rot in this case?

This particular case is currently handled by the ec xlator: because it
won't receive data from that brick, it will treat the brick's fragment as
damaged and will use fragments from the other bricks to recover the
original data. The user will receive the correct data and, if the damaged
brick is still accessible, the ec xlator will try to regenerate the
fragment on that brick.
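
To make the recovery path concrete, here is a minimal sketch using a single XOR parity fragment, which tolerates the loss of one fragment. This is a deliberately simplified stand-in and not the coding the ec xlator actually uses; the names encode, read_with_recovery and xor_bytes are hypothetical, and a brick that does not answer is modelled as None.

```python
from functools import reduce
from typing import List, Optional, Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal fragments (zero padded) plus one XOR parity."""
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\0")
    frags = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    return frags + [reduce(xor_bytes, frags)]


def read_with_recovery(fragments: List[Optional[bytes]], k: int,
                       length: int) -> Tuple[bytes, Optional[int]]:
    """Return (data, index of the rebuilt fragment or None).

    A None entry stands for a brick that did not answer the read.
    """
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise IOError("single parity cannot recover more than one fragment")
    healed = None
    frags = list(fragments)
    if missing:
        healed = missing[0]
        survivors = [f for f in frags if f is not None]
        # XOR of all surviving fragments reproduces the missing one.
        frags[healed] = reduce(xor_bytes, survivors)
    return b"".join(frags[:k])[:length], healed


data = b"hello gluster"
bricks = encode(data, k=3)
bricks[1] = None  # brick 1 does not respond

recovered, healed = read_with_recovery(bricks, k=3, length=len(data))
print(recovered, healed)  # b'hello gluster' 1
```

The caller gets the correct data back, and the returned index says which fragment could be written back to the brick if it is still reachable.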

Xavi