[Gluster-devel] bad file access (bit-rot + AFR)

Raghavendra Bhat rabhat at redhat.com
Sat Jun 27 09:02:50 UTC 2015


Hi,

A patch has been submitted for review to deny access to objects that 
are marked as bad by the scrubber (i.e. the data of the object might 
have been corrupted in the backend).

http://review.gluster.org/#/c/11126/10
http://review.gluster.org/#/c/11389/4

The above 2 patch sets solve the problem of denying access to bad 
objects (they have passed regression and received a +1 from Venky). But 
in our testing we found a race window (which can be larger depending on 
the scrubber frequency) in which the self-heal daemon can heal the 
contents of the bad file before the scrubber can mark it as bad.

I am not sure whether there is a chance of hitting this issue when the 
data truly gets corrupted in the backend. But in our testing, to 
simulate backend corruption, we modify the contents of the file directly 
in the backend. In this case, before the scrubber can mark the object as 
bad, the self-heal daemon kicks in and heals the file using the bad copy 
as the source, overwriting the good copy. Alternatively, if a client 
accesses the file before the scrubber marks it as bad, AFR finds that 
there is a mismatch in metadata (since we modified the contents of the 
file in the backend) and does data and metadata self-healing, thus 
copying the contents of the bad copy to the good copy. From then on, 
clients accessing that object always get bad data.
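
To make the ordering problem concrete, here is a toy Python model of the 
race. It is not GlusterFS code: the Replica class, the scrub() and 
heal() helpers, and especially the "newest mtime wins" source selection 
are assumptions made purely for illustration (AFR's real source 
selection is more involved). It only shows that the outcome depends on 
whether the scrubber or the heal runs first.

```python
import hashlib
from dataclasses import dataclass


def checksum(data: bytes) -> str:
    """Toy stand-in for the signature stored on an object."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class Replica:
    data: bytes
    signature: str     # checksum recorded when the object was last signed
    mtime: int         # toy modification counter
    bad: bool = False  # set by the (toy) scrubber


def scrub(replica: Replica) -> None:
    """Mark the replica bad if its data no longer matches its signature."""
    if checksum(replica.data) != replica.signature:
        replica.bad = True


def heal(replicas: list) -> None:
    """Copy the chosen source over every other replica.

    ASSUMPTION: replicas marked bad are never used as a source, and among
    the rest the newest mtime wins. This is a hypothetical rule, not AFR's.
    """
    candidates = [r for r in replicas if not r.bad]
    source = max(candidates, key=lambda r: r.mtime)
    for r in replicas:
        if r is not source:
            r.data = source.data
            r.mtime = source.mtime
            r.signature = checksum(r.data)  # toy: healed copy is re-signed
            r.bad = False


def simulate(scrub_first: bool):
    """Corrupt one replica in the 'backend', then scrub and heal in order."""
    good = b"good data"
    a = Replica(good, checksum(good), mtime=1)
    b = Replica(good, checksum(good), mtime=1)

    # Simulate backend corruption: data changes, signature does not.
    b.data = b"corrupted"
    b.mtime = 2

    if scrub_first:
        scrub(a)
        scrub(b)
    heal([a, b])
    return a.data, b.data
```

In this model, simulate(scrub_first=True) leaves both replicas with the 
good contents, while simulate(scrub_first=False) leaves both with the 
corrupted contents, which is the situation described above.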

Pranith, do you have any solution for this? Venky and I are trying to 
come up with one.

But does this issue block the above patches in any way? (Those 2 patches 
are still needed to deny access to objects once they are marked as bad 
by the scrubber.)


Regards,
Raghavendra Bhat

