[Gluster-devel] healing of bad objects (marked by scrubber)
rabhat at redhat.com
Wed Jul 8 06:12:41 UTC 2015
Adding the correct gluster-devel id.
On 07/08/2015 11:38 AM, Raghavendra Bhat wrote:
> In the bit-rot feature, the scrubber marks corrupted objects (objects
> whose data has gone bad) as bad objects via an extended attribute. If the
> volume is a replicate volume and an object in one of the replicas goes
> bad, the client can still see the data via the good copy present in the
> other replica. But as of now, self-heal does not heal bad objects. So the
> current method to heal a bad object is to remove it directly from the
> backend and let self-heal take care of healing it from the good copy.
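> As a concrete illustration: from the brick backend, the bad mark is just
> an extended attribute on the file. Below is a minimal standalone C sketch
> of how one could test for it; I am assuming the key is
> "trusted.bit-rot.bad-file", so adjust if the stub uses a different name.
>
>     /* check_bad.c: check whether a brick backend file carries the
>      * bad-object xattr set by the scrubber.
>      * ASSUMPTION: the key is "trusted.bit-rot.bad-file". */
>     #include <stdio.h>
>     #include <errno.h>
>     #include <sys/xattr.h>
>
>     int main(int argc, char *argv[])
>     {
>             char    buf[16];
>             ssize_t ret;
>
>             if (argc != 2) {
>                     fprintf(stderr, "usage: %s <backend-path>\n", argv[0]);
>                     return 1;
>             }
>
>             ret = getxattr(argv[1], "trusted.bit-rot.bad-file",
>                            buf, sizeof(buf));
>             if (ret >= 0)
>                     printf("%s is marked bad\n", argv[1]);
>             else if (errno == ENODATA)
>                     printf("%s is not marked bad\n", argv[1]);
>             else
>                     perror("getxattr");
>             return 0;
>     }
>
> Compile with gcc and point it at a file inside the brick's backend
> directory.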
> The above method has a problem. The bit-rot-stub xlator, sitting in the
> brick graph, remembers an object as bad in its inode context (either when
> the object is marked bad by the scrubber, or during the first lookup of
> the object if it was already marked bad). Bit-rot-stub uses that
> information to block any read/write operations on such bad objects. As a
> result it also blocks every operation attempted by self-heal to correct
> the object: the object was deleted directly in the backend, but the
> in-memory inode is still present and considered valid.
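> To make the blocking behaviour concrete, here is a toy standalone C
> sketch (illustrative structures only, not the actual bit-rot-stub code):
> once the bad flag is set in the in-memory inode context, every read/write
> on that inode is refused, no matter who sends it. That is why deleting
> the backend file alone does not help; the in-memory inode, and the flag
> with it, survive.
>
>     #include <stdio.h>
>     #include <stdbool.h>
>     #include <errno.h>
>
>     /* Toy stand-in for the per-inode context kept by bit-rot-stub. */
>     struct inode_ctx {
>             bool bad_object; /* set on scrubber mark or first lookup */
>     };
>
>     /* Gate applied on the read/write path: as long as the in-memory
>      * inode survives, the flag survives with it, even if the backend
>      * file was deleted by hand. */
>     static int readv_writev_check(struct inode_ctx *ctx)
>     {
>             if (ctx->bad_object)
>                     return -EIO; /* blocks everyone, self-heal included */
>             return 0;
>     }
>
>     int main(void)
>     {
>             struct inode_ctx ctx = { .bad_object = true };
>             printf("write allowed? %s\n",
>                    readv_writev_check(&ctx) == 0 ? "yes" : "no");
>             return 0;
>     }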
> There are two methods that I think can solve the issue.
> 1) In server_lookup_cbk, if the lookup of an object fails with ENOENT
> *AND* the lookup is a revalidate lookup, then forget the inode associated
> with that object (i.e. not just unlink the dentry, but forget the inode
> as well, iff there are no more dentries associated with the inode). At
> least this way the inode would be forgotten, and later, when self-heal
> wants to correct the object, it has to create a new object (the object
> was removed directly from the backend). That creation comes with a new
> in-memory inode, so read/write operations by the self-heal daemon will
> not be blocked. (A toy sketch follows the patch reference below.)
> I have sent a patch for review for the above method:
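> The promised sketch of method 1 (names are illustrative, not the real
> protocol/server symbols): on a revalidate lookup that fails with ENOENT,
> drop the dentry and, iff no other dentries reference the inode, forget
> the inode itself so that a later heal builds a fresh one.
>
>     #include <stdbool.h>
>     #include <errno.h>
>
>     struct inode {
>             int  dentry_count;
>             bool forgotten;
>     };
>
>     static void inode_unlink_dentry(struct inode *in)
>     {
>             if (in->dentry_count > 0)
>                     in->dentry_count--;
>     }
>
>     static void inode_forget(struct inode *in)
>     {
>             in->forgotten = true; /* next lookup builds a new inode */
>     }
>
>     /* Toy model of the server_lookup_cbk change described above. */
>     static void lookup_cbk(struct inode *in, int op_errno, bool revalidate)
>     {
>             if (op_errno == ENOENT && revalidate) {
>                     inode_unlink_dentry(in);
>                     if (in->dentry_count == 0)
>                             inode_forget(in);
>             }
>     }
>
>     int main(void)
>     {
>             struct inode in = { .dentry_count = 1, .forgotten = false };
>             lookup_cbk(&in, ENOENT, true);
>             return in.forgotten ? 0 : 1;
>     }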
> 2) Do not block write operations on the bad object if the operation comes
> from self-heal; allow it to completely heal the file, and once healing is
> done, remove the bad-object information from the inode context.
> Requests coming from the self-heal daemon can be identified by checking
> their pid (the daemon uses a negative pid). But when self-heal happens
> from the glusterfs client itself, I am not sure whether it runs with a
> negative pid for the frame, or with the same pid as the frame of the
> original fop which triggered the self-heal. Pranith, can you clarify
> this? (A toy sketch of this approach follows below.)
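> For completeness, a toy sketch of method 2, under the assumption that a
> negative frame pid reliably identifies self-heal traffic (which is
> exactly the open question above):
>
>     #include <stdio.h>
>     #include <stdbool.h>
>     #include <errno.h>
>
>     struct inode_ctx {
>             bool bad_object;
>     };
>
>     /* Let a negative-pid frame through the bad-object gate;
>      * keep blocking normal clients. */
>     static int writev_check(struct inode_ctx *ctx, int frame_pid)
>     {
>             if (ctx->bad_object && frame_pid >= 0)
>                     return -EIO; /* still blocked for regular clients */
>             return 0;            /* self-heal (negative pid) allowed */
>     }
>
>     static void heal_done(struct inode_ctx *ctx)
>     {
>             ctx->bad_object = false; /* drop bad-object info from ctx */
>     }
>
>     int main(void)
>     {
>             struct inode_ctx ctx = { .bad_object = true };
>             /* -6 is just an example negative pid for the shd here. */
>             printf("shd write: %d, client write: %d\n",
>                    writev_check(&ctx, -6), writev_check(&ctx, 1234));
>             heal_done(&ctx);
>             printf("client write after heal: %d\n",
>                    writev_check(&ctx, 1234));
>             return 0;
>     }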
> Please provide feedback.
> Raghavendra Bhat