[Gluster-devel] BitRot notes
Vijay Bellur
vbellur at redhat.com
Fri Nov 28 16:30:13 UTC 2014
On 11/28/2014 08:30 AM, Venky Shankar wrote:
> [snip]
>>
>> 1. Can the bitd be one per node like self-heal-daemon and other "global"
>> services? I worry about creating 2 * N processes for N bricks in a node.
>> Maybe we can consider having one thread per volume/brick etc. in a single
>> bitd process to make it perform better.
>
> Absolutely.
> There would be one bitrot daemon per node, per volume.
>
Do you foresee any problems in having one daemon per node for all volumes?
>
>>
>> 3. I think the algorithm for checksum computation can vary within the
>> volume. I see a reference to "Hashtype is persisted along side the checksum
>> and can be tuned per file type." Is this correct? If so:
>>
>> a) How will the policy be exposed to the user?
>
> Bitrot daemon would have a configuration file that can be configured
> via Gluster CLI. Tuning hash types could be based on file types or
> file name patterns (regexes) [which is a bit tricky as bitrot would
> work on GFIDs rather than filenames, but this can be solved by a level
> of indirection].
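A rough sketch of what pattern-based hash-type tuning with a GFID-to-path level of indirection might look like (all names, patterns, and the example GFID here are illustrative, not actual bitd code):

```python
# Hypothetical sketch: hash types tuned by filename pattern, with a
# GFID -> path lookup providing the level of indirection mentioned above.
import fnmatch

# Illustrative policy: filename patterns mapped to hash types; first match wins.
HASH_POLICY = [
    ("*.iso", "sha256"),   # e.g. archival images get a stronger hash
    ("*", "sha1"),         # default hash type
]

# bitd works on GFIDs, so a lookup table maps a GFID back to a path
# before the filename-pattern policy can be applied.
GFID_TO_PATH = {
    "0d2d1a8e-example-gfid": "/bricks/b1/images/distro.iso",
}

def hash_type_for(gfid, gfid_to_path=GFID_TO_PATH, policy=HASH_POLICY):
    """Resolve a GFID to a filename, then pick its hash type by pattern."""
    name = gfid_to_path.get(gfid, "").rsplit("/", 1)[-1]
    for pattern, hash_type in policy:
        if fnmatch.fnmatch(name, pattern):
            return hash_type
    return "sha1"
```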
>
>>
>> b) It would be nice to have the algorithm for computing checksums be
>> pluggable. Are there any thoughts on pluggability?
>
> Do you mean the default hash algorithm be configurable? If yes, then
> that's planned.
Sounds good.
>
>>
>> c) What are the steps involved in changing the hashtype/algorithm for a
>> file?
>
> Policy changes for file {types, patterns} are lazy, i.e., they take
> effect during the next recompute. For objects that are never modified
> (after initial checksum compute), scrubbing can recompute the checksum
> using the new hash _after_ verifying the integrity of a file with the
> old hash.
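The verify-then-rehash step described above could look roughly like this (a minimal sketch; the function name and return convention are assumptions, not the actual implementation):

```python
import hashlib

def scrub_and_rehash(data, stored_checksum, old_algo, new_algo):
    """Sketch of the lazy rehash step: verify the object's integrity with
    the old hash first; only on success recompute with the new hash.
    Returns the new checksum, or None when verification fails."""
    if hashlib.new(old_algo, data).hexdigest() != stored_checksum:
        return None            # rot detected; do not switch hashes
    return hashlib.new(new_algo, data).hexdigest()
```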
>
>>
>> 4. Is the fop on which change detection gets triggered configurable?
>
> As of now all data modification fops trigger checksum calculation.
>
Wish I had been clearer about this in my OP. Is the fop on which
checksum verification/bitrot detection happens configurable? The
feature page mentions "open" as a trigger point for this. Users might
want to trigger detection on a "read" operation rather than on open.
It would be good to provide this flexibility.
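The flexibility asked for above could be as simple as a configurable set of trigger fops (the option shape and fop names here are illustrative; only "open" as the default trigger comes from the feature page):

```python
# Hypothetical sketch of a configurable detection trigger.
DEFAULT_TRIGGER_FOPS = {"open"}   # feature page: "open" triggers verification

def should_verify(fop, trigger_fops=DEFAULT_TRIGGER_FOPS):
    """Return True when this fop should trigger checksum verification."""
    return fop in trigger_fops

# A deployment preferring detection on reads instead of opens might use:
read_triggers = {"readv"}
```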
>
>>
>> 6. Any thoughts on integrating the bitrot repair framework with self-heal?
>
> There are some thoughts on integration with self-heal daemon and EC.
> I'm coming up with a doc which covers those [reason for delay in
> replying to your questions ;)]. Expect the doc in gluster-devel@
> soon.
Will look forward to this.
>
>>
>> 7. How does detection figure out that a lazy update is still pending,
>> and avoid raising a false positive?
>
> That's one of the things that Rachana and I discussed yesterday.
> Should scrubbing *wait* while checksum updating is in progress, or
> is it expected that scrubbing happens only when there are no active
> I/O operations on the volume (both of which imply that the bitrot
> daemon needs to know when it has done its job)?
>
> If both scrub and checksum updating run in parallel, then there needs
> to be a way to synchronize those operations. Maybe compute the
> checksum at a priority provided by the scrub process as a hint (though
> that still leaves a small window for rot)?
>
> Any thoughts?
Waiting for no active I/O in the volume might be a difficult condition
to reach in some deployments.
Some form of waiting is necessary to prevent false positives. One
possibility might be to mark an object as dirty until its checksum
update is complete. Verification/scrub can then be skipped for dirty
objects.
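The dirty-flag idea sketched out (class and method names are made up for illustration; this is not a proposal for actual bitd internals):

```python
class ObjectState:
    """Hypothetical dirty-flag scheme: an object is marked dirty when a
    modification lands and cleared only after the lazy checksum update
    completes, so the scrubber never verifies against a stale checksum."""
    def __init__(self):
        self.dirty = False
        self.checksum = None

    def on_write(self):
        self.dirty = True            # lazy checksum update still pending

    def on_checksum_updated(self, checksum):
        self.checksum = checksum
        self.dirty = False

def scrub(obj, current_checksum):
    """Skip dirty objects (no verdict); otherwise compare checksums."""
    if obj.dirty:
        return "skipped"             # update pending; avoid false positive
    return "ok" if current_checksum == obj.checksum else "rotten"
```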
-Vijay