[Gluster-devel] BitRot notes

Venky Shankar yknev.shankar at gmail.com
Fri Nov 28 03:00:47 UTC 2014


[snip]
>
> 1. Can the bitd be one per node like self-heal-daemon and other "global"
> services? I worry about creating 2 * N processes for N bricks in a node.
> Maybe we can consider having one thread per volume/brick etc. in a single
> bitd process to make it perform better.

Absolutely.
There would be one bitrot daemon per node, per volume.

>
> 2. It would be good to consider throttling for filesystem scan and update of
> checksums. That way we can avoid overwhelming the system after enabling
> bitrot on pre-created data.

Makes sense. A filesystem scan based on xtime is planned to be
integrated into libgfchangelog and exposed via an API. Throttling
would be one of the tunables to control scan speed.
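
Something along these lines, perhaps (the names and throttle knobs
below are placeholders of mine, not the final libgfchangelog API):

#include <stdint.h>

/* per-entry callback, invoked for every object whose xtime says it
 * changed since the last scan */
typedef int (*gf_scan_cbk_t) (const char *gfid, void *data);

struct gf_scan_opts {
        uint32_t max_entries_per_sec;   /* throttle: 0 => unthrottled */
        uint32_t max_bytes_per_sec;     /* throttle for checksum I/O  */
};

/* walk the brick backend using xtime and invoke @cbk for each changed
 * object, honouring the throttle limits in @opts */
int gf_changelog_scan_xtime (const char *brick_path,
                             struct gf_scan_opts *opts,
                             gf_scan_cbk_t cbk, void *data);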

>
> 3. I think the algorithm for checksum computation can vary within the
> volume. I see a reference to "Hashtype is persisted along side the checksum
> and can be tuned per file type." Is this correct? If so:
>
> a) How will the policy be exposed to the user?

The bitrot daemon would have a configuration file that can be managed
via the Gluster CLI. Tuning hash types could be based on file types or
filename patterns (regexes) [which is a bit tricky as bitrot works on
GFIDs rather than filenames, but this can be solved by a level of
indirection].
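
To give an idea of the kind of policy I have in mind (illustrative
only -- the hash types and config format below are not decided):

#include <stddef.h>

enum br_hash_type {
        BR_HASH_SHA256 = 0,     /* assumed default */
        BR_HASH_SHA1,
        BR_HASH_MD5,
};

struct br_hash_policy {
        const char        *pattern;     /* filename regex */
        enum br_hash_type  hashtype;
};

/* e.g. stronger hash for archives, cheaper hash for logs; since bitd
 * works on GFIDs, matching a pattern needs a GFID -> path lookup (the
 * "level of indirection" mentioned above) */
static struct br_hash_policy policies[] = {
        { ".*\\.tar\\.gz$", BR_HASH_SHA256 },
        { ".*\\.log$",      BR_HASH_SHA1   },
        { NULL,             BR_HASH_SHA256 },   /* default */
};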

>
> b) It would be nice to have the algorithm for computing checksums be
> pluggable. Are there any thoughts on pluggability?

Do you mean the default hash algorithm should be configurable? If so,
that's planned.

>
> c) What are the steps involved in changing the hashtype/algorithm for a
> file?

Policy changes for file {types, patterns} are lazy, i.e., they take
effect during the next recompute. For objects that are never modified
(after the initial checksum is computed), scrubbing can recompute the
checksum using the new hash _after_ verifying the integrity of the
file with the old hash.
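
Roughly, the scrub-time re-hash would look like this (every name
below is hypothetical; it's only meant to show the ordering -- verify
with the old hash first, then persist with the new one):

#include <stddef.h>
#include <string.h>

typedef struct object object_t;                      /* opaque brick object */
typedef struct { unsigned char d[64]; size_t len; } checksum_t;

int br_csum_get (object_t *obj, int *hashtype, checksum_t *csum);
int br_csum_set (object_t *obj, int hashtype, const checksum_t *csum);
int br_compute  (object_t *obj, int hashtype, checksum_t *csum);
int br_mark_corrupted (object_t *obj);

int
br_scrub_rehash (object_t *obj, int new_hashtype)
{
        int        old_hashtype = 0;
        checksum_t stored, fresh;

        if (br_csum_get (obj, &old_hashtype, &stored) != 0)
                return -1;

        /* 1. verify integrity with the hashtype persisted on disk */
        if (br_compute (obj, old_hashtype, &fresh) != 0)
                return -1;
        if (fresh.len != stored.len || memcmp (fresh.d, stored.d, fresh.len))
                return br_mark_corrupted (obj);      /* rot detected */

        /* 2. object is intact: recompute and persist with the new hash */
        if (old_hashtype != new_hashtype) {
                if (br_compute (obj, new_hashtype, &fresh) != 0)
                        return -1;
                return br_csum_set (obj, new_hashtype, &fresh);
        }
        return 0;
}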

>
> 4. Is the fop on which change detection gets triggered configurable?

As of now all data modification fops trigger checksum calculation.
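
For illustration, the trigger would be a check along these lines
(the exact fop list is an assumption on my part):

#include "glusterfs.h"          /* glusterfs_fop_t, GF_FOP_* */

static inline int
br_is_data_modifying_fop (glusterfs_fop_t fop)
{
        switch (fop) {
        case GF_FOP_WRITE:
        case GF_FOP_TRUNCATE:
        case GF_FOP_FTRUNCATE:
        case GF_FOP_FALLOCATE:
        case GF_FOP_DISCARD:
        case GF_FOP_ZEROFILL:
                return 1;       /* mark the object for checksum recompute */
        default:
                return 0;
        }
}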

>
> 5. It would be good to have the store & retrieval of checksums modular so
> that we can choose an alternate backend in the future (apart from extended
> attributes) if necessary.

Yes. That too would be pluggable, with an xattr-based store as the
default. The store/retrieve APIs would be generic enough to allow
pluggability.
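
Something like an ops table, for instance (a sketch of what
"pluggable" could mean here, not an agreed-upon interface):

#include <stddef.h>

typedef struct object object_t;                      /* opaque brick object */
typedef struct { unsigned char d[64]; size_t len; } checksum_t;

struct br_csum_store_ops {
        int (*put) (object_t *obj, int hashtype, const checksum_t *csum);
        int (*get) (object_t *obj, int *hashtype, checksum_t *csum);
        int (*del) (object_t *obj);
};

/* default backend: persist <hashtype, checksum> as an extended
 * attribute on the object */
extern struct br_csum_store_ops br_xattr_store_ops;

/* an alternate backend (a key-value DB, say) only has to supply
 * another br_csum_store_ops instance; bitd always goes through the
 * ops table and never touches the backend directly */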

>
> 6. Any thoughts on integrating the bitrot repair framework with self-heal?

There are some thoughts on integration with the self-heal daemon and
EC. I'm coming up with a doc which covers those [the reason for the
delay in replying to your questions ;)]. Expect the doc on
gluster-devel@ soon.

>
> 7. How does detection figure out that lazy updation is still pending and not
> raise a false positive?

That's one of the things Rachana and I discussed yesterday. Should
scrubbing *wait* while checksum updating is still in progress, or is
scrubbing expected to happen only when there are no active I/O
operations on the volume? (Both of which imply that the bitrot daemon
needs to know when it's done its job.)

If scrubbing and checksum updating run in parallel, then there needs
to be a way to synchronize the two. Maybe compute the checksum on
priority, with the priority provided by the scrub process as a hint
(that still leaves a small window for rot, though)?
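
To illustrate the hint idea (names are made up, this is just the
shape of it):

typedef struct object object_t;         /* opaque brick object */

int br_csum_pending    (object_t *obj); /* lazy updation still queued?    */
int br_csum_prioritize (object_t *obj); /* hint: compute this one next    */
int br_csum_wait       (object_t *obj); /* block until the checksum lands */
int br_verify          (object_t *obj); /* compare on-disk data vs csum   */

int
br_scrub_object (object_t *obj)
{
        if (br_csum_pending (obj)) {
                /* don't flag lazy updation as corruption: ask the
                 * updater to do this object first, then wait */
                br_csum_prioritize (obj);
                br_csum_wait (obj);     /* the small window for rot */
        }
        return br_verify (obj);
}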

Any thoughts?

>
> Regards,
> Vijay

