[Gluster-devel] BitRot notes
Vijay Bellur
vbellur at redhat.com
Mon Nov 24 10:49:23 UTC 2014
On 10/31/2014 04:09 PM, Venky Shankar wrote:
> Hey folks,
>
> Raghavendra (@rabhat) and I have been discussing BitRot[1] and have
> come up with a list of high level tasks (breakup items) captured
> here[2]. The pad will be updated on an ongoing basis reflecting the
> current status/items that are being worked on. As always, contributions
> in any form (design, code, docs, etc.) are more than welcome (just make
> sure you're heard on the pad/email :))
>
> [1]:
> http://www.gluster.org/community/documentation/index.php/Features/BitRot
> [2]: https://public.pad.fsfe.org/p/glusterfs-bitrot-notes
>
Thanks for this, Venky. This looks like a good start. A few questions
and thoughts:
1. Can there be one bitd per node, like the self-heal daemon and other
"global" services? I worry about creating 2 * N processes for N bricks on
a node. Maybe we can consider having one thread per volume or brick in a
single bitd process to keep the overhead down.
2. It would be good to consider throttling the filesystem scan and the
checksum updates. That way we can avoid overwhelming the system after
enabling bitrot on pre-existing data.
3. I think the algorithm for checksum computation can vary within the
volume. I see a reference to "Hashtype is persisted along side the
checksum and can be tuned per file type." Is this correct? If so:
a) How will the policy be exposed to the user?
b) It would be nice to have the algorithm for computing checksums be
pluggable. Are there any thoughts on pluggability?
c) What are the steps involved in changing the hashtype/algorithm for a
file?
4. Is the fop on which change detection gets triggered configurable?
5. It would be good to have the store & retrieval of checksums modular
so that we can choose an alternate backend in the future (apart from
extended attributes) if necessary.
6. Any thoughts on integrating the bitrot repair framework with self-heal?
7. How does detection figure out that a lazy checksum update is still
pending, so that it does not raise a false positive?
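
To make points 3 and 5 a bit more concrete, here is a rough Python sketch
of what "pluggable" checksums and a "modular" checksum store could look
like. It is illustration only, not GlusterFS code: HASHERS,
register_hasher, ChecksumStore, InMemoryStore, sign and verify are all
hypothetical names, and the in-memory dict merely stands in for extended
attributes or any other backend.

```python
import hashlib
from abc import ABC, abstractmethod
from typing import Callable, Dict, Tuple

# Hypothetical registry: hashtype name -> function returning a hex digest.
HASHERS: Dict[str, Callable[[bytes], str]] = {}


def register_hasher(name: str, fn: Callable[[bytes], str]) -> None:
    """Plug in a new checksum algorithm under the given hashtype name."""
    HASHERS[name] = fn


register_hasher("sha256", lambda data: hashlib.sha256(data).hexdigest())
register_hasher("sha512", lambda data: hashlib.sha512(data).hexdigest())


class ChecksumStore(ABC):
    """Hypothetical storage interface; xattrs would be one implementation."""

    @abstractmethod
    def put(self, path: str, hashtype: str, digest: str) -> None: ...

    @abstractmethod
    def get(self, path: str) -> Tuple[str, str]: ...


class InMemoryStore(ChecksumStore):
    """Stand-in backend that keeps (hashtype, digest) per path in a dict."""

    def __init__(self) -> None:
        self._records: Dict[str, Tuple[str, str]] = {}

    def put(self, path: str, hashtype: str, digest: str) -> None:
        self._records[path] = (hashtype, digest)

    def get(self, path: str) -> Tuple[str, str]:
        return self._records[path]


def sign(store: ChecksumStore, path: str, data: bytes,
         hashtype: str = "sha256") -> None:
    """Compute the checksum and persist the hashtype alongside it."""
    store.put(path, hashtype, HASHERS[hashtype](data))


def verify(store: ChecksumStore, path: str, data: bytes) -> bool:
    """Recompute with the persisted hashtype and compare digests."""
    hashtype, digest = store.get(path)
    return HASHERS[hashtype](data) == digest
```

In this sketch, changing the hashtype for a file (3c) is just a re-sign
with a different name, e.g. sign(store, path, data, "sha512"). Because the
hashtype is stored next to the digest, verify() keeps working for files
signed with either algorithm.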
Regards,
Vijay