[Gluster-devel] Discussion: Implications on geo-replication due to bitrot and tiering changes!!!

Venky Shankar yknev.shankar at gmail.com
Thu Dec 4 15:23:43 UTC 2014


[Adding Dan/Josef/Vijay]

As of now, "rollover-time" is global to the changelog translator, hence
tuning it would affect all consumers subscribing to updates. It's 15
seconds by default and has proved to provide a good balance between
replication performance (geo-rep) and IOPS rate. Tuning it to a lower
value would imply doing a round of perf tests for geo-rep to be safe.

The question is whether data tiering can compromise on data freshness.
If yes, is there a hard limit? For BitRot, it should be OK, as the
policy for checksum calculation is lazy; adding a bit more lag would
not hurt much.
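
To put rough numbers on the freshness question, here is a minimal
Python sketch. It assumes a consumer only sees a fop once the changelog
holding it has been rolled over, and that rollovers happen at exact
multiples of rollover-time. The function names are made up for
illustration; 15 is the default mentioned above, 86400 stands for "one
changelog per day", and 5 is just an arbitrary lower value.

```python
def notification_lag(fop_time, rollover_time):
    """Seconds from a fop being recorded until the next rollover.

    Assumption: rollovers happen at multiples of rollover_time and a
    consumer is only notified at rollover. Times are integer seconds.
    """
    next_rollover = (fop_time // rollover_time + 1) * rollover_time
    return next_rollover - fop_time


def worst_case_lag(rollover_time):
    """Largest lag over every second within one rollover interval."""
    return max(notification_lag(t, rollover_time)
               for t in range(rollover_time))


if __name__ == "__main__":
    # 15 s default, a hypothetical lower value, and one changelog per day
    for rollover in (15, 5, 86400):
        print(rollover, worst_case_lag(rollover))
```

Under these assumptions the worst-case lag is simply the rollover
interval itself, so every consumer inherits the same bound as long as
the setting stays global.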

Josef,

Could you share the performance numbers along with the setup
(configuration, etc.) you used to measure SQLite performance inline in
the data path?

-Venky

On Thu, Dec 4, 2014 at 3:23 PM, Kotresh Hiremath Ravishankar
<khiremat at redhat.com> wrote:
> Hi,
>
> As of now, geo-replication is the only consumer of the changelog.
> Going forward, bitrot and tiering will also join as consumers.
> The current format of the changelog can be found at the links below.
>
> http://www.gluster.org/community/documentation/index.php/Arch/Change_Logging_Translator_Design
> https://github.com/gluster/glusterfs/blob/master/doc/features/geo-replication/libgfchangelog.md
>
>
> Current Design:
>
> 1. Every changelog.rollover-time secs (configurable), a new changelog file is generated:
>
> 2. Geo-replication history API, designed as part of the Snapshot requirement, maintains
>    an HTIME file listing the generated changelog filenames. It is guaranteed that there is
>    no break between the changelogs within one HTIME file, i.e., changelog is not
>    enabled/disabled in between.
>
> Proposed changes for changelog as part of bitrot and tiering:
>
> 1. Add timestamp for each fop record in changelog.
>
>    Rationale             : Tiering requires timestamp of each fop.
>    Implication on Geo-rep: NO
>
>
> 2. Make one big changelog per day or so and do not rollover the changelog every rollover-time.
>
>    Rationale: Changing changelog.rollover-time will affect all three consumers, hence
>               decoupling is required.
>
>                 Geo-replication: Is fine with changing rollover time.
>                 Tiering        : Not fine as per the input I got from Joseph (Joseph, please comment),
>                                  as this adds to the delay before tiering gets the change
>                                  notification from changelog.
>                 Bitrot         : It should be fine. (Venky, please comment).
>
>    Implications on current Geo-replication Design:
>
>              1. Breaks History API: Needs redesign.
>              2. Changes to geo-replication changelog consumption logic ??
>              3. libgfchangelog API changes.
>              4. Effort to handle upgrade scenarios.
>
> Bitrot and tiering folks, please add any other expected changes that I have missed.
>
> Point to discuss: considering the implications on geo-replication, are there any other
> approaches that solve this problem without much impact on the current
> geo-replication logic?
>
>
> Thanks and Regards,
> Kotresh H R

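For reference, the "Current Design" in the quoted message (a changelog
per rollover, plus an HTIME file listing the changelog filenames) can be
modelled roughly as below. This is only a toy sketch and not the
libgfchangelog API: the helper names are hypothetical, the
CHANGELOG.<timestamp> naming is an assumption, and the lookup simply
selects changelogs whose timestamp falls inside the requested window.

```python
from bisect import bisect_left, bisect_right


def changelog_name(ts):
    # Assumption: each changelog is named after its rollover timestamp.
    return f"CHANGELOG.{ts}"


def build_htime_index(start, end, rollover_time):
    """Toy HTIME index: one timestamp per rollover in (start, end].

    Assumption: changelog stays enabled for the whole period, so the
    index has no gaps.
    """
    return list(range(start + rollover_time, end + 1, rollover_time))


def history(index, start, end):
    """Names of changelogs whose timestamp lies in [start, end].

    index must be sorted in ascending order.
    """
    lo = bisect_left(index, start)
    hi = bisect_right(index, end)
    return [changelog_name(ts) for ts in index[lo:hi]]


if __name__ == "__main__":
    index = build_htime_index(0, 60, 15)
    print(index)                   # timestamps of the rolled-over changelogs
    print(history(index, 20, 50))  # changelogs inside the window
```

In this toy model a 15-second rollover lets a lookup narrow a time window
down to a few small files. With one changelog per day the index would
hold a single entry per day, so a file-level lookup like this could no
longer resolve anything finer than a day.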