[Gluster-devel] Discussion: Implications on geo-replication due to bitrot and tiering changes!!!!
Dan Lambright
dlambrig at redhat.com
Thu Dec 11 10:43:24 UTC 2014
Looks good to me.
Thank you
----- Original Message -----
> From: "Kotresh Hiremath Ravishankar" <khiremat at redhat.com>
> To: "Venky Shankar" <vshankar at redhat.com>
> Cc: "Joseph Fernandes" <josferna at redhat.com>, "Gluster Devel" <gluster-devel at gluster.org>, "Vijay Bellur"
> <vbellur at redhat.com>, "Dan Lambright" <dlambrig at redhat.com>, "Nagaprasad Sathyanarayana" <nsathyan at redhat.com>,
> "Vivek Agarwal" <vagarwal at redhat.com>
> Sent: Thursday, December 11, 2014 3:44:18 PM
> Subject: Re: [Gluster-devel] Discussion: Implications on geo-replication due to bitrot and tiering changes!!!!
>
> Hi All,
>
> Following the discussions among the Data Tiering, BitRot and Geo-Rep teams,
> the following points were discussed.
>
> 1. For Data Tiering to use the changelog, an in-memory LRU/LFU implementation
> is required to capture reads, since the changelog journal doesn't record
> them. Given the commitments each team already has, it might not be possible
> to implement the in-memory LRU/LFU within the 3.7 timeline.
> As per current testing done by the Tiering team, feeding the database in
> the I/O path causes no noticeable performance hit, as crash consistency is
> not expected. Hence for 3.7, the logic for feeding the database will be in
> the changelog translator or a new crt translator. When the LRU/LFU
> implementation is available down the line, the database can be fed from
> the LRU/LFU in the changelog translator alone.
>
> 2. Since BitRot or any other consumer might decide to use the database, the
> query and initialization APIs are exposed as a library.
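
To illustrate the longer-term plan in point 1 (feed the database from an in-memory LRU instead of from the I/O path), here is a minimal Python sketch. It assumes a SQLite data store, since SQLite is what the thread mentions measuring. The class name ExpiryFedCache, the "heat" table and its columns are hypothetical and are not taken from the GlusterFS code base; the real translator would be written in C and would track more than a hit count.

```python
import sqlite3
from collections import OrderedDict


class ExpiryFedCache:
    """Hypothetical in-memory LRU of per-file access records.

    A record reaches the data store only when it leaves the cache
    (eviction or explicit flush), not on every read or write.
    """

    def __init__(self, db, capacity):
        self.db = db
        self.capacity = capacity
        # gfid -> (hits since last feed, last access time); oldest entry first
        self.entries = OrderedDict()
        db.execute(
            "CREATE TABLE IF NOT EXISTS heat ("
            "gfid TEXT PRIMARY KEY, hits INTEGER, last_access REAL)"
        )

    def record_access(self, gfid, now):
        """Note one access (read or write) to the file identified by gfid."""
        hits, _ = self.entries.pop(gfid, (0, 0.0))
        self.entries[gfid] = (hits + 1, now)  # re-insert as most recently used
        while len(self.entries) > self.capacity:
            old_gfid, (old_hits, old_time) = self.entries.popitem(last=False)
            self._feed(old_gfid, old_hits, old_time)

    def _feed(self, gfid, hits, last_access):
        """Merge one expired cache entry into the data store."""
        row = self.db.execute(
            "SELECT hits FROM heat WHERE gfid = ?", (gfid,)
        ).fetchone()
        total = hits + (row[0] if row else 0)
        self.db.execute(
            "INSERT OR REPLACE INTO heat (gfid, hits, last_access) "
            "VALUES (?, ?, ?)",
            (gfid, total, last_access),
        )

    def flush(self):
        """Feed every remaining entry, e.g. on shutdown."""
        while self.entries:
            gfid, (hits, last_access) = self.entries.popitem(last=False)
            self._feed(gfid, hits, last_access)
        self.db.commit()
```

With a capacity of 2, accessing files a, b, a, c in that order evicts only b, so the store sees a single write for four accesses. This batching is the reason the thread expects the cache to help recording performance; the cost is that anything still in the cache is lost on a crash, which matches the "crash consistency is not expected" assumption above.
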
>
> Please add anything I have missed, or any corrections.
>
> Thanks and Regards,
> Kotresh H R
>
> ----- Original Message -----
> From: "Venky Shankar" <yknev.shankar at gmail.com>
> To: "Joseph Fernandes" <josferna at redhat.com>
> Cc: "Gluster Devel" <gluster-devel at gluster.org>, "Kotresh Hiremath
> Ravishankar" <khiremat at redhat.com>, "Vijay Bellur" <vbellur at redhat.com>,
> "Dan Lambright" <dlambrig at redhat.com>, "Ben England" <bengland at redhat.com>,
> "Ric Wheeler" <rwheeler at redhat.com>, "Nagaprasad Sathyanarayana"
> <nsathyan at redhat.com>, "Vivek Agarwal" <vagarwal at redhat.com>
> Sent: Saturday, December 6, 2014 1:53:16 PM
> Subject: Re: [Gluster-devel] Discussion: Implications on geo-replication due
> to bitrot and tiering changes!!!!
>
> [snip]
> >
> > Well, if you recall the multiple internal discussions we had, we agreed
> > on this a long time ago, from the beginning (though not recorded).
>
> Agreed. In that case changelog changes to feed an alternate data store
> is unneeded, correct?
>
> > and as a result of the discussion we have the Approach for the
> > infra-structure https://gist.github.com/vshankar/346843ea529f3af35339
> > AFAIK, though the doc doesn't speak of the above in detail, it was always
> > the plan to do it as above.
>
> Absolutely, the document tries to solve things in a more generic way
> and does not cover data store feeding from the cache. Thinking about
> it more, feeding the data store at the time of cache expiry looks like
> a neat approach.
>
> > The use of the LRU/LFU is definitely the way to go, both with or without
> > changelog recording, as it boosts the performance for recording.
> > And the mention of this is in
> > https://gist.github.com/vshankar/346843ea529f3af35339 at the end. Well you
> > know the best as you are the author :)
> > (Kotresh and me contributed over discussions, though not recorded, thanks
> > for mentioning it in the gluster-devel mail :) )
>
> Correct me here: if data store is fed from cache (on expiry), is the
> alternate feed from changelog (either inline or asynchronous to the
> data path) needed?
>
> >
> > As I have mentioned, the development of feeding the DB in the I/O path is
> > still a work in progress. We (Dan & I) are making it more and more
> > performant. We are also taking guidance from Ben England on testing it in
> > parallel with development cycles so that we have the best approach &
> > implementation.
> > That is where we are getting the numbers from (this is recorded in mails; I
> > will forward them to you). Plus we have kept Vijay Bellur in sync with the
> > approach we are taking on a weekly basis ( though not recorded :) )
>
> That's nice. But, my previous comment is still a concern.
>
> >
> > On the point of the discussion not being recorded on gluster-devel, these
> > discussions happened more frequently and in a more ad hoc way. Well, you
> > know best, as you were part of all of them :).
>
> Hmmm, not all.
>
> >
> > As we move forward we will have more discussions internally for sure, and
> > let's make sure that they are recorded so that we don't keep running
> > around the same bush again and again ;).
> >
> > And thanks for all the help in the form of discussion/thoughts. Looking
> > forward to more as we go along.
>
> Anytime.
>
> >
> > ~Joe
> >
> >
> > Venky
> >
> >>
> >> ~ Joseph ( NOT Josef :) )
> >>
> >> ----- Original Message -----
> >> From: "Venky Shankar" <yknev.shankar at gmail.com>
> >> To: "Kotresh Hiremath Ravishankar" <khiremat at redhat.com>
> >> Cc: "Gluster Devel" <gluster-devel at gluster.org>, dlambrig at redhat.com,
> >> josferna at redhat.com, "Vijay Bellur" <vbellur at redhat.com>
> >> Sent: Thursday, December 4, 2014 8:53:43 PM
> >> Subject: Re: [Gluster-devel] Discussion: Implications on geo-replication
> >> due to bitrot and tiering changes!!!
> >>
> >> [Adding Dan/Josef/Vijay]
> >>
> >> As of now, "rollover-time" is global to changelog translator, hence
> >> tuning that would affect all consumers subscribing to updates. It's 15
> >> seconds by default and has proved to provide a good balance between
> >> replication performance (geo-rep) and IOPS rate. Tuning to a lower
> >> value would imply doing a round of perf test for geo-rep to be safe.
> >>
> >> The question is if data tiering can compromise on data freshness. If
> >> yes, is there a hard limit? For BitRot, it should be OK as the policy
> >> for checksum calculation is lazy. Adding a bit more lag would not hurt
> >> much.
> >>
> >> Josef,
> >>
> >> Could you share the performance numbers along with the setup
> >> (configuration, etc.) you used to measure SQLite performance inline to
> >> the data path?
> >>
> >> -Venky
> >>
> >> On Thu, Dec 4, 2014 at 3:23 PM, Kotresh Hiremath Ravishankar
> >> <khiremat at redhat.com> wrote:
> >>> Hi,
> >>>
> >>> As of now, geo-replication is the only consumer of the changelog.
> >>> Going forward bitrot and tiering also will join as consumers.
> >>> The current format of the changelog can be found at the links below.
> >>>
> >>> http://www.gluster.org/community/documentation/index.php/Arch/Change_Logging_Translator_Design
> >>> https://github.com/gluster/glusterfs/blob/master/doc/features/geo-replication/libgfchangelog.md
> >>>
> >>>
> >>> Current Design:
> >>>
> >>> 1. Every changelog.rollover-time secs (configurable), a new changelog
> >>> file is generated.
> >>>
> >>> 2. The geo-replication history API, designed as part of the Snapshot
> >>> requirement, maintains an HTIME file listing the changelog filenames
> >>> generated. It is guaranteed that there is no break between the
> >>> changelogs within one HTIME file, i.e., changelog is not
> >>> enabled/disabled in between.
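
As a rough illustration of the current design described above, the following Python sketch models rollover and the HTIME index in memory. It is a simplification under stated assumptions: the class name RollingChangelog is hypothetical, the CHANGELOG.<epoch> naming is assumed for the example, the HTIME file is modelled as a plain list, and rollover is checked when a record is appended rather than on a timer as a real translator would do.

```python
class RollingChangelog:
    """Hypothetical in-memory model of changelog rollover plus an HTIME index."""

    def __init__(self, rollover_time=15):
        self.rollover_time = rollover_time  # seconds; 15 is the default cited in this thread
        self.current_start = None           # start time of the active changelog
        self.files = {}                     # changelog name -> list of records
        self.htime = []                     # ordered changelog names (the "HTIME file")

    def append(self, now, record):
        """Append a record, starting a new changelog if the interval elapsed."""
        if (self.current_start is None
                or now - self.current_start >= self.rollover_time):
            self.current_start = now
            name = f"CHANGELOG.{int(now)}"  # assumed naming scheme
            self.htime.append(name)
            self.files[name] = []
        self.files[self.htime[-1]].append(record)

    def history(self):
        """Return (name, records) pairs in generation order, as a consumer
        walking the HTIME index would see them."""
        return [(name, self.files[name]) for name in self.htime]
```

In this model a consumer only sees a record once its changelog has been rolled over and listed, which is why rollover-time bounds how fresh the data is for every consumer sharing the same changelog.
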
> >>>
> >>> Proposed changes for changelog as part of bitrot and tiering:
> >>>
> >>> 1. Add timestamp for each fop record in changelog.
> >>>
> >>> Rationale: Tiering requires timestamp of each fop.
> >>> Implication on Geo-rep: NO
> >>>
> >>>
> >>> 2. Make one big changelog per day or so and do not rollover the changelog
> >>> every rollover-time.
> >>>
> >>> Rationale: Changing changelog.rollover-time is gonna affect all the
> >>> three consumers, hence decoupling is required.
> >>>
> >>> Geo-replication: Is fine with changing rollover time.
> >>> Tiering: Not fine as per the input I got from Joseph (Joseph, please
> >>> comment), as this adds to the delay before tiering gets the change
> >>> notification from changelog.
> >>> Bitrot: It should be fine. (Venky, please comment).
> >>>
> >>> Implications on current Geo-replication Design:
> >>>
> >>> 1. Breaks History API: Needs redesign.
> >>> 2. Changes to geo-replication changelog consumption logic ??
> >>> 3. libgfchangelog API changes.
> >>> 4. Effort to handle upgrade scenarios.
> >>>
> >>> Bitrot and Tiering guys, Please add any more changes expected which I
> >>> have missed.
> >>>
> >>> Point to discuss, considering the implications on geo-replication, are
> >>> there any other
> >>> approaches with which we can solve this problem without much implication
> >>> to current
> >>> geo-replication logic??
> >>>
> >>>
> >>> Thanks and Regards,
> >>> Kotresh H R
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> Gluster-devel mailing list
> >>> Gluster-devel at gluster.org
> >>> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
>