[Gluster-devel] Discussion: Implications on geo-replication due to bitrot and tiering changes!!!!

Dan Lambright dlambrig at redhat.com
Thu Dec 11 10:43:24 UTC 2014


Looks good to me.
Thank you

----- Original Message -----
> From: "Kotresh Hiremath Ravishankar" <khiremat at redhat.com>
> To: "Venky Shankar" <vshankar at redhat.com>
> Cc: "Joseph Fernandes" <josferna at redhat.com>, "Gluster Devel" <gluster-devel at gluster.org>, "Vijay Bellur"
> <vbellur at redhat.com>, "Dan Lambright" <dlambrig at redhat.com>, "Nagaprasad Sathyanarayana" <nsathyan at redhat.com>,
> "Vivek Agarwal" <vagarwal at redhat.com>
> Sent: Thursday, December 11, 2014 3:44:18 PM
> Subject: Re: [Gluster-devel] Discussion: Implications on geo-replication due to bitrot and tiering changes!!!!
> 
> Hi All,
> 
> Following the discussions among the Data Tiering, BitRot and Geo-Rep teams,
> here is a summary of what was discussed.
> 
> 1. For Data Tiering to use the changelog, an in-memory LRU/LFU implementation
>    is required to capture reads, as the changelog journal doesn't capture
>    reads. But given the commitments each team already has, it might not be
>    possible to implement an in-memory LRU/LFU by the 3.7 timeline.
>    As per current testing done by the Tiering team, feeding the database in
>    the I/O path does not noticeably hurt performance, as crash consistency
>    is not expected. Hence for 3.7, the logic for feeding the database will be
>    in the changelog translator or a new crt translator. When the LRU/LFU
>    implementation is available down the line, the database can be fed from
>    the LRU/LFU in the changelog translator only.
> 
> 2. Since BitRot or any other consumer might decide to use the database, the
>    query and initialization APIs are exposed as a library.
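
The cache-then-database flow in point 1 could look roughly like the sketch below. It is only an illustration under stated assumptions, not the tiering team's implementation: the class `ExpiringAccessCache`, the `heat` table and its columns are hypothetical names, and Python's `sqlite3` module stands in for whatever data store is eventually used. Reads and writes are counted in an in-memory LRU, and a record reaches the database only when it is evicted or explicitly flushed.

```python
import sqlite3
from collections import OrderedDict


class ExpiringAccessCache:
    """Hypothetical in-memory LRU of per-file access counters.

    Entries are written to the database only when they leave the cache,
    so the I/O path itself never touches the database.
    """

    def __init__(self, db, capacity):
        self.db = db
        self.capacity = capacity
        # gfid -> [read_count, write_count, last_access]
        self.entries = OrderedDict()
        db.execute(
            "CREATE TABLE IF NOT EXISTS heat ("
            "gfid TEXT PRIMARY KEY, reads INTEGER, "
            "writes INTEGER, last_access REAL)"
        )

    def record(self, gfid, is_read, now):
        """Count one read or write and mark the file as most recently used."""
        entry = self.entries.pop(gfid, None) or [0, 0, now]
        entry[0 if is_read else 1] += 1
        entry[2] = now
        self.entries[gfid] = entry  # re-inserted at the most-recent end
        while len(self.entries) > self.capacity:
            old_gfid, old_entry = self.entries.popitem(last=False)
            self._feed(old_gfid, old_entry)

    def flush(self):
        """Push every cached entry to the database (e.g. on shutdown)."""
        while self.entries:
            gfid, entry = self.entries.popitem(last=False)
            self._feed(gfid, entry)

    def _feed(self, gfid, entry):
        # Add the cached counters to any row already stored for this file.
        reads, writes, last_access = entry
        cur = self.db.execute(
            "UPDATE heat SET reads = reads + ?, writes = writes + ?, "
            "last_access = ? WHERE gfid = ?",
            (reads, writes, last_access, gfid),
        )
        if cur.rowcount == 0:
            self.db.execute(
                "INSERT INTO heat VALUES (?, ?, ?, ?)",
                (gfid, reads, writes, last_access),
            )
```

In this sketch the database lags the cache by design: anything still cached is lost on a crash, which matches the note above that crash consistency is not expected.
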
> 
> Please add anything I have missed, or any corrections.
> 
> Thanks and Regards,
> Kotresh H R
> 
> ----- Original Message -----
> From: "Venky Shankar" <yknev.shankar at gmail.com>
> To: "Joseph Fernandes" <josferna at redhat.com>
> Cc: "Gluster Devel" <gluster-devel at gluster.org>, "Kotresh Hiremath
> Ravishankar" <khiremat at redhat.com>, "Vijay Bellur" <vbellur at redhat.com>,
> "Dan Lambright" <dlambrig at redhat.com>, "Ben England" <bengland at redhat.com>,
> "Ric Wheeler" <rwheeler at redhat.com>, "Nagaprasad Sathyanarayana"
> <nsathyan at redhat.com>, "Vivek Agarwal" <vagarwal at redhat.com>
> Sent: Saturday, December 6, 2014 1:53:16 PM
> Subject: Re: [Gluster-devel] Discussion: Implications on geo-replication due
> to bitrot and tiering changes!!!!
> 
> [snip]
> >
> > Well, if you recall the multiple internal discussions we had, we had
> > agreed upon this a long time ago, from the beginning (though not recorded).
> 
> Agreed. In that case changelog changes to feed an alternate data store
> is unneeded, correct?
> 
> > And as a result of those discussions we have the approach for the
> > infrastructure: https://gist.github.com/vshankar/346843ea529f3af35339
> > AFAIK, though the doc doesn't speak of the above in detail, it was always
> > the plan to do it as above.
> 
> Absolutely, the document tries to solve things in a more generic way
> and does not cover data store feeding from the cache. Thinking about
> it more, feeding the data store at the time of cache expiry looks like
> a neat approach.
> 
> > The use of the LRU/LFU is definitely the way to go, with or without
> > changelog recording, as it boosts the performance of recording.
> > And this is mentioned in
> > https://gist.github.com/vshankar/346843ea529f3af35339 at the end. Well, you
> > know best, as you are the author :)
> > (Kotresh and I contributed during the discussions, though that was not
> > recorded; thanks for mentioning it in the gluster-devel mail :) )
> 
> Correct me here: if data store is fed from cache (on expiry), is the
> alternate feed from changelog (either inline or asynchronous to the
> data path) needed?
> 
> >
> > As I have mentioned, the development of feeding the DB in the I/O path is
> > still a work in progress. We (Dan and I) are making it more and more
> > performant. We are
> > also taking guidance from Ben England on testing it in parallel with
> > development cycles so that we have the best approach and implementation.
> > That is where we are getting the numbers from (this is recorded in mails; I
> > will forward them to you). Plus we have kept Vijay Bellur in sync with the
> > approach we are taking on a weekly basis (though not recorded :) )
> 
> That's nice. But, my previous comment is still a concern.
> 
> >
> > On the point of the discussions not being recorded on gluster-devel, these
> > discussions happened more frequently and in a more ad hoc way. Well, you
> > know best, as you were part of all of them :).
> 
> Hmmm, not all.
> 
> >
> > As we move forward we will have more discussions internally for sure, and
> > let's make sure that they are recorded so that we don't keep running
> > around the same bush again and again ;).
> >
> > And thanks for all the help in the form of discussions/thoughts. Looking
> > forward to more as we go along.
> 
> Anytime.
> 
> >
> > ~Joe
> >
> >
> >     Venky
> >
> >>
> >> ~ Joseph ( NOT Josef :) )
> >>
> >> ----- Original Message -----
> >> From: "Venky Shankar" <yknev.shankar at gmail.com>
> >> To: "Kotresh Hiremath Ravishankar" <khiremat at redhat.com>
> >> Cc: "Gluster Devel" <gluster-devel at gluster.org>, dlambrig at redhat.com,
> >> josferna at redhat.com, "Vijay Bellur" <vbellur at redhat.com>
> >> Sent: Thursday, December 4, 2014 8:53:43 PM
> >> Subject: Re: [Gluster-devel] Discussion: Implications on geo-replication
> >> due to bitrot and tiering changes!!!
> >>
> >> [Adding Dan/Josef/Vijay]
> >>
> >> As of now, "rollover-time" is global to the changelog translator, hence
> >> tuning it would affect all consumers subscribing to updates. It's 15
> >> seconds by default and has proved to provide a good balance between
> >> replication performance (geo-rep) and IOPS rate. Tuning it to a lower
> >> value would imply doing a round of perf tests for geo-rep to be safe.
> >>
> >> The question is if data tiering can compromise on data freshness. If
> >> yes, is there a hard limit? For BitRot, it should be OK as the policy
> >> for checksum calculation is lazy. Adding a bit more lag would not hurt
> >> much.
> >>
> >> Josef,
> >>
> >> Could you share the performance numbers along with the setup
> >> (configuration, etc.) you used to measure SQLite performance inline to
> >> the data path?
> >>
> >> -Venky
> >>
> >> On Thu, Dec 4, 2014 at 3:23 PM, Kotresh Hiremath Ravishankar
> >> <khiremat at redhat.com> wrote:
> >>> Hi,
> >>>
> >>> As of now, geo-replication is the only consumer of the changelog.
> >>> Going forward, bitrot and tiering will also join as consumers.
> >>> The current format of the changelog can be found at the links below.
> >>>
> >>> http://www.gluster.org/community/documentation/index.php/Arch/Change_Logging_Translator_Design
> >>> https://github.com/gluster/glusterfs/blob/master/doc/features/geo-replication/libgfchangelog.md
> >>>
> >>>
> >>> Current Design:
> >>>
> >>> 1. Every changelog.rollover-time secs (configurable), a new changelog
> >>>    file is generated.
> >>>
> >>> 2. The geo-replication history API, designed as part of the Snapshot
> >>>    requirement, maintains an HTIME file listing the changelog filenames
> >>>    generated. It is guaranteed that there is no break between the
> >>>    changelogs within one HTIME file, i.e., changelog is not
> >>>    enabled/disabled in between.
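
To make the rollover behaviour concrete, here is a small sketch of how fop records could end up in per-interval changelog files and how an HTIME-style index would list those files. The function `roll_over`, the `(timestamp, fop)` record shape and the `CHANGELOG.<end-of-window>` naming are assumptions made for illustration; the real translator writes its own on-disk format.

```python
def roll_over(records, rollover_time=15):
    """Group (timestamp, fop) records into one changelog per rollover window.

    Returns (htime, changelogs): htime is the ordered list of changelog
    names, changelogs maps each name to the fops recorded in that window.
    """
    changelogs = {}
    for ts, fop in sorted(records):
        # A record becomes visible when its window closes, so the file is
        # named after the end of the window the record falls into.
        window_end = (int(ts) // rollover_time + 1) * rollover_time
        changelogs.setdefault(f"CHANGELOG.{window_end}", []).append(fop)
    htime = list(changelogs)  # dicts keep insertion order
    return htime, changelogs


records = [(1, "CREATE"), (14, "WRITE"), (16, "UNLINK"), (31, "WRITE")]
htime, changelogs = roll_over(records)
for name in htime:
    print(name, changelogs[name])
```

In this model a record written at t=16 is only handed to consumers when its file closes at t=30, so the rollover time is an upper bound on how stale a consumer's view can be. That lag is the data-freshness concern raised for tiering.
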
> >>>
> >>> Proposed changes for changelog as part of bitrot and tiering:
> >>>
> >>> 1. Add a timestamp for each fop record in the changelog.
> >>>
> >>>    Rationale             : Tiering requires the timestamp of each fop.
> >>>    Implication on Geo-rep: NO
> >>>
> >>>
> >>> 2. Make one big changelog per day or so and do not roll over the
> >>>    changelog every rollover-time.
> >>>
> >>>    Rationale: Changing changelog.rollover-time is going to affect all
> >>>               three consumers, hence decoupling is required.
> >>>
> >>>                 Geo-replication: Is fine with changing rollover time.
> >>>                 Tiering        : Not fine as per the input I got from
> >>>                                  Joseph (Joseph, please comment), as this
> >>>                                  adds to the delay before tiering gets
> >>>                                  the change notification from changelog.
> >>>                 Bitrot         : It should be fine (Venky, please
> >>>                                  comment).
> >>>
> >>>    Implications on current Geo-replication Design:
> >>>
> >>>              1. Breaks History API: Needs redesign.
> >>>              2. Changes to geo-replication changelog consumption logic ??
> >>>              3. libgfchangelog API changes.
> >>>              4. Effort to handle upgrade scenarios.
> >>>
> >>> Bitrot and Tiering guys, please add any other expected changes that I
> >>> have missed.
> >>>
> >>> Point to discuss: considering the implications on geo-replication, are
> >>> there any other approaches with which we can solve this problem without
> >>> much impact on the current geo-replication logic?
> >>>
> >>>
> >>> Thanks and Regards,
> >>> Kotresh H R
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> Gluster-devel mailing list
> >>> Gluster-devel at gluster.org
> >>> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
> 

