[Gluster-devel] Compressing CTR DB

Wed Apr 27 16:25:39 UTC 2016

Discussion so far.

~Joe

----- Forwarded Message -----
> From: "Joseph Fernandes" <josferna at redhat.com>
> To: "Dan Lambright" <dlambrig at redhat.com>
> Sent: Wednesday, April 27, 2016 9:24:57 PM
> Subject: Re: Compressing CTR DB
> 
> answers inline
> 
> ----- Original Message -----
> > From: "Dan Lambright" <dlambrig at redhat.com>
> > Sent: Wednesday, April 27, 2016 8:39:06 PM
> > Subject: Re: Compressing CTR DB
> > 
> > To compress the uuid, is it a lossless compression algorithm? So we won't
> > lose any bits?
> > 
> 
> We don't use any compression algorithm. Instead of saving the GFID
> (which is a UUID) as a printable string, which takes 33-36 bytes, we
> will save it as a 16-byte integer/blob, the way it is represented
> in memory, so no bits are lost.
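To illustrate the reply above, here is a minimal Python sketch (not libgfdb code; the GFID value is made up for the example). It shows that the 16-byte form is only another representation of the same UUID, so converting back and forth loses nothing.

```python
import uuid

# Hypothetical GFID, used only for illustration.
gfid_text = "6ba7b810-9dad-11d1-80b4-00c04fd430c8"

gfid = uuid.UUID(gfid_text)
gfid_blob = gfid.bytes          # raw 16-byte form, as held in memory

text_len = len(gfid_text)       # 36 characters, hyphens included
blob_len = len(gfid_blob)       # 16 bytes

# Converting the blob back yields the original string: nothing is lost.
round_trip = str(uuid.UUID(bytes=gfid_blob))
```

The printable form is just the hex encoding of those 16 bytes plus four hyphens, which is why it can always be regenerated when a string is needed.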
> 
> > If we do an upgrade and change the schema, could we delete the old db and
> > start a brand new one? An upgrade is a rare event?
> 
> In an upgrade scenario we can do the following until GFID to PATH
> conversion comes:
> 1. Decommission the old DB and start using the new DB.
> 2. The new DB will be healed in two ways:
>    a. Named lookup: the file heat table will be healed by any other
>       operation, but the file link table (the one with multiple
>       hardlinks) will not be healed unless there is a named lookup.
>    b. Background heal from the old DB to the new one via a separate
>       thread in the brick. YES, there might be a performance hit, and
>       this can be contained using a throttling mechanism.
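The background heal in step 2b could look roughly like the sketch below. It is only an illustration under stated assumptions: the table name `file_heat`, its columns, and the function `heal_in_batches` are hypothetical and do not come from libgfdb, and a plain loop with a sleep stands in for the separate brick thread and its throttling.

```python
import sqlite3
import time
import uuid


def heal_in_batches(old_db, new_db, batch_size=100, pause_s=0.0):
    """Copy rows from old_db to new_db, converting text GFIDs to blobs.

    Sleeps pause_s seconds between batches as a crude throttle.
    Returns the number of rows read from the old DB.
    """
    processed = 0
    last_rowid = 0
    while True:
        rows = old_db.execute(
            "SELECT rowid, gfid, heat FROM file_heat "
            "WHERE rowid > ? ORDER BY rowid LIMIT ?",
            (last_rowid, batch_size),
        ).fetchall()
        if not rows:
            break
        # OR IGNORE: keep rows that were already healed by another path.
        new_db.executemany(
            "INSERT OR IGNORE INTO file_heat (gfid, heat) VALUES (?, ?)",
            [(uuid.UUID(g).bytes, heat) for _, g, heat in rows],
        )
        new_db.commit()
        processed += len(rows)
        last_rowid = rows[-1][0]
        time.sleep(pause_s)
    return processed


# Demo with in-memory databases and made-up data.
old_db = sqlite3.connect(":memory:")
old_db.execute("CREATE TABLE file_heat (gfid TEXT PRIMARY KEY, heat INTEGER)")
old_db.executemany(
    "INSERT INTO file_heat VALUES (?, ?)",
    [(str(uuid.uuid4()), i) for i in range(250)],
)
new_db = sqlite3.connect(":memory:")
new_db.execute("CREATE TABLE file_heat (gfid BLOB PRIMARY KEY, heat INTEGER)")

processed = heal_in_batches(old_db, new_db, batch_size=100)
```

Working in small batches with a pause between them is one simple way to bound the extra load; a real implementation would presumably tune the batch size and pause against brick activity.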
> 
> Again, there is the question of how often a user upgrades. It might be a
> rare event, but stability shouldn't be affected.
> 
> As discussed in the scrum, let's speak to Aravinda and Shyam next week about
> this issue of GFID to PATH conversion. There is a proposal, but nothing
> implemented and functional like what we have in the DB. But yes, we need to
> move it out of the DB, as that is not why we got the DB.
> 
> > 
> > Agree we need a version # as part of the solution.
> 
> YES, we will have a schema version in the DB itself.
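One possible way to keep a schema version inside an SQLite DB is the built-in `PRAGMA user_version` integer. The thread does not say which mechanism will be used, so treat this as an assumption; the helper names and the version number below are hypothetical.

```python
import sqlite3

SCHEMA_VERSION = 2  # hypothetical version number for the blob-GFID schema


def read_schema_version(db):
    # user_version is an integer SQLite stores in the database header.
    return db.execute("PRAGMA user_version").fetchone()[0]


def write_schema_version(db, version):
    # PRAGMA does not accept bound parameters, so format a validated int.
    db.execute(f"PRAGMA user_version = {int(version)}")


db = sqlite3.connect(":memory:")
fresh_version = read_schema_version(db)   # 0 on a newly created database
write_schema_version(db, SCHEMA_VERSION)
needs_migration = read_schema_version(db) < SCHEMA_VERSION
```

On startup, comparing the stored value with the version the code expects tells the brick whether the DB needs to be migrated or replaced.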
> 
> > 
> > 
> > ----- Joseph Fernandes <josferna at redhat.com> wrote:
> > > Hi All,
> > > 
> > > As I am working on shrinking the CTR DB size, I came across a few
> > > articles/blogs on this.
> > > As predicted, saving the UUID as 16 bytes rather than as 36-byte text
> > > will give us at least a 46% reduction in disk and cache space. Plus, the
> > > blogs do suggest some performance gain (if
> > > we don't often convert UUID to String, whi).
> > > 
> > > http://www.google.com/url?q=http%3A%2F%2Fwtanaka.com%2Fnode%2F8106&sa=D&sntz=1&usg=AFQjCNEZolVlLAW2OGxq96CFjfeY0mQC1A
> > > https://scion.duhs.duke.edu/vespa/project/wiki/DatabaseUuidEfficiency
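As a rough check of the per-value saving at the SQLite level, the sketch below stores the same UUIDs once as text and once as blobs and compares the payload sizes. The table names are hypothetical, and the figure covers the GFID values alone; the overall saving for a real CTR DB depends on the other columns and indexes, which this sketch does not model.

```python
import sqlite3
import uuid

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE gfid_text (gfid TEXT PRIMARY KEY)")
db.execute("CREATE TABLE gfid_blob (gfid BLOB PRIMARY KEY)")

# Random UUIDs stand in for real GFIDs.
gfids = [uuid.uuid4() for _ in range(1000)]
db.executemany("INSERT INTO gfid_text VALUES (?)", [(str(g),) for g in gfids])
db.executemany("INSERT INTO gfid_blob VALUES (?)", [(g.bytes,) for g in gfids])

# length() counts characters for TEXT and bytes for BLOB; UUID text is ASCII,
# so one character is one byte here.
text_bytes = db.execute("SELECT sum(length(gfid)) FROM gfid_text").fetchone()[0]
blob_bytes = db.execute("SELECT sum(length(gfid)) FROM gfid_blob").fetchone()[0]

# Fraction of GFID payload bytes saved by the blob representation.
saving = 1 - blob_bytes / text_bytes
```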
> > > 
> > > The changes in the current libgfdb code are at the sqlite level. But
> > > since there is a change in schema, we need to write DB data migration
> > > scripts for upgrades (similar to the dual connection path). Speaking of
> > > which, we would need DB schema versions, and need to have them stored in
> > > gluster (either in glusterd, the DB, or the namespace), as we expect the
> > > schema to change as we fine-tune our heat store.
> > > 
> > > Regards,
> > > Joe
> > > 
> > 
> > 
> 
>