[Gluster-devel] DHT idea: rebalance-specific layout

James purpleidea at gmail.com
Mon Mar 24 14:37:32 UTC 2014


On Mon, Mar 24, 2014 at 10:08 AM, Jeff Darcy <jdarcy at redhat.com> wrote:
> I was talking to a user about my size-weighted (or optionally
> free-space-weighted) rebalance script.  This led to thinking about ways to
> bring a system back into balance without migrating any old data, as some of
> our users already do.  Here's the example I was using.
>
> * Four existing 1TB bricks, which are 90% full.
>
> * One new 2TB brick, which is empty.
>
> Therefore, total free space is 2.4TB, of which the new brick has 2.0TB. If
> we set up the layouts so that the new brick has 5/6 of the hash space then
> as new files are added they should all reach 100% full at the same time
> without ever needing to migrate any old data.  Yay.
>
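
The arithmetic in that example can be checked with a short Python sketch.
This is only an illustration of free-space weighting, not the actual
rebalance script; the brick names and the free_space_weights() helper are
made up for the example.

```python
from fractions import Fraction

def free_space_weights(free_tb):
    """Return each brick's share of the hash space, proportional to free space."""
    total = sum(free_tb.values())
    return {name: free / total for name, free in free_tb.items()}

# Four 1TB bricks at 90% full leave 0.1TB free each; the new 2TB brick is empty.
free_tb = {f"old{i}": Fraction(1, 10) for i in range(1, 5)}  # hypothetical names
free_tb["new"] = Fraction(2)

weights = free_space_weights(free_tb)
print(sum(free_tb.values()))  # 12/5, i.e. 2.4TB free in total
print(weights["new"])         # 5/6
print(weights["old1"])        # 1/24
```

Exact fractions are used so the 5/6 figure comes out without rounding noise.
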
> Unfortunately, there's still a problem.  For these kinds of users (e.g.
> CDNs) the newest data also tends to remain hottest.  What happens when they
> want to retire some of their oldest hardware?  That *does* involve migrating
> old data, and the load for that will disproportionately fall on the newest
> servers which really should be spending as much of their time as possible
> serving new content.  That's not good.
>
> So (finally) here's the idea.  Have a *separate* set of layout values that
> are used specifically for rebalance, so that we can rebalance data one way
> even as new files are placed another way.  Let's consider a slightly
> different example.

I think this is a proper clever idea. (Assuming it would work.)

One question: would(n't) there be a chance for "thrashing" (maybe
there's a better word) where new files are getting put on brick X, but
the rebalance is then trying to move them to brick Y? (Well maybe call it a
single thrash, not thrashing.)
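
To make the question concrete, here is a rough Python sketch. Everything in
it is an assumption for illustration: the brick names X and Y, the 50% share,
and crc32 standing in for the real DHT hash. It only counts how many freshly
placed files would sit on a different brick than the rebalance layout wants;
whether rebalance would actually touch those files is exactly the open
question.

```python
import zlib

def place(name, x_share=0.5):
    """Hypothetical regular layout: the low x_share of the hash space maps to X."""
    h = zlib.crc32(name.encode()) / 2**32  # stand-in hash, not Gluster's
    return "X" if h < x_share else "Y"

def rebalance_target(name):
    """Hypothetical rebalance layout: all of the hash space maps to Y."""
    return "Y"

def would_thrash(name):
    """True if a newly placed file disagrees with the rebalance layout."""
    return place(name) != rebalance_target(name)

names = [f"file-{i}" for i in range(10000)]
moved = sum(would_thrash(n) for n in names)
print(f"{moved} of {len(names)} new files would be moved once")
```
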

As a side note, I don't see this as a high-priority feature for my own
use.

>
> * 4 ancient 1TB bricks, 75% full
>
> * 16 medium-age 1.5TB bricks, also 75% full
>
> * 4 new 2TB bricks, empty
>
> Here's one possible way to use dual layouts:
>
> * currently 8TB free on the medium bricks, goal is 5TB
>
> * 4TB free on the new bricks
>
> * set regular layout to 44% new, 56% medium (4/9 and 5/9 of the hash space)
>
> * set rebalance layout to 100% medium
>
> This way 44% of the new files but *none* of the files from the oldest bricks
> will flow toward the newest bricks.  100% of that traffic will be from the
> oldest bricks to the medium ones, and shouldn't affect the newest machines
> at all.  This would all be a lot easier if we had layout inheritance or
> default layouts instead of every single directory with its own layout, but
> we can probably find ways to deal with that.
>
> Any reactions?
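
A minimal Python sketch of the dual-layout idea, as I understand it, follows.
It takes the free-space figures exactly as stated in the list above (4TB on
the new bricks, a 5TB goal on the medium ones). The build_layout() and
lookup() helpers, the group names, and the use of crc32 as the hash are all
assumptions for illustration and are not how DHT is implemented.

```python
import bisect
import zlib

HASH_SPACE = 2**32

def build_layout(weights):
    """Split the 32-bit hash space into contiguous ranges sized by weight.

    Returns a list of (range_end_exclusive, group) sorted by range end.
    Groups with zero weight get no range at all.
    """
    items = [(g, w) for g, w in weights.items() if w > 0]
    total = sum(w for _, w in items)
    layout, acc = [], 0
    for i, (group, w) in enumerate(items):
        acc += w
        last = i == len(items) - 1
        end = HASH_SPACE if last else HASH_SPACE * acc // total
        layout.append((end, group))
    return layout

def lookup(layout, name):
    """Map a file name to the group owning its hash value."""
    h = zlib.crc32(name.encode())  # stand-in hash, not Gluster's
    ends = [end for end, _ in layout]
    return layout[bisect.bisect_right(ends, h)][1]

# Regular layout: new files are placed 4:5 between new and medium bricks.
regular = build_layout({"new": 4, "medium": 5, "ancient": 0})
# Rebalance layout: migrated files may only land on the medium bricks.
rebalance = build_layout({"new": 0, "medium": 1, "ancient": 0})

for name in ("a.jpg", "b.jpg", "c.jpg"):
    print(name, "placed on", lookup(regular, name),
          "| if migrated, goes to", lookup(rebalance, name))
```

The point of keeping two separate tables is that the same file name can
resolve differently depending on whether it is being created or migrated.
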