[Gluster-users] rebalance and its alternatives

Vijay Bellur vbellur at redhat.com
Fri Apr 26 11:21:08 UTC 2013


On 04/26/2013 04:02 PM, Hans Lambermont wrote:
> Hi Vijay,
>
> Vijay Bellur wrote on 20130426:
>> On 04/25/2013 12:57 AM, Hans Lambermont wrote:
>>> The reason I'm looking into this solution is that a regular rebalance
>>> just takes too long. Long as in 100+ days.
>>>
>>> Do you know of other alternatives? Or a way to make rebalance start
>>> moving files right away?

Prior to GlusterFS 3.3, rebalance always happened in two phases:

Phase 1: Scan all directories and perform layout (hash range) adjustments.

Phase 2: Perform data migration.

This behavior has been improved in 3.3 so that data migration happens 
along with the directory layout adjustment. This also avoids an 
additional crawl of the entire volume, which can be expensive. There are 
further improvements to rebalancing in 3.3 that reduce the amount of 
data that needs to be migrated.
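
For reference, the two workflows look roughly like this on the command 
line (using volume1 from your example; the 3.2.x commands are the ones 
you quote below):

     # GlusterFS 3.2.x: two separate crawls of the volume
     gluster volume rebalance volume1 fix-layout start
     gluster volume rebalance volume1 migrate-data start

     # GlusterFS 3.3.x: one crawl that fixes layouts and migrates data
     gluster volume rebalance volume1 start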

>>
>> What is the size of your volume?
>
> 26 TiB on 4 nodes, with the volume around 60% full, 20 M directories, 40 M files.
>
>> Can you please provide details of the glusterfs version and the
>> command that was issued for rebalancing?
>
> I'm now using 3.3.1. The previous rebalance of half the current data was on
> 3.2.5; it took 7 days for the fix-layout and 40 days for the migrate-data
> rebalance. Extrapolating that to today's data amount gives me about 100 days.
>
> The commands I used on 3.2.5 were:
>      gluster volume rebalance volume1 fix-layout start
> and then polled status until it said "rebalance step 1: layout fix complete", after which I used:
>      gluster volume rebalance volume1 migrate-data start
> I expect to use the same commands on 3.3.1.

With 3.3.x, it would be better to use:

gluster volume rebalance volume1 start

If you use fix-layout and migrate-data separately, you end up with the 
same less-than-optimal two-phase behavior seen with 3.2.5.
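
The rebalance runs asynchronously; you can follow its progress with the 
status sub-command, for example:

     gluster volume rebalance volume1 status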

>
> I'm about to do a new rebalance, but I first need the open file
> descriptor leak of https://bugzilla.redhat.com/show_bug.cgi?id=928631
> fixed. I'm going to retry the release-3.3 head, as last time I tested it
> I could not write data to the volume:
> https://bugzilla.redhat.com/show_bug.cgi?id=956245. I'm going to retest that.

The patch fixing the file descriptor leak was merged into release-3.3 
today.

I would be interested to know whether you still observe bug 956245 with 
the head of release-3.3.

-Vijay



