[Gluster-users] large memory usage on rebalance

Fri Jan 18 07:09:53 UTC 2013

Hello all,

I had a Gluster volume distributed over 7 machines, with a single brick each.

After adding two more bricks I started a rebalance, but the glusterfs process handling it used up all the available memory on the machine where I started it.
I tried again on another peer with more memory; there it also grew to 60 GB and showed no sign of levelling off when I stopped the rebalance.
In both cases it was still in the layout-fix step.
I checked the Gluster bug database but did not find an entry that seemed to cover my situation.

Can I safely roll back from here by removing the new bricks and issuing a rebalance again?
Does anyone have ideas for how to work around this?
Would adding one drive at a time help?

Some more info:

Gluster version 3.2.6 release 1.el6
CentOS 6.2 or 6.3 x86_64
Each brick has a dedicated partition
Brick filesystem is XFS

The bricks do not all have the same size, but even the smallest partition has space left.

[root@n03 glusterfs]# gluster volume info archive

Volume Name: archive
Type: Distribute
Status: Started
Number of Bricks: 9
Transport-type: tcp
Bricks:
Brick1: n01:/local/archive
Brick2: n02:/local/archive
Brick3: n03:/local/archive
Brick4: n04:/local/archive
Brick5: n05:/local/archive
Brick6: n06:/local/archive
Brick7: n07:/local/archive
Brick8: n08:/local/archive
Brick9: n09:/local/archiv
Options Reconfigured:
nfs.disable: on
features.quota: off

A few minutes after starting the rebalance I get:

[root@n03 glusterfs]# gluster volume rebalance archive status
rebalance step 1: layout fix in progress: fixed layout 3
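
For anyone trying to reproduce this, here is a minimal sketch for logging the memory growth next to the rebalance status. It assumes a Linux procps `ps` (which supports `-C`) and sums the resident memory of every process named glusterfs on the node, so it also counts any glusterfs client mounts running there. The function names rss_kib, sample_rebalance and watch_rebalance are made up for illustration; only the `gluster volume rebalance <volume> status` command is taken from the session above.

```shell
#!/bin/sh
# Sum the resident set size (in KiB) of every process whose command name is $1.
# Prints 0 when no such process exists.
rss_kib() {
    ps -C "$1" -o rss= 2>/dev/null | awk '{ sum += $1 } END { print sum + 0 }'
}

# Print one timestamped sample: total glusterfs RSS, then the rebalance status.
# $1 = volume name
sample_rebalance() {
    printf '%s glusterfs_rss_kib=%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$(rss_kib glusterfs)"
    gluster volume rebalance "$1" status
}

# Take a sample every $2 seconds (default 60) until interrupted with Ctrl-C.
# $1 = volume name
watch_rebalance() {
    while :; do
        sample_rebalance "$1"
        sleep "${2:-60}"
    done
}
```

Run on the node where the rebalance was started, `watch_rebalance archive 60 >> rebalance-mem.log` would give one line of memory usage and one line of status per minute, which should show whether the growth tracks the "fixed layout" counter.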

Cheers,
Frank