[Gluster-users] gluster rebalance taking multiple days

Tue Dec 7 00:50:36 UTC 2010

How long should a rebalance take? I know it depends, so let's take this example: 4 servers, 1 brick per server. Here is the df -i output from the servers:

[root@ra5 ~]# pdsh -g rack7 "df -i|grep brick"
iosrv-7-1:                      366288896 2720139 363568757    1% /mnt/brick1
iosrv-7-4:                      366288896 3240868 363048028    1% /mnt/brick4
iosrv-7-2:                      366288896 2594165 363694731    1% /mnt/brick2
iosrv-7-3:                      366288896 3267152 363021744    1% /mnt/brick3
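
As a rough check on the file count, the IUsed column (third field of each df -i line above) can be summed across the four bricks. This is only a sketch: the helper name used_inodes is made up for illustration, and the total is an upper bound on the number of files rather than an exact count, since directories are normally created on every brick of a distributed volume and so are counted once per brick.

```python
# df -i lines copied verbatim from the pdsh output above.
DF_OUTPUT = """\
iosrv-7-1:                      366288896 2720139 363568757    1% /mnt/brick1
iosrv-7-4:                      366288896 3240868 363048028    1% /mnt/brick4
iosrv-7-2:                      366288896 2594165 363694731    1% /mnt/brick2
iosrv-7-3:                      366288896 3267152 363021744    1% /mnt/brick3
"""


def used_inodes(df_text):
    """Return {host: IUsed} for each pdsh-prefixed df -i line (hypothetical helper)."""
    result = {}
    for line in df_text.splitlines():
        fields = line.split()
        # Expected fields: host:, Inodes, IUsed, IFree, IUse%, mount point
        if len(fields) < 6:
            continue
        host = fields[0].rstrip(":")
        result[host] = int(fields[2])
    return result


per_brick = used_inodes(DF_OUTPUT)
total = sum(per_brick.values())
print(total)  # prints 11822324
```

The sum comes to about 11.8 million used inodes across the bricks.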

So it looks like there are roughly 10 million files. I have had a rebalance running on one of the servers since last Friday, and this is what the status looks like right now:

[root@iosrv-7-2 ~]# gluster volume rebalance gluster-test status
rebalance step 1: layout fix in progress: fixed layout 149531740

As a side note, I started this rebalance when I noticed that about half of my clients were missing a certain set of files. On further investigation I found that a different set of clients is missing different data. This started after a series of problems getting an upgrade to 3.1.1 working. Unfortunately, I don't remember which version was running when I was last able to write to this volume.

Any thoughts?

Mike