[Gluster-users] Rebalance Issues

Fri Nov 12 06:31:02 UTC 2021

Hi All,

I have a distributed glusterfs 5.10 setup with 8 nodes, each having one
1 TB disk and three 4 TB disks (so 22 TB total per node).
Recently I added a new node with 3 additional disks (1 x 10TB + 2 x 8TB).
After this I ran a rebalance, and it does not seem to complete successfully
(the output of gluster volume rebalance data status is below). A few
nodes show "failed", and on the node that shows "completed" the data is
not evenly balanced.

root@gluster6-new:~# gluster v rebalance data status
Node                                Rebalanced-files  size     scanned  failures  skipped  status       run time in h:m:s
----------------------------------  ----------------  -------  -------  --------  -------  -----------  -----------------
localhost                           22836             2.4TB    136149   1         27664    in progress  14:48:56
10.132.1.15                         80                5.0MB    1134     3         121      failed       1:08:33
10.132.1.14                         18573             2.5TB    137827   20        31278    in progress  14:48:56
10.132.1.12                         607               61.3MB   1667     5         60       failed       1:08:33
gluster4.c.storage-186813.internal  26479             2.8TB    148402   14        38271    in progress  14:48:56
10.132.1.18                         86                6.4MB    1094     5         70       failed       1:08:33
10.132.1.17                         21953             2.6TB    131573   4         26818    in progress  14:48:56
10.132.1.16                         56                45.0MB   1203     5         111      failed       1:08:33
10.132.0.19                         3108              1.9TB    224707   2         160148   completed    13:56:31
Estimated time left for rebalance to complete :       22:04:28
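
To dig into the failures, this is roughly how I would pull the error-level
lines from the rebalance log on each failed node. It is only a sketch: the
helper name rebalance_errors is made up, the default path assumes the usual
/var/log/glusterfs/<volname>-rebalance.log location for the volume "data",
and the pattern assumes the standard log layout where the severity letter
follows the bracketed timestamp.

```shell
# Hypothetical helper: print error-level lines from a rebalance log.
# Assumption: lines look like "[<timestamp>] E [...]"; adjust the pattern
# if the log format on your version differs.
rebalance_errors() {
  # Assumed default path for the volume named "data".
  log="${1:-/var/log/glusterfs/data-rebalance.log}"
  grep -E '^\[[^]]+\] E ' "$log"
}
```

Running `rebalance_errors | tail -n 20` on a failed node should then show the
most recent errors, which I can post if that helps.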

Below is the 'df -h' output for the node marked as completed in the status
output above; the data does not seem to be evenly balanced.

root@gluster-9:~$ df -h /data*
Filesystem      Size  Used Avail Use% Mounted on
/dev/bcache0     10T  8.9T  1.1T  90% /data
/dev/bcache1    8.0T  5.0T  3.0T  63% /data1
/dev/bcache2    8.0T  5.0T  3.0T  63% /data2
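
To put a number on the imbalance, here is a small sketch that computes the
aggregate use% across the three bricks, i.e. the level each brick would sit
at if data were spread in proportion to brick size. The figures are the
rounded values from the df output above, not live readings, and the function
names are only for illustration. Whether rebalance actually aims for this
level is part of what I am asking below.

```shell
# Mount point, size (TB) and used (TB), copied from the df output above.
brick_usage() {
  cat <<'EOF'
/data 10 8.9
/data1 8.0 5.0
/data2 8.0 5.0
EOF
}

# Sum size and used space over all bricks and print the overall use%.
aggregate_use_pct() {
  brick_usage | awk '{ size += $2; used += $3 } END { printf "%.1f\n", 100 * used / size }'
}

aggregate_use_pct   # prints 72.7
```

So /data at 90% is well above the roughly 73% aggregate, while /data1 and
/data2 at 63% are below it.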

I would appreciate any help to identify the issues here:

1. Failures during rebalance.
2. Imbalance in data size after the gluster rebalance command.
3. We had to rebalance twice because, in the initial run, one of the new
disks on the new node (the 10 TB one) became 100% full. Any thoughts as to
why this could happen during rebalance? The disks on the new node were
completely blank before the rebalance.
4. Does glusterfs rebalance data based on percentage used or on absolute
free disk space available?

I can share more details/logs if required. Thanks.

-- 
Regards,
Shreyansh Shah
*AlphaGrep Securities Pvt. Ltd.*