[Gluster-users] Rebalance Issues
Shreyansh Shah
shreyansh.shah at alpha-grep.com
Fri Nov 12 07:41:55 UTC 2021
Hi Thomas,
Thank you for your response. Adding the required info below:
Volume Name: data
Type: Distribute
Volume ID: 75410231-bb25-4f14-bcde-caf18fce1d31
Status: Started
Snapshot Count: 0
Number of Bricks: 35
Transport-type: tcp
Bricks:
Brick1: 10.132.1.12:/data/data
Brick2: 10.132.1.12:/data1/data
Brick3: 10.132.1.12:/data2/data
Brick4: 10.132.1.12:/data3/data
Brick5: 10.132.1.13:/data/data
Brick6: 10.132.1.13:/data1/data
Brick7: 10.132.1.13:/data2/data
Brick8: 10.132.1.13:/data3/data
Brick9: 10.132.1.14:/data3/data
Brick10: 10.132.1.14:/data2/data
Brick11: 10.132.1.14:/data1/data
Brick12: 10.132.1.14:/data/data
Brick13: 10.132.1.15:/data/data
Brick14: 10.132.1.15:/data1/data
Brick15: 10.132.1.15:/data2/data
Brick16: 10.132.1.15:/data3/data
Brick17: 10.132.1.16:/data/data
Brick18: 10.132.1.16:/data1/data
Brick19: 10.132.1.16:/data2/data
Brick20: 10.132.1.16:/data3/data
Brick21: 10.132.1.17:/data3/data
Brick22: 10.132.1.17:/data2/data
Brick23: 10.132.1.17:/data1/data
Brick24: 10.132.1.17:/data/data
Brick25: 10.132.1.18:/data/data
Brick26: 10.132.1.18:/data1/data
Brick27: 10.132.1.18:/data2/data
Brick28: 10.132.1.18:/data3/data
Brick29: 10.132.1.19:/data3/data
Brick30: 10.132.1.19:/data2/data
Brick31: 10.132.1.19:/data1/data
Brick32: 10.132.1.19:/data/data
Brick33: 10.132.0.19:/data1/data
Brick34: 10.132.0.19:/data2/data
Brick35: 10.132.0.19:/data/data
Options Reconfigured:
performance.cache-refresh-timeout: 60
performance.cache-size: 8GB
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: on
storage.health-check-interval: 60
server.keepalive-time: 60
client.keepalive-time: 60
network.ping-timeout: 90
server.event-threads: 2
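
For the brick sizes, a minimal sketch of how I can pull capacity and free
space per brick (assuming 'gluster volume status ... detail' is available
on 5.10, and using the mount points listed above):

# capacity / free space per brick, as reported by gluster itself
gluster volume status data detail | grep -E 'Brick|Disk Space'

# or directly on each node, for its local bricks (the new node has no /data3)
df -h /data /data1 /data2 /data3
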
On Fri, Nov 12, 2021 at 1:08 PM Thomas Bätzler <t.baetzler at bringe.com>
wrote:
> Hello Shreyansh Shah,
>
>
>
> How is your gluster set up? I think it would be very helpful for our
> understanding of your setup to see the output of “gluster v info all”
> annotated with brick sizes.
>
> Otherwise, how could anybody answer your questions?
>
> Best regards,
>
> i.A. Thomas Bätzler
>
> --
>
> BRINGE Informationstechnik GmbH
> Zur Seeplatte 12
> D-76228 Karlsruhe
> Germany
>
> Phone: +49 721 94246-0
> Phone: +49 171 5438457
> Fax: +49 721 94246-66
> Web: http://www.bringe.de/
>
> Managing Director: Dipl.-Ing. (FH) Martin Bringe
> VAT ID: DE812936645, HRB 108943 Mannheim
>
> *From:* Gluster-users <gluster-users-bounces at gluster.org> *On Behalf Of* Shreyansh Shah
> *Sent:* Friday, November 12, 2021 07:31
> *To:* gluster-users <gluster-users at gluster.org>
> *Subject:* [Gluster-users] Rebalance Issues
>
>
>
> Hi All,
>
> I have a distributed glusterfs 5.10 setup with 8 nodes, each of them
> having one 10 TB disk and 3 disks of 4 TB each (so 22 TB in total per node).
> Recently I added a new node with 3 additional disks (1 x 10TB + 2 x 8TB).
> After this I ran a rebalance, and it does not seem to complete successfully
> (adding the result of gluster volume rebalance data status below). On a few
> nodes it shows failed, and on the node where it shows completed the data is
> not evenly balanced.
>
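> (For reference, the expansion was done with something along these lines --
> reconstructed from the volume info above, not a paste of the exact commands:)
>
> gluster volume add-brick data 10.132.0.19:/data/data 10.132.0.19:/data1/data 10.132.0.19:/data2/data
> gluster volume rebalance data start
>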
> root@gluster6-new:~# gluster v rebalance data status
> Node                                  Rebalanced-files  size    scanned  failures  skipped  status       run time in h:m:s
> ------------------------------------  ----------------  ------  -------  --------  -------  -----------  -----------------
> localhost                             22836             2.4TB   136149   1         27664    in progress  14:48:56
> 10.132.1.15                           80                5.0MB   1134     3         121      failed       1:08:33
> 10.132.1.14                           18573             2.5TB   137827   20        31278    in progress  14:48:56
> 10.132.1.12                           607               61.3MB  1667     5         60       failed       1:08:33
> gluster4.c.storage-186813.internal    26479             2.8TB   148402   14        38271    in progress  14:48:56
> 10.132.1.18                           86                6.4MB   1094     5         70       failed       1:08:33
> 10.132.1.17                           21953             2.6TB   131573   4         26818    in progress  14:48:56
> 10.132.1.16                           56                45.0MB  1203     5         111      failed       1:08:33
> 10.132.0.19                           3108              1.9TB   224707   2         160148   completed    13:56:31
>
> Estimated time left for rebalance to complete : 22:04:28
>
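> For the nodes that report failed, the per-node rebalance log should show the
> reason. A minimal sketch of what I can pull from those nodes, assuming the
> default log location:
>
> # run on each node that reports "failed"
> grep ' E \[' /var/log/glusterfs/data-rebalance.log | tail -n 50
>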
>
> Adding the 'df -h' output for the node that is marked as completed in the
> status above; even on that node the data does not seem to be evenly balanced.
>
> root@gluster-9:~$ df -h /data*
> Filesystem Size Used Avail Use% Mounted on
> /dev/bcache0 10T 8.9T 1.1T 90% /data
> /dev/bcache1 8.0T 5.0T 3.0T 63% /data1
> /dev/bcache2 8.0T 5.0T 3.0T 63% /data2
>
>
>
> I would appreciate any help to identify the issues here:
>
> 1. Failures during rebalance.
> 2. Imbalance in data size after the gluster rebalance command.
>
> 3. Another thing I would like to mention is that we had to rebalance
> twice, because in the initial run one of the new disks on the new node
> (the 10 TB one) got 100% full. Any thoughts as to why this could happen
> during rebalance? The disks on the new node were completely blank before
> the rebalance.
> 4. Does glusterfs rebalance data based on percentage used or on absolute
> free disk space available? (See the sketch after this list.)
>
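> Regarding 4., a sketch of what I could check on my side (assuming the
> cluster.weighted-rebalance volume option and the trusted.glusterfs.dht
> layout xattr are exposed in this release):
>
> # is the layout weighted by brick size?
> gluster volume get data cluster.weighted-rebalance
>
> # hash range assigned to one brick's root directory (run on that brick's host)
> getfattr -n trusted.glusterfs.dht -e hex /data/data
>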
> I can share more details/logs if required. Thanks.
>
> --
>
> Regards,
> Shreyansh Shah
>
> *AlphaGrep Securities Pvt. Ltd.*
>
--
Regards,
Shreyansh Shah
*AlphaGrep Securities Pvt. Ltd.*