[Gluster-devel] Gluster 5.10 rebalance stuck

Strahil Nikolov hunter86_bg at yahoo.com
Mon Nov 7 08:43:00 UTC 2022


Hi Dev list,
How can I find the details about the rebalance_status/status ids? Is it actually normal that some systems are in '4' and others in '3'?
Is it safe to forcefully start a new rebalance?
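
For context, the only mapping I could find so far is the gf_defrag_status_t enum in the glusterfs sources (libglusterfs/src/glusterfs.h), which, if I read the ordering right, would make 0 = not started, 1 = started, 2 = stopped, 3 = complete, 4 = failed. A quick decode sketch over node_state.info, assuming that ordering holds for 5.10:

# decode the status field, assuming the gf_defrag_status_t ordering (0..4)
awk -F= '/^status=/ {
    split("not-started started stopped complete failed", s, " ")
    print "status=" $2 " -> " s[$2 + 1]
}' /var/lib/glusterd/vols/data/node_state.info

If that mapping is correct, the '3' vs '4' split would mean some nodes completed while others failed, and 10.132.1.19 sitting in '1' with zero counters would match the node that still showed In-Progress. I would like confirmation from someone who knows the code, though.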
Best Regards,
Strahil Nikolov
On Mon, Nov 7, 2022 at 9:15, Shreyansh Shah <shreyansh.shah at alpha-grep.com> wrote:

Hi Strahil,
Adding the info below:

--------------------------------------
Node IP = 10.132.0.19
rebalance_status=1
status=4
rebalance_op=19
rebalance-id=39a89b51-2549-4348-aa47-0db321c3a32f
rebalanced-files=27054
size=7104425578505
scanned=72141
failures=10
skipped=19611
run-time=92805.000000
--------------------------------------
Node IP = 10.132.0.20
rebalance_status=1
status=4
rebalance_op=19
rebalance-id=39a89b51-2549-4348-aa47-0db321c3a32f
rebalanced-files=23945
size=7126809216060
scanned=71208
failures=7
skipped=18834
run-time=94029.000000
--------------------------------------
Node IP = 10.132.1.12
rebalance_status=1
status=4
rebalance_op=19
rebalance-id=39a89b51-2549-4348-aa47-0db321c3a32f
rebalanced-files=12533
size=12945021256
scanned=40398
failures=14
skipped=1194
run-time=92201.000000
--------------------------------------
Node IP = 10.132.1.13
rebalance_status=1
status=3
rebalance_op=19
rebalance-id=39a89b51-2549-4348-aa47-0db321c3a32f
rebalanced-files=41483
size=8845076025598
scanned=179920
failures=25
skipped=62373
run-time=130017.000000
--------------------------------------
Node IP = 10.132.1.14
rebalance_status=1
status=3
rebalance_op=19
rebalance-id=39a89b51-2549-4348-aa47-0db321c3a32f
rebalanced-files=43603
size=7834691799355
scanned=204140
failures=2878
skipped=87761
run-time=130016.000000
--------------------------------------
Node IP = 10.132.1.15
rebalance_status=1
status=4
rebalance_op=19
rebalance-id=39a89b51-2549-4348-aa47-0db321c3a32f
rebalanced-files=29968
size=6389568855140
scanned=69320
failures=7
skipped=17999
run-time=93654.000000
--------------------------------------
Node IP = 10.132.1.16
rebalance_status=1
status=4
rebalance_op=19
rebalance-id=39a89b51-2549-4348-aa47-0db321c3a32f
rebalanced-files=23226
size=5899338197718
scanned=56169
failures=7
skipped=12659
run-time=94030.000000
--------------------------------------
Node IP = 10.132.1.17
rebalance_status=1
status=4
rebalance_op=19
rebalance-id=39a89b51-2549-4348-aa47-0db321c3a32f
rebalanced-files=17538
size=6247281008602
scanned=50038
failures=8
skipped=11335
run-time=92203.000000
--------------------------------------
Node IP = 10.132.1.18
rebalance_status=1
status=4
rebalance_op=19
rebalance-id=39a89b51-2549-4348-aa47-0db321c3a32f
rebalanced-files=20394
size=6395008466977
scanned=50060
failures=7
skipped=13784
run-time=92103.000000
--------------------------------------
Node IP = 10.132.1.19
rebalance_status=1
status=1
rebalance_op=19
rebalance-id=39a89b51-2549-4348-aa47-0db321c3a32f
rebalanced-files=0
size=0
scanned=0
failures=0
skipped=0
run-time=0.000000
--------------------------------------
Node IP = 10.132.1.20
rebalance_status=1
status=3
rebalance_op=19
rebalance-id=39a89b51-2549-4348-aa47-0db321c3a32f
rebalanced-files=0
size=0
scanned=24
failures=0
skipped=2
run-time=1514.000000
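
In case anyone wants to re-pull the same state from every peer, a quick sketch of how it can be collected in one pass (assuming root SSH to each node; adjust the glusterd state path to your install prefix, ours lives under /usr/var/lib):

# pull node_state.info from every peer; assumes root SSH and our
# /usr prefix (use /var/lib/glusterd on a standard install)
for host in 10.132.0.19 10.132.0.20 10.132.1.12 10.132.1.13 10.132.1.14 \
            10.132.1.15 10.132.1.16 10.132.1.17 10.132.1.18 10.132.1.19 10.132.1.20
do
    echo "--------------------------------------"
    echo "Node IP = $host"
    ssh "root@$host" cat /usr/var/lib/glusterd/vols/data/node_state.info
done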

On Thu, Nov 3, 2022 at 10:10 PM Strahil Nikolov <hunter86_bg at yahoo.com> wrote:

And the other servers?
 
 
On Thu, Nov 3, 2022 at 16:21, Shreyansh Shah <shreyansh.shah at alpha-grep.com> wrote:

Hi Strahil,
Thank you for your reply. node_state.info has the data below:


root@gluster-11:/usr/var/lib/glusterd/vols/data# cat node_state.info
rebalance_status=1
status=3
rebalance_op=19
rebalance-id=39a89b51-2549-4348-aa47-0db321c3a32f
rebalanced-files=0
size=0
scanned=24
failures=0
skipped=2
run-time=1514.000000



On Thu, Nov 3, 2022 at 4:00 PM Strahil Nikolov <hunter86_bg at yahoo.com> wrote:

I would check the details in /var/lib/glusterd/vols/<VOLUME_NAME>/node_state.info
Best Regards,
Strahil Nikolov
On Wed, Nov 2, 2022 at 9:06, Shreyansh Shah <shreyansh.shah at alpha-grep.com> wrote:

Hi,
I would really appreciate it if someone could help with the above issue. We are stuck: we cannot run rebalance because of this, and the unbalanced data keeps us from getting peak performance out of the setup.
Adding gluster info (without the bricks) below. Please let me know if any other details/logs are needed.


Volume Name: data
Type: Distribute
Volume ID: 75410231-bb25-4f14-bcde-caf18fce1d31
Status: Started
Snapshot Count: 0
Number of Bricks: 41
Transport-type: tcp
Options Reconfigured:
server.event-threads: 4
network.ping-timeout: 90
client.keepalive-time: 60
server.keepalive-time: 60
storage.health-check-interval: 60
performance.client-io-threads: on
nfs.disable: on
transport.address-family: inet
performance.cache-size: 8GB
performance.cache-refresh-timeout: 60
cluster.min-free-disk: 3%
client.event-threads: 4
performance.io-thread-count: 16


On Fri, Oct 28, 2022 at 11:40 AM Shreyansh Shah <shreyansh.shah at alpha-grep.com> wrote:

Hi,
We are running a glusterfs 5.10 server volume. Recently we added a few new bricks and started a rebalance operation. After a couple of days the rebalance was simply stuck: one of the peers showed In-Progress with no files being read or transferred, and the rest showed Failed/Completed, so we stopped it with "gluster volume rebalance data stop". Now, when we try to start it again, we get the errors below. Any assistance would be appreciated.

root@gluster-11:~# gluster volume rebalance data status
volume rebalance: data: failed: Rebalance not started for volume data.
root@gluster-11:~# gluster volume rebalance data start
volume rebalance: data: failed: Rebalance on data is already started
root@gluster-11:~# gluster volume rebalance data stop
volume rebalance: data: failed: Rebalance not started for volume data.
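
If it helps with diagnosis, the obvious places to check next (a sketch, assuming the default log locations for a volume named "data"; exact messages vary by release):

# assuming default log locations for a volume named "data"
tail -n 50 /var/log/glusterfs/data-rebalance.log   # per-node rebalance daemon log
tail -n 50 /var/log/glusterfs/glusterd.log         # glusterd's view of the stop/start ops
ps aux | grep '[r]ebalance'                        # is a rebalance process still alive?

(The bracketed pattern in the grep just keeps grep from matching its own command line.)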
 

-- 
Regards,
Shreyansh Shah
AlphaGrep Securities Pvt. Ltd.




