[Bugs] [Bug 1475399] New: Rebalance estimate time sometimes shows negative values
bugzilla at redhat.com
Wed Jul 26 14:54:42 UTC 2017
https://bugzilla.redhat.com/show_bug.cgi?id=1475399
Bug ID: 1475399
Summary: Rebalance estimate time sometimes shows negative
values
Product: GlusterFS
Version: 3.12
Component: distribute
Severity: medium
Assignee: bugs at gluster.org
Reporter: nbalacha at redhat.com
CC: amukherj at redhat.com, bugs at gluster.org,
rhinduja at redhat.com, rhs-bugs at redhat.com,
storage-qa-internal at redhat.com, tdesala at redhat.com
Depends On: 1454602, 1457985
Blocks: 1460914, 1460894
+++ This bug was initially created as a clone of Bug #1457985 +++
+++ This bug was initially created as a clone of Bug #1454602 +++
Description of problem:
=======================
On a CIFS mount with a data set of empty directories and directories containing
files, I started removing a few bricks. When I ran the remove-brick status
command, the rebalance estimated time showed negative values.
I ran the status command about 21 times during the remove-brick rebalance, and
every time it showed negative values. On the 22nd attempt, the rebalance
estimated time showed positive values (at that point, the rebalance had been
running for almost 24 minutes).
[root at server1 samba]# gluster v remove-brick distrep server1:/bricks/brick6/b6 server2:/bricks/brick6/b6 server3:/bricks/brick6/b6 server4:/bricks/brick6/b6 status
Node                Rebalanced-files  size    scanned  failures  skipped  status       run time in h:m:s
------------------  ----------------  ------  -------  --------  -------  -----------  -----------------
localhost           2                 9.5KB   6        0         0        completed    0:15:16
server1.redhat.com  0                 0Bytes  0        0         0        in progress  0:21:32
server2.redhat.com  0                 0Bytes  0        0         0        in progress  0:00:00
server3.redhat.com  0                 0Bytes  0        0         0        in progress  0:21:21
Estimated time left for rebalance to complete : 2023406814:-21:-32
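The odd-looking estimate string above can be reproduced arithmetically. The
following Python sketch is a reconstruction, not the actual CLI code. It
assumes that the time left is an unsigned 64-bit value computed as an estimated
total of 0 seconds minus the 1292 seconds (0:21:32) of run time shown for
server1, that the hour count is printed through a signed 32-bit integer, and
that minutes and seconds come from a signed 32-bit cast with C-style truncating
remainder. All helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def to_int32(value):
    """Keep the low 32 bits and reinterpret them as a signed C int."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


def c_rem(a, b):
    """Remainder with C semantics: the result takes the sign of the dividend."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def c_div(a, b):
    """Integer division with C semantics: truncate toward zero."""
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q


def format_time_left(estimated_total, elapsed):
    # Unsigned 64-bit subtraction wraps around when elapsed > estimated_total.
    time_left = (estimated_total - elapsed) & MASK64
    # Assumption: the hour count is truncated to a 32-bit int when printed.
    hours = to_int32(time_left // 3600)
    # Assumption: the value is cast to a 32-bit int before the modulo.
    as_int = to_int32(time_left)
    minutes = c_div(c_rem(as_int, 3600), 60)
    seconds = c_rem(c_rem(as_int, 3600), 60)
    return f"{hours}:{minutes}:{seconds}"


# Estimated total of 0 s, elapsed run time of 0:21:32.
print(format_time_left(0, 21 * 60 + 32))  # 2023406814:-21:-32
```

Under these assumptions the sketch yields exactly the string shown in the
status output, which suggests a mix of unsigned wraparound and signed
formatting rather than a genuinely negative estimate.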
Version-Release number of selected component (if applicable):
3.8.4-25.el7rhgs.x86_64
How reproducible:
=================
1/1
Steps to Reproduce:
===================
1) Create a distributed-replicate volume and start it.
2) cifs mount the volume on a client.
3) Create a data set of empty directories and directories with files.
4) Remove a few bricks.
5) Keep running the remove-brick status command and check the "Estimated time
left for rebalance to complete" output.
Actual results:
===============
Rebalance estimate time sometimes shows negative values.
Expected results:
=================
Rebalance estimate time should not show negative values.
From distrep-rebalance.log in
sosreport-sysreg-prod.negativevalues-20170523062944:
[2017-05-23 05:54:01.319951] I [dht-rebalance.c:4425:gf_defrag_status_get] 0-glusterfs: TIME: num_files_lookedup=0,elapsed time = 51.000000,rate_lookedup=0.000000
[2017-05-23 05:54:01.320001] I [dht-rebalance.c:4428:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete = 0 seconds
[2017-05-23 05:54:01.320012] I [dht-rebalance.c:4431:gf_defrag_status_get] 0-glusterfs: TIME: Seconds left = 18446744073709551565 seconds
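The "Seconds left" value of 18446744073709551565 equals 2^64 - 51, which is
consistent with the estimated total time (0 seconds) minus the elapsed time
(51 seconds) wrapping around in an unsigned 64-bit variable. This skews the
estimate and causes the odd output seen in the status command.
This is easily reproducible by running a rebalance on a volume that contains
only directories (no files).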
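The "Seconds left" figure in the log excerpt is what an unsigned 64-bit
subtraction produces when the elapsed time exceeds the estimated total. The
Python sketch below illustrates this, assuming that seconds left is computed as
estimated total minus elapsed time. The log does not show the formula, so that
is an assumption, and the function name is hypothetical.

```python
UINT64_MODULUS = 1 << 64


def seconds_left_unsigned(estimated_total, elapsed):
    """Emulate `estimated_total - elapsed` on unsigned 64-bit integers."""
    return (estimated_total - elapsed) % UINT64_MODULUS


# Values taken from the log excerpt: estimated total = 0 s, elapsed = 51 s.
wrapped = seconds_left_unsigned(0, 51)
print(wrapped)                         # 18446744073709551565
print(wrapped == UINT64_MODULUS - 51)  # True
```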
--- Additional comment from Worker Ant on 2017-06-01 13:00:56 EDT ---
REVIEW: https://review.gluster.org/17448 (cluster/dht: Include dirs in
rebalance estimates) posted (#1) for review on master by N Balachandran
(nbalacha at redhat.com)
--- Additional comment from Worker Ant on 2017-06-07 00:02:27 EDT ---
COMMIT: https://review.gluster.org/17448 committed in master by Raghavendra G
(rgowdapp at redhat.com)
------
commit c9860430a77f20ddfec532819542bb1d0187c06e
Author: N Balachandran <nbalacha at redhat.com>
Date: Thu Jun 1 22:13:41 2017 +0530
cluster/dht: Include dirs in rebalance estimates
Empty directories were not being considered while
calculating rebalance estimates leading to negative
time-left values being displayed as part of the
rebalance status.
Change-Id: I48d41d702e72db30af10e6b87b628baa605afa98
BUG: 1457985
Signed-off-by: N Balachandran <nbalacha at redhat.com>
Reviewed-on: https://review.gluster.org/17448
Smoke: Gluster Build System <jenkins at build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
Reviewed-by: Amar Tumballi <amarts at redhat.com>
Reviewed-by: Raghavendra G <rgowdapp at redhat.com>
--- Additional comment from Nithya Balachandran on 2017-06-19 02:09:06 EDT ---
Moving this back to Post as there is one more patch required.
--- Additional comment from Worker Ant on 2017-06-19 02:26:48 EDT ---
REVIEW: https://review.gluster.org/17564 (cluster/dht: Additional checks for
rebalance estimates) posted (#1) for review on master by N Balachandran
(nbalacha at redhat.com)
--- Additional comment from Worker Ant on 2017-06-20 08:31:51 EDT ---
COMMIT: https://review.gluster.org/17564 committed in master by Jeff Darcy
(jeff at pl.atyp.us)
------
commit 426c2908aee4addd36d925ee93e56c9b2688ec1b
Author: N Balachandran <nbalacha at redhat.com>
Date: Mon Jun 19 11:50:28 2017 +0530
cluster/dht: Additional checks for rebalance estimates
The rebalance estimates calculation was not handling
calculations correctly when no files had been processed,
i.e., when rate_lookedup was 0.
Now, the estimated time is set to 0 in such scenarios as
there is no way for rebalance to figure out how long the
process will take to complete without knowing the rate at
which the files are being processed.
Change-Id: I7b6378e297e1ba139852bcb2239adf2477336b5b
BUG: 1457985
Signed-off-by: N Balachandran <nbalacha at redhat.com>
Reviewed-on: https://review.gluster.org/17564
Smoke: Gluster Build System <jenkins at build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
Reviewed-by: Amar Tumballi <amarts at redhat.com>
Reviewed-by: Raghavendra G <rgowdapp at redhat.com>
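The guard described in this commit message can be sketched as follows. This is
a minimal Python illustration, not the dht-rebalance.c code: it assumes the
estimate is the total number of entries divided by the lookup rate, and the
function and parameter names are hypothetical. Clamping the remaining time at
zero is an extra assumption added here for illustration.

```python
def estimate_total_seconds(total_entries, entries_looked_up, elapsed_seconds):
    """Return the estimated total run time in seconds, or 0 when unknown."""
    if elapsed_seconds <= 0 or entries_looked_up <= 0:
        # The rate would be 0 (or undefined), so no meaningful estimate exists.
        return 0
    rate = entries_looked_up / elapsed_seconds
    return int(total_entries / rate)


def seconds_left(total_entries, entries_looked_up, elapsed_seconds):
    """Return the remaining seconds, never less than zero (assumed clamp)."""
    total = estimate_total_seconds(total_entries, entries_looked_up, elapsed_seconds)
    return max(total - int(elapsed_seconds), 0)


# Nothing looked up after 51 seconds: the estimate falls back to 0.
print(estimate_total_seconds(1000, 0, 51.0))
# 100 of 1000 entries looked up in 50 seconds.
print(seconds_left(1000, 100, 50.0))
```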
--- Additional comment from Worker Ant on 2017-07-24 09:05:14 EDT ---
REVIEW: https://review.gluster.org/17863 (cluster/dht: Fix negative rebalance
estimates) posted (#1) for review on master by N Balachandran
(nbalacha at redhat.com)
--- Additional comment from Worker Ant on 2017-07-25 13:03:05 EDT ---
REVIEW: https://review.gluster.org/17863 (cluster/dht: Fix negative rebalance
estimates) posted (#2) for review on master by N Balachandran
(nbalacha at redhat.com)
--- Additional comment from Worker Ant on 2017-07-26 05:25:03 EDT ---
COMMIT: https://review.gluster.org/17863 committed in master by Atin Mukherjee
(amukherj at redhat.com)
------
commit e21c915679244ddc1fae886e52badf02b4d95efc
Author: N Balachandran <nbalacha at redhat.com>
Date: Mon Jul 24 18:27:39 2017 +0530
cluster/dht: Fix negative rebalance estimates
The calculation of the rebalance estimates will start
after the rebalance operation has been running for 10
minutes. This patch also changes the cli rebalance status
code to use unsigned variables for the time calculations.
Change-Id: Ic76f517c59ad938a407f1cf5e3b9add571690a6c
BUG: 1457985
Signed-off-by: N Balachandran <nbalacha at redhat.com>
Reviewed-on: https://review.gluster.org/17863
Reviewed-by: Amar Tumballi <amarts at redhat.com>
Smoke: Gluster Build System <jenkins at build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj at redhat.com>
CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
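The behaviour described in this commit message can be sketched in Python as
follows. Only the 10-minute threshold comes from the commit message. Returning
None before the threshold, clamping at zero, and all names are assumptions made
for illustration and do not reflect the actual GlusterFS code.

```python
ESTIMATE_START_SECONDS = 10 * 60  # threshold stated in the commit message


def time_left_or_none(estimated_total, elapsed):
    """Return seconds left, or None while it is too early to estimate."""
    if elapsed < ESTIMATE_START_SECONDS:
        return None
    # Assumed clamp: never report a value below zero.
    return max(estimated_total - elapsed, 0)


def format_hms(seconds):
    """Format a non-negative number of seconds as h:mm:ss."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


for elapsed in (51, 1292):
    left = time_left_or_none(5000, elapsed)
    print("not yet available" if left is None else format_hms(left))
```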
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1454602
[Bug 1454602] Rebalance estimate time sometimes shows negative values
https://bugzilla.redhat.com/show_bug.cgi?id=1457985
[Bug 1457985] Rebalance estimate time sometimes shows negative values
https://bugzilla.redhat.com/show_bug.cgi?id=1460894
[Bug 1460894] Rebalance estimate time sometimes shows negative values
https://bugzilla.redhat.com/show_bug.cgi?id=1460914
[Bug 1460914] Rebalance estimate time sometimes shows negative values