[Bugs] [Bug 1352771] New: [DHT]: Rebalance info for remove brick operation is not showing after glusterd restart
bugzilla at redhat.com
Tue Jul 5 03:44:52 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1352771
Bug ID: 1352771
Summary: [DHT]: Rebalance info for remove brick operation is
not showing after glusterd restart
Product: GlusterFS
Version: 3.8.0
Component: distribute
Keywords: ZStream
Assignee: bugs at gluster.org
Reporter: sabansal at redhat.com
CC: amukherj at redhat.com, bsrirama at redhat.com,
bugs at gluster.org, byarlaga at redhat.com,
kramdoss at redhat.com, nbalacha at redhat.com,
sabansal at redhat.com, sasundar at redhat.com,
smohan at redhat.com, storage-qa-internal at redhat.com
Depends On: 1296796, 1351021
+++ This bug was initially created as a clone of Bug #1351021 +++
+++ This bug was initially created as a clone of Bug #1296796 +++
Description of problem:
=======================
Had a two-node cluster (node-1 and node-2) with a Distributed volume (1x2),
mounted it as FUSE, and started IO. While IO was in progress, started a
remove-brick operation and restarted glusterd on the node hosting the brick
being removed. After the glusterd restart, the rebalance info
("Rebalanced-files", "size", "scanned", etc.) is no longer displayed; all the
values are shown as zeros.
Version-Release number of selected component (if applicable):
==============================================================
glusterfs-3.7.5-14
How reproducible:
=================
Always
Steps to Reproduce:
===================
1. Have a two-node cluster (node-1 and node-2)
2. Create a Distributed volume using bricks from both nodes (1x2)
3. Mount the volume as FUSE and start IO
4. While IO is in progress, start a remove-brick of the node-2 brick
5. Check the remove-brick status                // it shows the rebalance info
6. Stop and start glusterd on node-2
7. Check the remove-brick status again on both nodes // it no longer shows the
   rebalance info (a condensed reproduction sketch follows)
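A minimal reproduction sketch of the steps above; node names, brick paths,
the mount point, and the volume name are placeholders, not taken from the
console log below:

# node-1: create and start a 2-brick distribute volume, then mount it
gluster volume create Dis node-1:/bricks/brick0/b0 node-2:/bricks/brick0/b1
gluster volume start Dis
mount -t glusterfs node-1:/Dis /mnt/Dis

# node-1: with IO running against /mnt/Dis, start removing the node-2 brick
gluster volume remove-brick Dis node-2:/bricks/brick0/b1 start
gluster volume remove-brick Dis node-2:/bricks/brick0/b1 status   # counters populated

# node-2: restart glusterd while the remove-brick rebalance is running
systemctl stop glusterd
systemctl start glusterd

# node-1: the same status command now reports all counters as zero
gluster volume remove-brick Dis node-2:/bricks/brick0/b1 status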
Actual results:
===============
No rebalance info is displayed after the glusterd restart.
Expected results:
=================
The rebalance info should still be shown after a glusterd restart.
Console log:
============
[root@dhcp42-84 ~]# gluster volume status
Status of volume: Dis
Gluster process                          TCP Port   RDMA Port   Online   Pid
------------------------------------------------------------------------------
Brick 10.70.42.84:/bricks/brick0/abc0    49272      0           Y        2916
Brick 10.70.42.84:/bricks/brick1/abc1    49273      0           Y        2935
Brick 10.70.43.35:/bricks/brick0/abc2    49155      0           Y        30032
NFS Server on localhost                  2049       0           Y        3804
NFS Server on 10.70.43.35                2049       0           Y        30324

Task Status of Volume Dis
------------------------------------------------------------------------------
There are no active volume tasks

[root@dhcp42-84 ~]#
[root@dhcp42-84 ~]# gluster volume remove-brick Dis 10.70.43.35:/bricks/brick0/abc2 start
volume remove-brick start: success
ID: b2e6507e-838f-4cc4-9061-aa7ba84d9b30
[root@dhcp42-84 ~]#
[root@dhcp42-84 ~]# gluster volume remove-brick Dis 10.70.43.35:/bricks/brick0/abc2 status
Node          Rebalanced-files   size      scanned   failures   skipped   status        run time in secs
-----------   ----------------   -------   -------   --------   -------   -----------   ----------------
10.70.43.35   102                411.8KB   275       0          0         in progress   4.00
[root@dhcp42-84 ~]#
[root@dhcp42-84 ~]# gluster volume remove-brick Dis 10.70.43.35:/bricks/brick0/abc2 status
Node          Rebalanced-files   size      scanned   failures   skipped   status        run time in secs
-----------   ----------------   -------   -------   --------   -------   -----------   ----------------
10.70.43.35   140                978.1KB   340       0          0         in progress   6.00
[root@dhcp42-84 ~]#
Stop and Start GlusterD:
========================
[root@dhcp43-35 ~]# systemctl stop glusterd
[root@dhcp43-35 ~]# systemctl start glusterd
[root@dhcp42-84 ~]# gluster volume remove-brick Dis 10.70.43.35:/bricks/brick0/abc2 status
Node          Rebalanced-files   size      scanned   failures   skipped   status        run time in secs
-----------   ----------------   -------   -------   --------   -------   -----------   ----------------
[root@dhcp42-84 ~]#
[root@dhcp42-84 ~]# gluster volume remove-brick Dis 10.70.43.35:/bricks/brick0/abc2 status
Node          Rebalanced-files   size      scanned   failures   skipped   status        run time in secs
-----------   ----------------   -------   -------   --------   -------   -----------   ----------------
10.70.43.35   0                  0Bytes    0         0          0         in progress   1.00
[root@dhcp42-84 ~]#
[root@dhcp42-84 ~]# gluster volume remove-brick Dis 10.70.43.35:/bricks/brick0/abc2 status
Node          Rebalanced-files   size      scanned   failures   skipped   status        run time in secs
-----------   ----------------   -------   -------   --------   -------   -----------   ----------------
10.70.43.35   0                  0Bytes    0         0          0         in progress   2.00
[root@dhcp42-84 ~]#
[root@dhcp42-84 ~]# gluster volume remove-brick Dis 10.70.43.35:/bricks/brick0/abc2 status
Node          Rebalanced-files   size      scanned   failures   skipped   status        run time in secs
-----------   ----------------   -------   -------   --------   -------   -----------   ----------------
10.70.43.35   0                  0Bytes    0         0          0         in progress   4.00
[root@dhcp42-84 ~]# gluster volume remove-brick Dis 10.70.43.35:/bricks/brick0/abc2 status
Node          Rebalanced-files   size      scanned   failures   skipped   status        run time in secs
-----------   ----------------   -------   -------   --------   -------   -----------   ----------------
10.70.43.35   0                  0Bytes    0         0          0         completed     13.00
[root@dhcp42-84 ~]# gluster volume remove-brick Dis 10.70.43.35:/bricks/brick0/abc2 status
Node          Rebalanced-files   size      scanned   failures   skipped   status        run time in secs
-----------   ----------------   -------   -------   --------   -------   -----------   ----------------
10.70.43.35   0                  0Bytes    0         0          0         completed     13.00
[root@dhcp42-84 ~]# gluster volume remove-brick Dis 10.70.43.35:/bricks/brick0/abc2 status
Node          Rebalanced-files   size      scanned   failures   skipped   status        run time in secs
-----------   ----------------   -------   -------   --------   -------   -----------   ----------------
10.70.43.35   0                  0Bytes    0         0          0         completed     13.00
[root@dhcp42-84 ~]#
[root@dhcp42-84 ~]# gluster volume info
Volume Name: Dis-Rep
Type: Distributed-Replicate
Volume ID: 69667c02-408f-41a9-b83e-c1684e69ef03
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.42.84:/bricks/brick0/sbr00
Brick2: 10.70.42.84:/bricks/brick1/sbr11
Brick3: 10.70.43.35:/bricks/brick0/sbr22
Brick4: 10.70.43.35:/bricks/brick1/sbr33
Options Reconfigured:
performance.readdir-ahead: on
[root@dhcp42-84 ~]#
[root@dhcp42-84 ~]# gluster volume status
Status of volume: Dis-Rep
Gluster process                          TCP Port   RDMA Port   Online   Pid
------------------------------------------------------------------------------
Brick 10.70.42.84:/bricks/brick0/sbr00   49282      0           Y        3129
Brick 10.70.42.84:/bricks/brick1/sbr11   49283      0           Y        3148
Brick 10.70.43.35:/bricks/brick0/sbr22   49165      0           Y        7257
Brick 10.70.43.35:/bricks/brick1/sbr33   49166      0           Y        7276
NFS Server on localhost                  2049       0           Y        3170
Self-heal Daemon on localhost            N/A        N/A         Y        3175
NFS Server on 10.70.43.35                2049       0           Y        7298
Self-heal Daemon on 10.70.43.35          N/A        N/A         Y        7303

Task Status of Volume Dis-Rep
------------------------------------------------------------------------------
There are no active volume tasks
[root@dhcp42-84 ~]#
[root@dhcp42-84 ~]# gluster volume remove-brick Dis-Rep replica 2 10.70.43.35:/bricks/brick0/sbr22 10.70.43.35:/bricks/brick1/sbr33 start
volume remove-brick start: success
ID: 5ca18e2e-43c9-481f-ab5a-aae02240bb97
[root@dhcp42-84 ~]#
[root@dhcp42-84 ~]# gluster volume remove-brick Dis-Rep replica 2 10.70.43.35:/bricks/brick0/sbr22 10.70.43.35:/bricks/brick1/sbr33 status
Node          Rebalanced-files   size      scanned   failures   skipped   status        run time in secs
-----------   ----------------   -------   -------   --------   -------   -----------   ----------------
10.70.43.35   50                 335.1KB   200       0          0         in progress   4.00
[root@dhcp42-84 ~]#
[root@dhcp42-84 ~]# gluster volume remove-brick Dis-Rep replica 2 10.70.43.35:/bricks/brick0/sbr22 10.70.43.35:/bricks/brick1/sbr33 status
Node          Rebalanced-files   size      scanned   failures   skipped   status        run time in secs
-----------   ----------------   -------   -------   --------   -------   -----------   ----------------
10.70.43.35   108                548.9KB   372       0          0         in progress   9.00
[root@dhcp42-84 ~]#
<<<<<<<<<Stop and Start Glusterd>>>>>>>>>>
[root@dhcp43-35 ~]# systemctl stop glusterd
[root@dhcp43-35 ~]# systemctl start glusterd
<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>
[root@dhcp42-84 ~]# gluster volume remove-brick Dis-Rep replica 2 10.70.43.35:/bricks/brick0/sbr22 10.70.43.35:/bricks/brick1/sbr33 status
Node          Rebalanced-files   size      scanned   failures   skipped   status        run time in secs
-----------   ----------------   -------   -------   --------   -------   -----------   ----------------
10.70.43.35   0                  0Bytes    0         0          0         in progress   0.00
[root@dhcp42-84 ~]#
[root@dhcp42-84 ~]# gluster volume remove-brick Dis-Rep replica 2 10.70.43.35:/bricks/brick0/sbr22 10.70.43.35:/bricks/brick1/sbr33 status
Node          Rebalanced-files   size      scanned   failures   skipped   status        run time in secs
-----------   ----------------   -------   -------   --------   -------   -----------   ----------------
10.70.43.35   0                  0Bytes    0         0          0         in progress   0.00
[root@dhcp42-84 ~]# gluster volume remove-brick Dis-Rep replica 2 10.70.43.35:/bricks/brick0/sbr22 10.70.43.35:/bricks/brick1/sbr33 status
Node          Rebalanced-files   size      scanned   failures   skipped   status        run time in secs
-----------   ----------------   -------   -------   --------   -------   -----------   ----------------
10.70.43.35   0                  0Bytes    0         0          0         in progress   0.00
[root@dhcp42-84 ~]# gluster volume remove-brick Dis-Rep replica 2 10.70.43.35:/bricks/brick0/sbr22 10.70.43.35:/bricks/brick1/sbr33 status
Node          Rebalanced-files   size      scanned   failures   skipped   status        run time in secs
-----------   ----------------   -------   -------   --------   -------   -----------   ----------------
10.70.43.35   0                  0Bytes    0         0          0         in progress   0.00
[root@dhcp42-84 ~]#
Thanks
This bug exists for all volume types. The issue is that only the rebalance
status is stored in the node_state.info file; on restarting glusterd, that
status is retrieved and displayed. The other values, such as rebalanced files
and scanned files, are not stored in node_state.info and hence are not
available for display after glusterd restarts.
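For illustration, a sketch of what this looks like on disk; the file path is
real, but the key names and values shown are assumptions based on the fix
under review, not captured output:

# Before the fix, node_state.info only records the rebalance status/op/id:
[root@dhcp43-35 ~]# cat /var/lib/glusterd/vols/Dis/node_state.info
rebalance_status=1
status=0
rebalance_op=19
rebalance-id=b2e6507e-838f-4cc4-9061-aa7ba84d9b30

# With the fix, the per-node counters would be persisted as well, so a freshly
# restarted glusterd can still report them (illustrative keys/values):
rebalanced-files=140
size=1001574
scanned=340
failures=0
skipped=0
run-time=6.00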
--- Additional comment from Vijay Bellur on 2016-06-29 03:25:15 EDT ---
REVIEW: http://review.gluster.org/14827 (glusterd: glusterd must store all
rebalance related information) posted (#1) for review on master by Sakshi
Bansal
--- Additional comment from Vijay Bellur on 2016-07-04 04:17:12 EDT ---
REVIEW: http://review.gluster.org/14827 (glusterd: glusterd must store all
rebalance related information) posted (#2) for review on master by Sakshi
Bansal
--- Additional comment from Vijay Bellur on 2016-07-04 08:35:00 EDT ---
COMMIT: http://review.gluster.org/14827 committed in master by Atin Mukherjee
(amukherj at redhat.com)
------
commit 0cd287189e5e9f876022a8c6481195bdc63ce5f8
Author: Sakshi Bansal <sabansal at redhat.com>
Date: Wed Jun 29 12:09:06 2016 +0530
glusterd: glusterd must store all rebalance related information
Change-Id: I8404b864a405411e3af2fbee46ca20330e656045
BUG: 1351021
Signed-off-by: Sakshi Bansal <sabansal at redhat.com>
Reviewed-on: http://review.gluster.org/14827
Smoke: Gluster Build System <jenkins at build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj at redhat.com>
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1296796
[Bug 1296796] [DHT]: Rebalance info for remove brick operation is not
showing after glusterd restart
https://bugzilla.redhat.com/show_bug.cgi?id=1351021
[Bug 1351021] [DHT]: Rebalance info for remove brick operation is not
showing after glusterd restart
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.