[Bugs] [Bug 1312722] New: DHT - rebalance - when any brick/sub-vol is down and rebalance is not performing any action(fixing lay-out or migrating data) it should not say 'Starting rebalance on volume <vol-name> has been successful' .

Mon Feb 29 03:54:32 UTC 2016

https://bugzilla.redhat.com/show_bug.cgi?id=1312722

            Bug ID: 1312722
           Summary: DHT - rebalance - when any brick/sub-vol is down and
                    rebalance is not performing any action(fixing lay-out
                    or migrating data) it should not say 'Starting
                    rebalance on volume <vol-name> has been successful' .
           Product: GlusterFS
           Version: 3.7.9
         Component: distribute
          Keywords: Triaged
          Severity: medium
          Priority: low
          Assignee: bugs at gluster.org
          Reporter: sabansal at redhat.com
                CC: amukherj at redhat.com, bugs at gluster.org,
                    mzywusko at redhat.com, nbalacha at redhat.com,
                    racpatel at redhat.com, rhs-bugs at redhat.com,
                    smohan at redhat.com, vbellur at redhat.com
        Depends On: 1224857

+++ This bug was initially created as a clone of Bug #1224857 +++

+++ This bug was initially created as a clone of Bug #890637 +++

Description of problem:
DHT - rebalance - when any brick/sub-vol is down, rebalance does not perform
any action, but the CLI says 'Starting rebalance on volume <vol-name> has been
successful'.

Version-Release number of selected component (if applicable):
3.3.0.5rhs-40

How reproducible:
always

Steps to Reproduce:
1. Create a Distributed volume having 3 or more sub-volumes on multiple servers
and start that volume.

2. Fuse Mount the volume from the client-1 using “mount -t glusterfs 
server:/<volume> <client-1_mount_point>”

3. From mount point create some dirs and files inside it.
4. Bring one of the sub-volumes down.
[root@localhost ~]# gluster volume status
Status of volume: defect
Gluster process                              Port     Online   Pid
------------------------------------------------------------------------------
Brick 10.70.35.173:/home/def1                24011    Y        6440
Brick 10.70.35.180:/home/def1                24011    Y        28882
Brick 10.70.35.170:/home/def1                24011    N        27711
NFS Server on localhost                      38467    Y        6608
NFS Server on 10.70.35.170                   38467    Y        6153
NFS Server on 10.70.35.173                   38467    Y        6446

5. Execute rebalance.
[root@localhost ~]# gluster volume rebalance defect fix-layout start
Starting rebalance on volume defect has been successful

6. check status and log
[root@localhost ~]# gluster volume rebalance defect status
                                    Node Rebalanced-files          size       scanned      failures         status
                               ---------      -----------   -----------   -----------   -----------   ------------
                               localhost                0             0             0             1         failed
                            10.70.35.173                0             0             0             1         failed
                            10.70.35.170                0             0             0             1         failed
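
The status output above can also be checked mechanically. The following is a
minimal sketch, not part of the original report. It assumes the CLI prints each
node's row on a single line (rows may appear wrapped in mail archives) and that
the status word is `failed`, as shown here. The function name `failed_nodes` is
hypothetical.

```shell
# Print the node name (first column) of every row whose status is "failed".
# Reads `gluster volume rebalance <vol> status` output on stdin.
failed_nodes() {
    awk '{ for (i = 2; i <= NF; i++) if ($i == "failed") { print $1; break } }'
}
```

Possible usage: `gluster volume rebalance defect status | failed_nodes`.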

log:-
[2012-12-28 09:55:48.833293] I [dht-common.c:2337:dht_setxattr] 0-defect-dht: fixing the layout of /
[2012-12-28 09:55:48.833309] W [dht-selfheal.c:603:dht_fix_layout_of_directory] 0-defect-dht: 1 subvolume(s) are down. Skipping fix layout.
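
The warning quoted above is the only sign that fix-layout was skipped, so a
script can search the log for it. This is a rough sketch that matches the
message text exactly as it appears in this report. The function name
`fix_layout_skipped` is hypothetical, and the log location in the usage note is
an assumption about a typical installation.

```shell
# Exit 0 if the log text on stdin contains the DHT warning quoted above,
# non-zero otherwise. -F matches the string literally, -q suppresses output.
fix_layout_skipped() {
    grep -qF 'subvolume(s) are down. Skipping fix layout'
}
```

Possible usage, assuming the rebalance log lives at
/var/log/glusterfs/defect-rebalance.log:
`fix_layout_skipped < /var/log/glusterfs/defect-rebalance.log && echo "fix-layout was skipped"`.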

Actual results:
[root@localhost ~]# gluster volume rebalance defect fix-layout start
Starting rebalance on volume defect has been successful

Expected results:
All sub-volumes/bricks being up is a basic precondition for rebalance. So when
a sub-volume or brick is down, the CLI should give a proper message indicating
that rebalance was not started because one of the bricks/sub-volumes is down,
rather than saying it started.
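
Until the CLI behaves this way, the expected check can be approximated from the
shell. The sketch below is an illustration, not the eventual fix. It assumes the
`gluster volume status` column layout shown in step 4, where the Online flag is
the second-to-last field of each "Brick" line. The names `offline_bricks` and
`safe_rebalance_start` are hypothetical.

```shell
# Print the host:path of every brick whose Online column is not "Y".
# Reads `gluster volume status <vol>` output on stdin.
offline_bricks() {
    awk '$1 == "Brick" && $(NF-1) != "Y" { print $2 }'
}

# Hypothetical wrapper: start fix-layout only when no brick is offline.
safe_rebalance_start() {
    vol=$1
    down=$(gluster volume status "$vol" | offline_bricks)
    if [ -n "$down" ]; then
        echo "Not starting rebalance on $vol; bricks down:" >&2
        echo "$down" >&2
        return 1
    fi
    gluster volume rebalance "$vol" fix-layout start
}
```

With the status output from step 4, `safe_rebalance_start defect` would report
10.70.35.170:/home/def1 as down and return without starting rebalance.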

Additional info:

--- Additional comment from Rachana Patel on 2012-12-28 06:19:40 EST ---

correction:-
Description of problem:
DHT - rebalance - when any brick/sub-vol is down, rebalance will not be
performing any action but cli says 'Starting rebalance on volume <vol-name> has
been successful'.

--- Additional comment from Scott Haines on 2013-09-27 13:07:32 EDT ---

Targeting for 3.0.0 (Denali) release.

--- Additional comment from Vivek Agarwal on 2014-04-07 07:40:57 EDT ---

Per bug triage, between dev, PM and QA, moving these out of denali

--- Additional comment from Anand Avati on 2015-05-27 10:36:38 EDT ---

REVIEW: http://review.gluster.org/10906 (dht: check if all bricks are started
before performing rebalance) posted (#2) for review on master by Sakshi Bansal
(sabansal at redhat.com)

--- Additional comment from Vijay Bellur on 2016-02-03 06:59:43 EST ---

REVIEW: http://review.gluster.org/10906 (glusterd: check if glusterd is started
on all nodes and all bricks are started before performing rebalance)
posted (#5) for review on master by Sakshi Bansal

--- Additional comment from Vijay Bellur on 2016-02-28 11:23:54 EST ---

COMMIT: http://review.gluster.org/10906 committed in master by Atin Mukherjee
(amukherj at redhat.com) 
------
commit 368e26f454fe35477e46dc698fa6b8c3c608ea8d
Author: Sakshi <sabansal at redhat.com>
Date:   Tue May 26 09:53:55 2015 +0530

    glusterd: check if glusterd is started on all nodes and all
              bricks are started before performing rebalance

    Change-Id: I458ea9cd86cf35bdb7d758be55f951ae9f3e66f0
    BUG: 1224857
    Signed-off-by: Sakshi <sabansal at redhat.com>
    Reviewed-on: http://review.gluster.org/10906
    Smoke: Gluster Build System <jenkins at build.gluster.com>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    Reviewed-by: Atin Mukherjee <amukherj at redhat.com>
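
The committed change lives inside glusterd and is not reproduced here. As a
CLI-side illustration of the "glusterd is started on all nodes" half of the
check, the sketch below lists peers that are not connected. It assumes
`gluster peer status` prints a "Hostname:" line followed by a "State:" line for
each peer, with "(Connected)" in the state of a reachable peer. The function
name `disconnected_peers` is hypothetical.

```shell
# Print the hostname of every peer whose State line lacks "(Connected)".
# Reads `gluster peer status` output on stdin.
disconnected_peers() {
    awk '/^Hostname:/ { host = $2 }
         /^State:/ && !/\(Connected\)/ { print host }'
}
```

Possible usage: `gluster peer status | disconnected_peers`. Empty output would
suggest that every peer is reachable.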

Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1224857
[Bug 1224857] DHT - rebalance - when any brick/sub-vol is down and
rebalance is not performing any action(fixing lay-out or migrating data) it
should not say 'Starting rebalance on volume <vol-name> has been
successful' .