[Bugs] [Bug 1374135] New: Rebalance is not considering the brick sizes while fixing the layout
bugzilla at redhat.com
Thu Sep 8 03:58:37 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1374135
Bug ID: 1374135
Summary: Rebalance is not considering the brick sizes while
fixing the layout
Product: GlusterFS
Version: 3.8.3
Component: distribute
Assignee: bugs at gluster.org
Reporter: nbalacha at redhat.com
CC: bugs at gluster.org, jdarcy at redhat.com,
mzywusko at redhat.com, nbalacha at redhat.com,
rcyriac at redhat.com, rgowdapp at redhat.com,
rhinduja at redhat.com, smohan at redhat.com,
spalai at redhat.com, storage-qa-internal at redhat.com
Depends On: 1257182, 1366494
+++ This bug was initially created as a clone of Bug #1366494 +++
+++ This bug was initially created as a clone of Bug #1257182 +++
Problem statement:
============================
Rebalance is not considering the brick sizes while fixing the layout of the
volume
Steps/procedure:
1. Create a distribute volume using one brick of 100 GB.
2. Mount it on the client using FUSE and create a directory and 1000 files.
3. Add a brick of 200 GB from another node and run the rebalance from the same
node.
Actual results:
================
Although Brick2 is 200 GB, it holds only 327 files, while the other (100 GB)
brick holds 676. The directory ranges are given below:
[root at rhs-client9 dht4]# getfattr -d -m . -e hex /rhs/brick2/dht4/data
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick2/dht4/data
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x2bcf9f94144a4decb533a419885784cc
trusted.glusterfs.dht=0x0000000100000000aaa972d0ffffffff (200 GB Brick)
[root at rhs-client4 dht4]# getfattr -d -m . -e hex /rhs/brick1/dht4/data
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/dht4/data
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x2bcf9f94144a4decb533a419885784cc
trusted.glusterfs.dht=0x000000010000000000000000aaa972cf (100 GB Brick)
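The two `trusted.glusterfs.dht` values above can be decoded with a short Python sketch. It assumes the value is four big-endian 32-bit words and that the last two are the start and stop of the directory's hash range on that brick; this matches the ranges quoted later in this report, but the meaning of the first two words is not covered here. `decode_dht_xattr` is a hypothetical helper name, not part of GlusterFS.

```python
import struct

def decode_dht_xattr(hex_value):
    """Return (start, stop) of the hash range encoded in a hex xattr value.

    Assumption: the value is four big-endian 32-bit words and the last
    two words are the inclusive start and stop of the range.
    """
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    words = struct.unpack(">4I", bytes.fromhex(digits))
    return words[2], words[3]

# Values copied from the getfattr output in this report.
brick1 = decode_dht_xattr("0x000000010000000000000000aaa972cf")  # 100 GB brick
brick2 = decode_dht_xattr("0x0000000100000000aaa972d0ffffffff")  # 200 GB brick

for name, (start, stop) in (("brick1 (100 GB)", brick1),
                            ("brick2 (200 GB)", brick2)):
    print(f"{name}: 0x{start:08x}-0x{stop:08x}")
```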
Expected results:
==================
While fixing the layout, rebalance should consider the brick sizes.
Output:
===================
[root at rhs-client4 dht4]# gluster vol status dht4
Status of volume: dht4
Gluster process                                            TCP Port  RDMA Port  Online  Pid
---------------------------------------------------------------------------------------------
Brick rhs-client4.lab.eng.blr.redhat.com:/rhs/brick1/dht4  49158     0          Y       20117
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick2/dht4  49157     0          Y       29628
NFS Server on localhost                                    2049      0          Y       20301
NFS Server on rhs-client39.lab.eng.blr.redhat.com          N/A       N/A        N       N/A
NFS Server on rhs-client9.lab.eng.blr.redhat.com           N/A       N/A        N       N/A
Task Status of Volume dht4
------------------------------------------------------------------------------
Task : Rebalance
ID : b93f08b3-e59c-4e30-bd0f-b405e553bdb3
Status : completed
[root at rhs-client9 dht4]# df -h | grep brick2
/dev/mapper/rhel_rhs--client9-vol1 200G 60M 200G 1% /rhs/brick2
[root at rhs-client4 dht4]# df -h | grep brick1
/dev/mapper/rhgs_rhs--client4-vol1 100G 84M 100G 1% /rhs/brick1
--- Additional comment from Red Hat Bugzilla Rules Engine on 2015-08-26
08:33:08 EDT ---
This bug is automatically being proposed for the current z-stream release of
Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'.
If this bug should be proposed for a different release, please manually change
the proposed release flag.
--- Additional comment from Raghavendra G on 2016-06-28 02:42:03 EDT ---
The ranges allocated are:
>>> 0xffffffff - 0xaaa972d0
1431735599
>>> 0xaaa972cf
2863231695
Though the ranges are in the ratio 1:2, they are allocated to the wrong bricks:
the larger range is allocated to the smaller brick. This needs to be fixed.
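The mismatch can be made explicit by comparing each brick's share of the 32-bit hash space with its share of the total capacity. This is only a worked version of the arithmetic above, using the ranges and brick sizes quoted in this report; the variable names are illustrative.

```python
HASH_SPACE = 2 ** 32  # hash values run from 0x00000000 to 0xffffffff

# Inclusive ranges and brick sizes as reported in this bug.
ranges = {
    "brick1": (0x00000000, 0xaaa972cf),
    "brick2": (0xaaa972d0, 0xffffffff),
}
capacity_gb = {"brick1": 100, "brick2": 200}

total_capacity = sum(capacity_gb.values())
share = {}     # fraction of the hash space actually assigned
expected = {}  # fraction a size-weighted layout would assign

for brick, (start, stop) in ranges.items():
    share[brick] = (stop - start + 1) / HASH_SPACE
    expected[brick] = capacity_gb[brick] / total_capacity
    print(f"{brick}: assigned {share[brick]:.3f}, "
          f"expected by size {expected[brick]:.3f}")
```

The assigned shares are also close to the observed file split (676 and 327 entries), which is what a hash-range mismatch of this kind would produce.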
--- Additional comment from John Skeoch on 2016-07-13 18:35:18 EDT ---
User rmekala at redhat.com's account has been closed
--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-08-09
07:17:24 EDT ---
Since this bug has been approved for the RHGS 3.2.0 release of Red Hat Gluster
Storage 3, through release flag 'rhgs-3.2.0+', and through the Internal
Whiteboard entry of '3.2.0', the Target Release is being automatically set to
'RHGS 3.2.0'
--- Additional comment from Nithya Balachandran on 2016-08-16 00:51:50 EDT ---
RCA:
The volume was created with a single brick. On adding a second much larger
brick and running a rebalance, the layout is recalculated for all existing
directories by calling dht_fix_layout_of_directory (). This function generates
a new weighted layout in dht_selfheal_layout_new_directory () but then calls
dht_selfheal_layout_maximize_overlap () on the newly generated layout. The
latter function does not consider the relative brick sizes and, as the original
brick had a complete layout (0x00000000-0xffffffff), the layout is swapped to
maximize the overlap with the old layout.
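The effect described in this RCA can be illustrated with a toy model in Python. This is not the GlusterFS implementation. It assumes that the weighted layout initially gave the 200 GB brick the larger range (the pre-swap assignment is not shown in this report) and that the optimization simply keeps whichever assignment overlaps most with the old layout. `overlap` and `total_overlap` are hypothetical helpers.

```python
def overlap(a, b):
    """Count hash values shared by two inclusive ranges (None = no range)."""
    if a is None or b is None:
        return 0
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return max(0, hi - lo + 1)

def total_overlap(old, new):
    """Sum, over all bricks, of how much of the old range each brick keeps."""
    return sum(overlap(old.get(brick), new[brick]) for brick in new)

# Old layout: the single original brick owns the whole hash space.
old = {"brick1": (0x00000000, 0xffffffff), "brick2": None}

# Assumed size-weighted layout: the 200 GB brick gets the larger range.
weighted = {"brick1": (0xaaa972d0, 0xffffffff),
            "brick2": (0x00000000, 0xaaa972cf)}

# Candidate produced by swapping the two ranges.
swapped = {"brick1": weighted["brick2"], "brick2": weighted["brick1"]}

# An optimizer that looks only at overlap prefers the swapped layout,
# which hands the larger range back to the smaller brick.
chosen = max((weighted, swapped), key=lambda new: total_overlap(old, new))
print("brick1 range after optimization:",
      tuple(hex(v) for v in chosen["brick1"]))
```

In this model the swap doubles the overlap with the old layout, so it always wins unless brick sizes are part of the comparison.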
--- Additional comment from Jeff Darcy on 2016-08-16 09:00:27 EDT ---
Nithya's analysis is correct. We generate a new layout based on brick sizes,
then attempt to optimize it for maximum overlap with the current layout. That
optimization is important to minimize data movement, but unfortunately it's
broken in this case because it doesn't account properly for where each range
already resides. I wrote that function BTW, so it's my fault. For now, we
should probably just disable the optimization phase when we're weighting by
brick size. Longer term, what we need to do is fix
dht_selfheal_layout_maximize_overlap. There's a place where it tries to
determine whether a particular swap would be an improvement or not. That
particular calculation needs to be enhanced to account for the *actual* current
and proposed locations for a range, instead of (effectively) inferring those
locations from ordinal positions.
--- Additional comment from Worker Ant on 2016-09-06 01:48:47 EDT ---
REVIEW: http://review.gluster.org/15403 (cluster/dht: Skip layout overlap
maximization on weighted rebalance) posted (#1) for review on master by N
Balachandran (nbalacha at redhat.com)
--- Additional comment from Worker Ant on 2016-09-06 13:12:56 EDT ---
REVIEW: http://review.gluster.org/15403 (cluster/dht: Skip layout overlap
maximization on weighted rebalance) posted (#2) for review on master by N
Balachandran (nbalacha at redhat.com)
--- Additional comment from Worker Ant on 2016-09-07 04:11:50 EDT ---
REVIEW: http://review.gluster.org/15403 (cluster/dht: Skip layout overlap
maximization on weighted rebalance) posted (#3) for review on master by N
Balachandran (nbalacha at redhat.com)
--- Additional comment from Worker Ant on 2016-09-07 12:49:46 EDT ---
REVIEW: http://review.gluster.org/15403 (cluster/dht: Skip layout overlap
maximization on weighted rebalance) posted (#4) for review on master by N
Balachandran (nbalacha at redhat.com)
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1257182
[Bug 1257182] Rebalance is not considering the brick sizes while fixing the
layout
https://bugzilla.redhat.com/show_bug.cgi?id=1366494
[Bug 1366494] Rebalance is not considering the brick sizes while fixing the
layout