[Bugs] [Bug 1475181] New: dht remove-brick status does not indicate failures files not migrated because of a lack of space
bugzilla at redhat.com
Wed Jul 26 07:41:31 UTC 2017
https://bugzilla.redhat.com/show_bug.cgi?id=1475181
Bug ID: 1475181
Summary: dht remove-brick status does not indicate failures
files not migrated because of a lack of space
Product: GlusterFS
Version: 3.12
Component: distribute
Assignee: bugs at gluster.org
Reporter: nbalacha at redhat.com
CC: amukherj at redhat.com, bugs at gluster.org,
rhs-bugs at redhat.com, storage-qa-internal at redhat.com,
tdesala at redhat.com
Depends On: 1474284, 1474318
+++ This bug was initially created as a clone of Bug #1474318 +++
+++ This bug was initially created as a clone of Bug #1474284 +++
Description of problem:
The dht remove-brick operation is expected to treat skipped files as failures
as they are left behind on the removed bricks.
If a file could not be migrated because there was no subvolume that could
accommodate it, the error is ignored because of an incorrect loop counter.
This is a regression from previous releases.
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
1. Create a 2x1 distribute volume with 500 MB bricks and create enough files so
that a single brick cannot accommodate all of them
2. Remove the 2nd brick
3. Check the logs and the remove-brick status.
Actual results:
The remove-brick status shows no failures. However, the rebalance logs contain
messages such as:
[2017-07-24 09:56:20.191412] W [MSGID: 109033]
[dht-rebalance.c:1021:__dht_check_free_space] 0-vol1-dht: Could not find any
subvol with space accomodating the file - <filename>. Consider adding bricks
Expected results:
The remove-brick status should display non-zero failures as some files cannot
be moved.
Additional info:
The loop in __dht_check_free_space () that iterates over the decommissioned
bricks array uses the wrong upper bound:
    if (conf->decommission_subvols_cnt) {
            *ignore_failure = _gf_true;
            for (i = 0; i < conf->decommission_subvols_cnt; i++) {
                    if (conf->decommissioned_bricks[i] == from) {
                            *ignore_failure = _gf_false;
                            break;
                    }
            }
    }
should be
    if (conf->decommission_subvols_cnt) {
            *ignore_failure = _gf_true;
            for (i = 0; i < conf->subvolume_cnt; i++) {
                    if (conf->decommissioned_bricks[i] == from) {
                            *ignore_failure = _gf_false;
                            break;
                    }
            }
    }
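To illustrate why the loop bound matters, here is a minimal standalone sketch.
It is not the GlusterFS source. The types `struct subvol` and
`struct dht_conf_sketch` and the function `should_ignore_failure` are
hypothetical stand-ins, and the sketch assumes that `decommissioned_bricks`
has one slot per subvolume, with NULL in the slots of subvolumes that are not
being removed. The default return value when nothing is decommissioned is also
an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a subvolume (xlator) handle. */
struct subvol {
        const char *name;
};

/* Hypothetical stand-in for the few dht conf fields used in the snippet. */
struct dht_conf_sketch {
        int             subvolume_cnt;            /* total subvolumes */
        int             decommission_subvols_cnt; /* subvolumes being removed */
        struct subvol **decommissioned_bricks;    /* assumed: subvolume_cnt slots,
                                                     NULL if not decommissioned */
};

/*
 * Mirrors the logic of the snippet above, with the loop bound passed in so
 * that both variants can be compared. Returns true when a "no space"
 * failure for a file on `from` would be ignored.
 */
bool should_ignore_failure(const struct dht_conf_sketch *conf,
                           const struct subvol *from, int bound)
{
        bool ignore = false; /* assumed default when nothing is decommissioned */
        int  i;

        if (conf->decommission_subvols_cnt) {
                ignore = true;
                for (i = 0; i < bound; i++) {
                        if (conf->decommissioned_bricks[i] == from) {
                                ignore = false;
                                break;
                        }
                }
        }
        return ignore;
}
```

With two subvolumes of which only the second is decommissioned, the array
under this assumption is { NULL, second }. Using decommission_subvols_cnt (1)
as the bound inspects only slot 0, so the match is never found and the failure
is ignored. Using subvolume_cnt (2) reaches slot 1 and the failure is counted.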
--- Additional comment from Worker Ant on 2017-07-24 08:23:33 EDT ---
REVIEW: https://review.gluster.org/17861 (cluster/dht: Correct iterator for
decommissioned bricks) posted (#1) for review on master by N Balachandran
(nbalacha at redhat.com)
--- Additional comment from Worker Ant on 2017-07-25 05:31:40 EDT ---
REVIEW: https://review.gluster.org/17861 (cluster/dht: Correct iterator for
decommissioned bricks) posted (#2) for review on master by Susant Palai
(spalai at redhat.com)
--- Additional comment from Worker Ant on 2017-07-25 06:03:29 EDT ---
COMMIT: https://review.gluster.org/17861 committed in master by N Balachandran
(nbalacha at redhat.com)
------
commit 8c3e766fe0a473734e8eca0f70d0318a2b909e2e
Author: N Balachandran <nbalacha at redhat.com>
Date: Mon Jul 24 17:48:47 2017 +0530
cluster/dht: Correct iterator for decommissioned bricks
Corrected the iterator for looping over the list of
decommissioned bricks while checking if the new target
determined because of min-free-disk values has been
decommissioned.
Change-Id: Iee778547eb7370a8069e954b5d629fcedf54e59b
BUG: 1474318
Signed-off-by: N Balachandran <nbalacha at redhat.com>
Reviewed-on: https://review.gluster.org/17861
Reviewed-by: Susant Palai <spalai at redhat.com>
Smoke: Gluster Build System <jenkins at build.gluster.org>
CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1474284
[Bug 1474284] dht remove-brick status does not indicate failures for files
not migrated because of a lack of space
https://bugzilla.redhat.com/show_bug.cgi?id=1474318
[Bug 1474318] dht remove-brick status does not indicate failures files not
migrated because of a lack of space