[Bugs] [Bug 1312878] New: Glusterd: Creation of volume is failing if one of the brick is down on the server
bugzilla at redhat.com
Mon Feb 29 12:34:29 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1312878
Bug ID: 1312878
Summary: Glusterd: Creation of volume is failing if one of the
brick is down on the server
Product: GlusterFS
Version: 3.7.8
Component: glusterd
Assignee: bugs at gluster.org
Reporter: amukherj at redhat.com
CC: bsrirama at redhat.com, bugs at gluster.org,
ggarg at redhat.com, rmekala at redhat.com,
sasundar at redhat.com, storage-qa-internal at redhat.com
Depends On: 1299432, 1299710
+++ This bug was initially created as a clone of Bug #1299710 +++
+++ This bug was initially created as a clone of Bug #1299432 +++
Description of problem:
===============
Glusterd: Volume creation fails if one of the bricks on the server is down
Version-Release number of selected component (if applicable):
=========
How reproducible:
Steps to Reproduce:
============
1. Make sure one of the bricks is down due to an XFS crash.
2. Create a new volume using other existing bricks. Volume creation fails
with:
volume create:test_123: failed: Staging failed on
transformers.lab.eng.blr.redhat.com. Error: Brick:
transformers:/rhs/brick7/dv2-3_rajesh_22 not available. Brick may be containing
or be contained by an existing brick
Actual results:
Expected results:
Additional info:
==============
Breakpoint 1, glusterd_is_brickpath_available (uuid=0x7f2b4000d8c0
"Z\323\066^n\020J\b\202\273\361\346\tQ\344\026", path=0x7f2b40009fb0
"/rhs/brick11/test123") at glusterd-utils.c:1166
1166 {
(gdb) n
1171 char tmp_path[PATH_MAX+1] = {0};
(gdb)
1166 {
(gdb)
1171 char tmp_path[PATH_MAX+1] = {0};
(gdb)
1172 char tmp_brickpath[PATH_MAX+1] = {0};
(gdb)
1176 strncpy (tmp_path, path, PATH_MAX);
(gdb)
1171 char tmp_path[PATH_MAX+1] = {0};
(gdb)
1172 char tmp_brickpath[PATH_MAX+1] = {0};
(gdb)
1174 priv = THIS->private;
(gdb)
1176 strncpy (tmp_path, path, PATH_MAX);
(gdb)
1174 priv = THIS->private;
(gdb)
1176 strncpy (tmp_path, path, PATH_MAX);
(gdb)
1178 if (!realpath (path, tmp_path)) {
(gdb)
1179 if (errno != ENOENT) {
(gdb)
1183 strncpy(tmp_path,path,PATH_MAX);
(gdb)
1186 cds_list_for_each_entry (volinfo, &priv->volumes, vol_list) {
(gdb) p tmp_path
$1 = "/rhs/brick11/test123", '\000' <repeats 4076 times>
(gdb) n
1200 if (_is_prefix (tmp_brickpath, tmp_path))
(gdb)
1186 cds_list_for_each_entry (volinfo, &priv->volumes, vol_list) {
(gdb)
1187 cds_list_for_each_entry (brickinfo, &volinfo->bricks,
(gdb)
1189 if (gf_uuid_compare (uuid, brickinfo->uuid))
(gdb)
1189 if (gf_uuid_compare (uuid, brickinfo->uuid))
(gdb) p brickinfo
$2 = (glusterd_brickinfo_t *) 0x7f2b6d5d1120
(gdb) p brickinfo.hostname
$3 = "transformers.lab.eng.blr.redhat.com", '\000' <repeats 988 times>
(gdb) p brickinfo.path
$4 = "/rhs/brick1/afr1x2_attach_hot", '\000' <repeats 4066 times>
(gdb) n
1192 if (!realpath (brickinfo->path, tmp_brickpath)) {
(gdb) n
1193 if (errno == ENOENT)
(gdb) p errno
$5 = 5
(gdb) n
1170 gf_boolean_t available = _gf_false;
(gdb)
1207 }
(gdb) p available
$6 = _gf_false
(gdb)
Filesystem                                     Size  Used Avail Use% Mounted on
/dev/mapper/rhel_transformers-root              50G   20G   31G  39% /
devtmpfs                                        32G     0   32G   0% /dev
tmpfs                                           32G   36M   32G   1% /dev/shm
tmpfs                                           32G  3.4G   28G  11% /run
tmpfs                                           32G     0   32G   0% /sys/fs/cgroup
/dev/sda1                                      494M  159M  336M  33% /boot
/dev/mapper/rhel_transformers-home             477G   13G  464G   3% /home
tmpfs                                          6.3G     0  6.3G   0% /run/user/0
/dev/mapper/RHS_vg1-RHS_lv1                    1.9T  323G  1.5T  18% /rhs/brick1
/dev/mapper/RHS_vg2-RHS_lv2                    1.9T   57G  1.8T   4% /rhs/brick2
/dev/mapper/RHS_vg3-RHS_lv3                    1.9T   57G  1.8T   4% /rhs/brick3
/dev/mapper/RHS_vg4-RHS_lv4                    1.9T   57G  1.8T   4% /rhs/brick4
/dev/mapper/RHS_vg5-RHS_lv5                    1.9T   57G  1.8T   4% /rhs/brick5
/dev/mapper/RHS_vg6-RHS_lv6                    1.9T   57G  1.8T   4% /rhs/brick6
/dev/mapper/RHS_vg7-RHS_lv7                    1.9T  1.4G  1.9T   1% /rhs/brick7
/dev/mapper/RHS_vg8-RHS_lv8                    1.9T  1.4G  1.9T   1% /rhs/brick8
/dev/mapper/RHS_vg9-RHS_lv9                    1.9T  1.4G  1.9T   1% /rhs/brick9
/dev/mapper/RHS_vg10-RHS_lv10                  1.9T  4.2G  1.8T   1% /rhs/brick10
/dev/mapper/RHS_vg11-RHS_lv11                  1.9T  4.2G  1.8T   1% /rhs/brick11
/dev/mapper/RHS_vg12-RHS_lv12                  1.9T  4.2G  1.8T   1% /rhs/brick12
ninja.lab.eng.blr.redhat.com:afr2x2_tier       1.9T  567G  1.3T  31% /mnt/glusterfs
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_new  1.9T  567G  1.3T  31% /mnt/glusterfs2
ninja.lab.eng.blr.redhat.com:/disperse_vol2    4.2T  3.6T  596G  86% /mnt/glusterfs_EC
ninja.lab.eng.blr.redhat.com:/disperse_vol2    4.2T  3.6T  596G  86% /mnt/glusterfs_EC_NO
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_new  1.9T  567G  1.3T  31% /mnt/glusterfs2_new
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_new  1.9T  567G  1.3T  31% /mnt/glusterfs2_new2
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_mod  1.9T  564G  1.3T  31% /mnt/glusterfs2_mod
ninja.lab.eng.blr.redhat.com:afr2x2_tier_new   1.9T  567G  1.3T  31% /mnt/afr2x2_tier_new
--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-01-18
06:22:44 EST ---
This bug is automatically being proposed for the current z-stream release of
Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'.
If this bug should be proposed for a different release, please manually change
the proposed release flag.
--- Additional comment from Gaurav Kumar Garg on 2016-01-18 08:15:37 EST ---
Hi Rajesh,
Can you confirm that the brick is not already used by another volume, i.e.
that no .glusterfs directory exists on it when you create the new volume?
--- Additional comment from Atin Mukherjee on 2016-01-18 09:13:50 EST ---
As Gaurav mentioned in #c2, it seems like you have tried to reuse a brick which
is or was earlier used for another gluster volume; that's exactly what the
error message says. I strongly believe this is not a bug. Please confirm.
--- Additional comment from Byreddy on 2016-01-18 12:05:25 EST ---
Hi Gaurav,
The issue here is that a brick went down because of an XFS crash. After that,
he tried to create a new volume using other bricks on that VM (not used by any
volume), but volume creation is rejected with the error message mentioned in
the description.
A similar XFS crash with gluster is reported at:
http://oss.sgi.com/archives/xfs/2013-01/msg00059.html
Thanks
--- Additional comment from Atin Mukherjee on 2016-01-18 23:43:18 EST ---
After going through the code, this looks like a bug. If the realpath () call
fails with EIO (which indicates that the underlying file system of an existing
brick may have a problem), we report that the new path is not available instead
of skipping that brick path.
--- Additional comment from Vijay Bellur on 2016-01-19 00:37:06 EST ---
REVIEW: http://review.gluster.org/13258 (glusterd: Skip brickpath validation if
realpath returns EIO) posted (#1) for review on master by Atin Mukherjee
(amukherj at redhat.com)
--- Additional comment from Vijay Bellur on 2016-01-19 07:06:25 EST ---
REVIEW: http://review.gluster.org/13258 (glusterd: remove
glusterd_is_brickpath_available () check) posted (#2) for review on master by
Atin Mukherjee (amukherj at redhat.com)
--- Additional comment from Vijay Bellur on 2016-01-21 00:44:45 EST ---
REVIEW: http://review.gluster.org/13258 (glusterd: use string comparison for
realpath checks in glusterd_is_brickpath_available) posted (#3) for review on
master by Atin Mukherjee (amukherj at redhat.com)
--- Additional comment from Vijay Bellur on 2016-02-05 04:54:24 EST ---
REVIEW: http://review.gluster.org/13258 (glusterd: use string comparison for
realpath checks in glusterd_is_brickpath_available) posted (#4) for review on
master by Atin Mukherjee (amukherj at redhat.com)
--- Additional comment from Vijay Bellur on 2016-02-17 07:09:00 EST ---
REVIEW: http://review.gluster.org/13258 (glusterd: use string comparison for
realpath checks in glusterd_is_brickpath_available) posted (#5) for review on
master by Atin Mukherjee (amukherj at redhat.com)
--- Additional comment from Vijay Bellur on 2016-02-22 04:32:32 EST ---
REVIEW: http://review.gluster.org/13258 (glusterd: use string comparison for
realpath checks in glusterd_is_brickpath_available) posted (#6) for review on
master by Atin Mukherjee (amukherj at redhat.com)
--- Additional comment from Vijay Bellur on 2016-02-29 06:55:31 EST ---
COMMIT: http://review.gluster.org/13258 committed in master by Jeff Darcy
(jdarcy at redhat.com)
------
commit a60c39de31e8258cb56d8db6bd8ec2491a942a4e
Author: Atin Mukherjee <amukherj at redhat.com>
Date: Tue Jan 19 10:45:22 2016 +0530
glusterd: use string comparison for realpath checks in
glusterd_is_brickpath_available
glusterd_is_brickpath_available () used to call realpath() to check whether
the new brick path matches any of the existing ones. The problem with this is
that if the underlying file system is bad for any one of the existing bricks,
realpath() would fail and we wouldn't allow the new brick to be created even
if it should be allowed.
Fix is to use string comparison, with a new field real_path in brickinfo to
store the absolute path.
Change-Id: I1250ea5345f00fca0f6128056ebd08750d604f0a
BUG: 1299710
Signed-off-by: Atin Mukherjee <amukherj at redhat.com>
Reviewed-on: http://review.gluster.org/13258
Smoke: Gluster Build System <jenkins at build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
CentOS-regression: Gluster Build System <jenkins at build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy at redhat.com>
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1299432
[Bug 1299432] Glusterd: Creation of volume is failing if one of the brick
is down on the server
https://bugzilla.redhat.com/show_bug.cgi?id=1299710
[Bug 1299710] Glusterd: Creation of volume is failing if one of the brick
is down on the server
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.