[Bugs] [Bug 1312878] New: Glusterd: Creation of volume is failing if one of the brick is down on the server

bugzilla at redhat.com bugzilla at redhat.com
Mon Feb 29 12:34:29 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1312878

            Bug ID: 1312878
           Summary: Glusterd: Creation of volume is failing if one of the
                    brick is down on the server
           Product: GlusterFS
           Version: 3.7.8
         Component: glusterd
          Assignee: bugs at gluster.org
          Reporter: amukherj at redhat.com
                CC: bsrirama at redhat.com, bugs at gluster.org,
                    ggarg at redhat.com, rmekala at redhat.com,
                    sasundar at redhat.com, storage-qa-internal at redhat.com
        Depends On: 1299432, 1299710



+++ This bug was initially created as a clone of Bug #1299710 +++

+++ This bug was initially created as a clone of Bug #1299432 +++

Description of problem:
===============
Glusterd: Volume creation fails if one of the bricks on the server is down.

Version-Release number of selected component (if applicable):
=========


How reproducible:


Steps to Reproduce:
============
1. Make sure one of the bricks is down due to an XFS crash.
2. Create a new volume using other existing bricks; volume creation fails
with:

volume create:test_123: failed: Staging failed on
transformers.lab.eng.blr.redhat.com. Error: Brick:
transformers:/rhs/brick7/dv2-3_rajesh_22 not available. Brick may be containing
or be contained by an existing brick
3.

Actual results:


Expected results:


Additional info:
==============


Breakpoint 1, glusterd_is_brickpath_available (uuid=0x7f2b4000d8c0
"Z\323\066^n\020J\b\202\273\361\346\tQ\344\026", path=0x7f2b40009fb0
"/rhs/brick11/test123") at glusterd-utils.c:1166
1166    {
(gdb) n
1171            char                    tmp_path[PATH_MAX+1] = {0};
(gdb)
1166    {
(gdb)
1171            char                    tmp_path[PATH_MAX+1] = {0};
(gdb)
1172            char                    tmp_brickpath[PATH_MAX+1] = {0};
(gdb)
1176            strncpy (tmp_path, path, PATH_MAX);
(gdb)
1171            char                    tmp_path[PATH_MAX+1] = {0};
(gdb)
1172            char                    tmp_brickpath[PATH_MAX+1] = {0};
(gdb)
1174            priv = THIS->private;
(gdb)
1176            strncpy (tmp_path, path, PATH_MAX);
(gdb)
1174            priv = THIS->private;
(gdb)
1176            strncpy (tmp_path, path, PATH_MAX);
(gdb)
1178            if (!realpath (path, tmp_path)) {
(gdb)
1179                    if (errno != ENOENT) {
(gdb)
1183                    strncpy(tmp_path,path,PATH_MAX);
(gdb)
1186            cds_list_for_each_entry (volinfo, &priv->volumes, vol_list) {
(gdb) p tmp_path
$1 = "/rhs/brick11/test123", '\000' <repeats 4076 times>
(gdb) n
1200                            if (_is_prefix (tmp_brickpath, tmp_path))
(gdb)
1186            cds_list_for_each_entry (volinfo, &priv->volumes, vol_list) {
(gdb)
1187                    cds_list_for_each_entry (brickinfo, &volinfo->bricks,
(gdb)
1189                            if (gf_uuid_compare (uuid, brickinfo->uuid))
(gdb)
1189                            if (gf_uuid_compare (uuid, brickinfo->uuid))
(gdb) p brickinfo
$2 = (glusterd_brickinfo_t *) 0x7f2b6d5d1120
(gdb) p brickinfo.hostname
$3 = "transformers.lab.eng.blr.redhat.com", '\000' <repeats 988 times>
(gdb) p brickinfo.path
$4 = "/rhs/brick1/afr1x2_attach_hot", '\000' <repeats 4066 times>
(gdb) n
1192                            if (!realpath (brickinfo->path, tmp_brickpath)) {
(gdb) n
1193                                if (errno == ENOENT)
(gdb) p errno
$5 = 5
(gdb) n
1170            gf_boolean_t            available  = _gf_false;
(gdb)
1207    }
(gdb) p available
$6 = _gf_false
(gdb) 

Filesystem                                              Size  Used Avail Use% Mounted on
/dev/mapper/rhel_transformers-root                       50G   20G   31G  39% /
devtmpfs                                                 32G     0   32G   0% /dev
tmpfs                                                    32G   36M   32G   1% /dev/shm
tmpfs                                                    32G  3.4G   28G  11% /run
tmpfs                                                    32G     0   32G   0% /sys/fs/cgroup
/dev/sda1                                               494M  159M  336M  33% /boot
/dev/mapper/rhel_transformers-home                      477G   13G  464G   3% /home
tmpfs                                                   6.3G     0  6.3G   0% /run/user/0
/dev/mapper/RHS_vg1-RHS_lv1                             1.9T  323G  1.5T  18% /rhs/brick1
/dev/mapper/RHS_vg2-RHS_lv2                             1.9T   57G  1.8T   4% /rhs/brick2
/dev/mapper/RHS_vg3-RHS_lv3                             1.9T   57G  1.8T   4% /rhs/brick3
/dev/mapper/RHS_vg4-RHS_lv4                             1.9T   57G  1.8T   4% /rhs/brick4
/dev/mapper/RHS_vg5-RHS_lv5                             1.9T   57G  1.8T   4% /rhs/brick5
/dev/mapper/RHS_vg6-RHS_lv6                             1.9T   57G  1.8T   4% /rhs/brick6
/dev/mapper/RHS_vg7-RHS_lv7                             1.9T  1.4G  1.9T   1% /rhs/brick7
/dev/mapper/RHS_vg8-RHS_lv8                             1.9T  1.4G  1.9T   1% /rhs/brick8
/dev/mapper/RHS_vg9-RHS_lv9                             1.9T  1.4G  1.9T   1% /rhs/brick9
/dev/mapper/RHS_vg10-RHS_lv10                           1.9T  4.2G  1.8T   1% /rhs/brick10
/dev/mapper/RHS_vg11-RHS_lv11                           1.9T  4.2G  1.8T   1% /rhs/brick11
/dev/mapper/RHS_vg12-RHS_lv12                           1.9T  4.2G  1.8T   1% /rhs/brick12
ninja.lab.eng.blr.redhat.com:afr2x2_tier                1.9T  567G  1.3T  31% /mnt/glusterfs
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_new           1.9T  567G  1.3T  31% /mnt/glusterfs2
ninja.lab.eng.blr.redhat.com:/disperse_vol2             4.2T  3.6T  596G  86% /mnt/glusterfs_EC
ninja.lab.eng.blr.redhat.com:/disperse_vol2             4.2T  3.6T  596G  86% /mnt/glusterfs_EC_NO
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_new           1.9T  567G  1.3T  31% /mnt/glusterfs2_new
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_new           1.9T  567G  1.3T  31% /mnt/glusterfs2_new2
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_mod           1.9T  564G  1.3T  31% /mnt/glusterfs2_mod
ninja.lab.eng.blr.redhat.com:afr2x2_tier_new            1.9T  567G  1.3T  31% /mnt/afr2x2_tier_new

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-01-18
06:22:44 EST ---

This bug is automatically being proposed for the current z-stream release of
Red Hat Gluster Storage 3 by setting the release flag 'rhgs‑3.1.z' to '?'. 

If this bug should be proposed for a different release, please manually change
the proposed release flag.

--- Additional comment from Gaurav Kumar Garg on 2016-01-18 08:15:37 EST ---

Hi Rajesh,

Can you make sure that the brick is not already used by another volume, i.e.
that no .glusterfs directory is present in it when creating the new volume?

--- Additional comment from Atin Mukherjee on 2016-01-18 09:13:50 EST ---

As Gaurav mentioned in #c2, it seems like you have tried to reuse a brick which
is or was earlier used by another gluster volume; that's exactly what the error
message says. I strongly believe this is not a bug. Please confirm.

--- Additional comment from Byreddy on 2016-01-18 12:05:25 EST ---

Hi Gaurav,

The issue here is that a brick went down because of an XFS crash. After that he
tried to create a new volume using other bricks on that VM (not used by any
volume), and volume creation is rejected with the error message mentioned in
the description.

Similar XFS crash with gluster link -
http://oss.sgi.com/archives/xfs/2013-01/msg00059.html

Thanks

--- Additional comment from Atin Mukherjee on 2016-01-18 23:43:18 EST ---

After going through the code, this looks like a bug. If the realpath () call
fails with EIO (which indicates that the underlying file system of an existing
brick may have a problem), we report that the new path is not available instead
of skipping that brick path.

--- Additional comment from Vijay Bellur on 2016-01-19 00:37:06 EST ---

REVIEW: http://review.gluster.org/13258 (glusterd: Skip brickpath validation if
realpath returns EIO) posted (#1) for review on master by Atin Mukherjee
(amukherj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-01-19 07:06:25 EST ---

REVIEW: http://review.gluster.org/13258 (glusterd: remove
glusterd_is_brickpath_available () check) posted (#2) for review on master by
Atin Mukherjee (amukherj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-01-21 00:44:45 EST ---

REVIEW: http://review.gluster.org/13258 (glusterd: use string comparison for
realpath checks in glusterd_is_brickpath_available) posted (#3) for review on
master by Atin Mukherjee (amukherj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-02-05 04:54:24 EST ---

REVIEW: http://review.gluster.org/13258 (glusterd: use string comparison for
realpath checks in glusterd_is_brickpath_available) posted (#4) for review on
master by Atin Mukherjee (amukherj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-02-17 07:09:00 EST ---

REVIEW: http://review.gluster.org/13258 (glusterd: use string comparison for
realpath checks in glusterd_is_brickpath_available) posted (#5) for review on
master by Atin Mukherjee (amukherj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-02-22 04:32:32 EST ---

REVIEW: http://review.gluster.org/13258 (glusterd: use string comparison for
realpath checks in glusterd_is_brickpath_available) posted (#6) for review on
master by Atin Mukherjee (amukherj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-02-29 06:55:31 EST ---

COMMIT: http://review.gluster.org/13258 committed in master by Jeff Darcy
(jdarcy at redhat.com) 
------
commit a60c39de31e8258cb56d8db6bd8ec2491a942a4e
Author: Atin Mukherjee <amukherj at redhat.com>
Date:   Tue Jan 19 10:45:22 2016 +0530

    glusterd: use string comparison for realpath checks in
    glusterd_is_brickpath_available

    glusterd_is_brickpath_available () used to call realpath() for checking the
    whether the new brick path matches with the existing ones. The problem with
    this is if the underlying file system is bad for any one of the existing
    bricks then realpath() would fail and we wouldn't allow to create the new
    brick even if it should be allowed.

    Fix is to use string comparison with having a new field real_path in
    brickinfo to store the absolute path

    Change-Id: I1250ea5345f00fca0f6128056ebd08750d604f0a
    BUG: 1299710
    Signed-off-by: Atin Mukherjee <amukherj at redhat.com>
    Reviewed-on: http://review.gluster.org/13258
    Smoke: Gluster Build System <jenkins at build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.com>
    Reviewed-by: Jeff Darcy <jdarcy at redhat.com>


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1299432
[Bug 1299432] Glusterd: Creation of volume is failing if one of the brick
is down on the server
https://bugzilla.redhat.com/show_bug.cgi?id=1299710
[Bug 1299710] Glusterd: Creation of volume is failing if one of the brick
is down on the server
