[Bugs] [Bug 1326174] New: Volume stop is failing when one of brick is down due to underlying filesystem crash

bugzilla at redhat.com bugzilla at redhat.com
Tue Apr 12 03:58:25 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1326174

            Bug ID: 1326174
           Summary: Volume stop is failing when one of brick is down due
                    to underlying filesystem crash
           Product: GlusterFS
           Version: 3.7.10
         Component: glusterd
          Keywords: Triaged
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: amukherj at redhat.com
                CC: bsrirama at redhat.com, bugs at gluster.org
        Depends On: 1325750, 1325841



+++ This bug was initially created as a clone of Bug #1325841 +++

+++ This bug was initially created as a clone of Bug #1325750 +++

Description of problem:
=======================
When one of the bricks is down because its underlying filesystem (here XFS) has
crashed, volume stop fails with the error message "volume stop: Dis: failed:
brick operations failed".


Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.7.9-1

How reproducible:
=================
Always


Steps to Reproduce:
===================
1. Have a one-node cluster.
2. Create a 1*2 volume and start it.
3. Crash the underlying filesystem of one of the volume's bricks using the
   "godown" tool.
4. Check that the brick is down using "volume status".
5. Try to stop the volume; it will fail.


Actual results:
===============
Volume stop fails with the error message "volume stop: Dis: failed: brick
operations failed".

Expected results:
=================
Volume stop should succeed even when one of the bricks is down.


Additional info:
================

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-04-11
01:46:29 EDT ---

This bug is automatically being proposed for the current z-stream release of
Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'.

If this bug should be proposed for a different release, please manually change
the proposed release flag.

--- Additional comment from RHEL Product and Program Management on 2016-04-11
02:02:20 EDT ---

This bug report has Keywords: Regression or TestBlocker.

Since no regressions or test blockers are allowed between releases,
it is also being identified as a blocker for this release.

Please resolve ASAP.

--- Additional comment from Vijay Bellur on 2016-04-11 06:45:07 EDT ---

REVIEW: http://review.gluster.org/13965 (glusterd: populate
brickinfo->real_path conditionally) posted (#1) for review on master by Atin
Mukherjee (amukherj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-04-11 14:39:35 EDT ---

COMMIT: http://review.gluster.org/13965 committed in master by Jeff Darcy
(jdarcy at redhat.com) 
------
commit d129d4eea33aae5db24dba17adcb04e9d4829817
Author: Atin Mukherjee <amukherj at redhat.com>
Date:   Mon Apr 11 16:07:40 2016 +0530

    glusterd: populate brickinfo->real_path conditionally

    glusterd_brickinfo_new_from_brick () is called from multiple places and one
    of them is glusterd_brick_rpc_notify, where it is quite possible that an
    underlying brick's file system has crashed and a disconnect event has been
    received. In this case glusterd tries to build the brickinfo from the
    brickid in the RPC request, but this fails because
    glusterd_brickinfo_new_from_brick () fails in realpath.

    Fix is to skip populating real_path if it's a disconnect event.

    Change-Id: I9d9149c64a9cf2247abb731f219c1b1eef037960
    BUG: 1325841
    Signed-off-by: Atin Mukherjee <amukherj at redhat.com>
    Reviewed-on: http://review.gluster.org/13965
    Smoke: Gluster Build System <jenkins at build.gluster.com>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    Reviewed-by: Jeff Darcy <jdarcy at redhat.com>
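
The idea described in the commit message above can be illustrated with a small,
self-contained sketch. This is not the glusterd code: the structure
brick_info, the function brick_info_init() and the flag resolve_real_path are
hypothetical names chosen for illustration. The only real API used is
realpath(3), which returns NULL when the path cannot be resolved, as may happen
when the filesystem under a brick has crashed.

```c
/* Request POSIX/XSI declarations (realpath, PATH_MAX) under strict -std modes. */
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif

#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a brick description. */
struct brick_info {
    char path[PATH_MAX];      /* brick path as configured */
    char real_path[PATH_MAX]; /* canonical path; empty if not resolved */
};

/*
 * Fill `out` from `path`. realpath(3) is called only when
 * `resolve_real_path` is true. Returns 0 on success, -1 on failure.
 */
static int brick_info_init(struct brick_info *out, const char *path,
                           bool resolve_real_path)
{
    char resolved[PATH_MAX];

    if (strlen(path) >= sizeof(out->path))
        return -1;

    memset(out, 0, sizeof(*out));
    strcpy(out->path, path);

    /* Assumed disconnect-style caller: the path may be unreachable,
     * so do not touch the filesystem at all. */
    if (!resolve_real_path)
        return 0;

    /* Fails (and sets errno) if the path cannot be resolved. */
    if (realpath(path, resolved) == NULL)
        return -1;

    strcpy(out->real_path, resolved);
    return 0;
}
```

In this sketch, a caller that handles a disconnect would pass false for
resolve_real_path and still get a usable structure, while callers that need the
canonical path pass true and must handle the failure case.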


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1325750
[Bug 1325750] Volume stop is failing when one of brick is down due to
underlying filesystem crash
https://bugzilla.redhat.com/show_bug.cgi?id=1325841
[Bug 1325841] Volume stop is failing when one of brick is down due to
underlying filesystem crash