[Bugs] [Bug 1697866] New: Provide a way to detach a failed node

bugzilla at redhat.com bugzilla at redhat.com
Tue Apr 9 08:28:17 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1697866

            Bug ID: 1697866
           Summary: Provide a way to detach a failed node
           Product: GlusterFS
           Version: mainline
            Status: NEW
         Component: glusterd
          Severity: low
          Priority: low
          Assignee: bugs at gluster.org
          Reporter: srakonde at redhat.com
                CC: bmekala at redhat.com, bugs at gluster.org,
                    rhs-bugs at redhat.com, rtalur at redhat.com,
                    sankarshan at redhat.com, storage-qa-internal at redhat.com,
                    vbellur at redhat.com
        Depends On: 1696334
  Target Milestone: ---
    Classification: Community



Description of problem:

When a gluster peer node has failed due to hardware issues, it should be
possible to detach it.

Currently, the peer detach command fails because the peer hosts one or more
bricks.
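
For illustration, the failing attempt looks roughly like this (the hostname
node3 is hypothetical and the exact error text may vary by release):

    # Try to detach the failed peer
    gluster peer detach node3
    # glusterd rejects this because bricks belonging to existing
    # volumes are still hosted on node3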

If deletion of the volume that contains that brick is attempted, the volume
delete fails with a "Not all peers are up" error.
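
In command form, this looks roughly as follows (the volume name myvol is
hypothetical):

    # A started volume has to be stopped before it can be deleted
    gluster volume stop myvol
    gluster volume delete myvol
    # The delete is rejected with the "Not all peers are up" error
    # while the failed peer is unreachable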

One way out is to use the replace-brick command to move the brick to some
other node.
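
A rough sketch of that approach (volume name, brick paths and hostnames are
hypothetical; a healthy node with enough free space must already be part of
the pool):

    # Move the dead node's brick to a healthy peer, then detach the peer
    gluster volume replace-brick myvol node3:/bricks/b1 node4:/bricks/b1 commit force
    gluster peer detach node3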

However, replace-brick is not always possible.

A trick that worked for us was to use remove-brick to convert the replica 3
volume to replica 2 and then peer detach the node.
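
In command form, the workaround looks roughly like this (volume name, replica
count, brick path and hostname are again hypothetical):

    # Drop the failed node's brick, reducing the volume from replica 3
    # to replica 2; force is used because the data on node3 is gone
    gluster volume remove-brick myvol replica 2 node3:/bricks/b1 force
    # The failed node no longer hosts any bricks and can now be detached
    gluster peer detach node3
    # Later, a replacement brick can restore replica 3:
    # gluster volume add-brick myvol replica 3 node4:/bricks/b1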


Maybe the peer detach command can suggest this workaround in its output.
Something along the lines of:


"This peer has one or more bricks. If the peer is lost and is not recoverable
then you should use either replace-brick or remove-brick procedure to remove
all bricks from the peer and attempt the peer detach again"


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1696334
[Bug 1696334] Provide a way to detach a failed node
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.
