[Gluster-users] How to remove a dead node and re-balance volume?

Harshavardhana harsha at harshavardhana.net
Sun Sep 8 23:53:16 UTC 2013


Joe,

Perhaps a typo:

"""So first we move server1:/data/brick2 to server3:/data/brick1""" -
http://joejulian.name/blog/how-to-expand-glusterfs-replicated-clusters-by-one-server/

Should be "server3:/data/brick2"

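If I remember the post's sequence right, the corrected step is just a brick move along these
lines (VOLNAME is a placeholder and the exact replace-brick sub-commands vary by release, so
treat this as a sketch of the idea rather than the post's exact commands):

# gluster volume replace-brick VOLNAME server1:/data/brick2 server3:/data/brick2 start
# gluster volume replace-brick VOLNAME server1:/data/brick2 server3:/data/brick2 status
# gluster volume replace-brick VOLNAME server1:/data/brick2 server3:/data/brick2 commit

A sketch for the resize question raised in the thread follows after the quoted text below.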

On Sun, Sep 8, 2013 at 12:34 PM, Joe Julian <joe at julianfamily.org> wrote:

>  On 09/05/2013 02:16 AM, Anup Nair wrote:
>
> On Thu, Sep 5, 2013 at 12:41 AM, Vijay Bellur <vbellur at redhat.com> wrote:
>
>>  On 09/03/2013 01:18 PM, Anup Nair wrote:
>>
>>>  Glusterfs version 3.2.2
>>>
>>> I have a Gluster volume in which one our of the 4 peers/nodes had
>>> crashed some time ago, prior to my joining service here.
>>>
>>> I see from volume info that the crashed (non-existing) node is still
>>> listed as one of the peers and the bricks are also listed. I would like
>>> to detach this node and its bricks and rebalance the volume with
>>> remaining 3 peers. But I am unable to do so. Here are my steps:
>>>
>>> 1. #gluster peer status
>>>    Number of Peers: 3 -- (note: excluding the one I run this command
>>> from)
>>>
>>>    Hostname: dbstore4r294 --- (note: node/peer that is down)
>>>    Uuid: 8bf13458-1222-452c-81d3-565a563d768a
>>>    State: Peer in Cluster (Disconnected)
>>>
>>>    Hostname: 172.16.1.90
>>>    Uuid: 77ebd7e4-7960-4442-a4a4-00c5b99a61b4
>>>    State: Peer in Cluster (Connected)
>>>
>>>    Hostname: dbstore3r294
>>>    Uuid: 23d7a18c-fe57-47a0-afbc-1e1a5305c0eb
>>>    State: Peer in Cluster (Connected)
>>>
>>> 2. #gluster peer detach dbstore4r294
>>>    Brick(s) with the peer dbstore4r294 exist in cluster
>>>
>>> 3. #gluster volume info
>>>
>>>    Volume Name: test-volume
>>>    Type: Distributed-Replicate
>>>    Status: Started
>>>    Number of Bricks: 4 x 2 = 8
>>>    Transport-type: tcp
>>>    Bricks:
>>>    Brick1: dbstore1r293:/datastore1
>>>    Brick2: dbstore2r293:/datastore1
>>>    Brick3: dbstore3r294:/datastore1
>>>    Brick4: dbstore4r294:/datastore1
>>>    Brick5: dbstore1r293:/datastore2
>>>    Brick6: dbstore2r293:/datastore2
>>>    Brick7: dbstore3r294:/datastore2
>>>    Brick8: dbstore4r294:/datastore2
>>>    Options Reconfigured:
>>>    network.ping-timeout: 42s
>>>    performance.cache-size: 64MB
>>>    performance.write-behind-window-size: 3MB
>>>    performance.io-thread-count: 8
>>>    performance.cache-refresh-timeout: 2
>>>
>>> Note that the non-existent node/peer is dbstore4r294 (its bricks are
>>> /datastore1 & /datastore2, i.e. Brick4 and Brick8).
>>>
>>> 4. #gluster volume remove-brick test-volume dbstore4r294:/datastore1
>>>    Removing brick(s) can result in data loss. Do you want to Continue?
>>> (y/n) y
>>>    Remove brick incorrect brick count of 1 for replica 2
>>>
>>> 5. #gluster volume remove-brick test-volume dbstore4r294:/datastore1
>>> dbstore4r294:/datastore2
>>>    Removing brick(s) can result in data loss. Do you want to Continue?
>>> (y/n) y
>>>    Bricks not from same subvol for replica
>>>
>>> How do I remove the peer? What are the steps considering that the node
>>> is non-existent?
>>>
>>
>>
>> Do you plan to replace the dead server with a new server? If so, this
>> could be a possible sequence of steps:
>>
>>
>  No. We are not going to replace it, so I need to resize it to a 3-node
> cluster.
>
>  I discovered the issue when one of the nodes hung and I had to reboot it.
> I expected the Gluster volume to remain available with one node down, but the
> volume was non-responsive.
>
> Surprised by that, I checked the details and found it had been running with one
> node missing for many months now, perhaps a year!
>
>  I have no node to replace it with. So, I am looking for a method by
> which I can resize it.
>
>    The problem is that you want to do a replica 2 volume with an odd
> number of servers. This can be done but requires that you think of bricks
> individually rather than tying sets of bricks to servers. Your goal is to
> simply have each pair of replica bricks on two unique servers.
>
> See http://joejulian.name/blog/how-to-expand-glusterfs-replicated-clusters-by-one-server/
> for an example.
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>
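
On the resize question itself: in a 4 x 2 distributed-replicate volume the replica pairs are
(Brick1,Brick2), (Brick3,Brick4), (Brick5,Brick6) and (Brick7,Brick8), which is why removing
only dbstore4r294's two bricks fails with "Bricks not from same subvol for replica", and why
"peer detach" refuses while that peer still owns bricks. If upgrading to a 3.4 release is an
option, one way out is to move the dead node's bricks onto the surviving servers first and
detach afterwards. A rough, untested sketch from memory (the /datastore3 target paths are
made-up empty directories, not anything from this setup):

# gluster volume replace-brick test-volume dbstore4r294:/datastore1 dbstore1r293:/datastore3 commit force
# gluster volume replace-brick test-volume dbstore4r294:/datastore2 dbstore2r293:/datastore3 commit force
# gluster volume heal test-volume full
# gluster peer detach dbstore4r294

The full heal lets self-heal fill each new empty brick from its surviving replica partner, and
the detach should then go through (add "force" if it still complains about the peer being
down). That keeps the volume at 4 x 2 with every replica pair on two different servers, which
is the same layout idea Joe's post walks through for odd-numbered clusters.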



-- 
*Religious confuse piety with mere ritual, the virtuous confuse regulation
with outcomes*