[Gluster-users] Need to replace a brick on a failed first Gluster node
Greg Scott
GregScott@Infrasupport.com
Sun Jan 22 11:00:17 UTC 2012
Hello -
I am using GlusterFS 3.2.5-2. I have one very small replicated volume
with 2 bricks, as follows:
[root@lme-fw2 ~]# gluster volume info
Volume Name: firewall-scripts
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 192.168.253.1:/gluster-fw1
Brick2: 192.168.253.2:/gluster-fw2
The application is a small active/standby HA appliance and I use the
Gluster volume for config info. The Gluster nodes are also clients and
there are no other clients. Fortunately for me, nothing is in
production yet.
My challenge is that the hard drive at 192.168.253.1 failed. This was the
first Gluster node when I set everything up. I replaced its hard drive
and am rebuilding it. I have a good copy of everything I care about in
the 192.168.253.2 brick. My thought was that I could just remove the old
192.168.253.1 brick from the replica, peer probe the rebuilt node, and
add it all back again.
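Spelled out, the sequence I had in mind was roughly the following - untested
on my part, and I do not know whether remove-brick and add-brick even work
this way on a two-brick replica in 3.2.5:

# drop the dead brick from the volume definition
gluster volume remove-brick firewall-scripts 192.168.253.1:/gluster-fw1
# re-probe the rebuilt node once it is back on the network
gluster peer probe 192.168.253.1
# add its (now empty) brick back into the replica
gluster volume add-brick firewall-scripts 192.168.253.1:/gluster-fw1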
But apparently not so simple:
[root@lme-fw2 ~]# gluster volume remove-brick firewall-scripts 192.168.253.1:/gluster-fw1
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
Incorrect brick 192.168.253.1:/gluster-fw1 for volume firewall-scripts
Not particularly helpful diagnostic info. I also played around with
gluster peer detach/probe, but now I think I may have created a mess:
[root@lme-fw2 ~]# gluster peer probe 192.168.253.1
^C
[root@lme-fw2 ~]# gluster peer status
Number of Peers: 1
Hostname: 192.168.253.1
Uuid: 00000000-0000-0000-0000-000000000000
State: Establishing Connection (Disconnected)
[root@lme-fw2 ~]#
Trying again:
[root@lme-fw2 ~]# gluster peer detach 192.168.253.1
Detach successful
[root@lme-fw2 ~]# gluster peer status
No peers present
[root@lme-fw2 ~]# gluster volume info
Volume Name: firewall-scripts
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 192.168.253.1:/gluster-fw1
Brick2: 192.168.253.2:/gluster-fw2
[root@lme-fw2 ~]# gluster volume remove-brick firewall-scripts 192.168.253.1:/gluster-fw1
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
Incorrect brick 192.168.253.1:/gluster-fw1 for volume firewall-scripts
[root@lme-fw2 ~]#
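One thought: since the rebuilt fw1 will come up with a brand new UUID, maybe
glusterd on fw2 no longer recognizes it as the peer that owns Brick1. If
3.2.5 keeps its state under /etc/glusterd (I am only guessing at the paths
and file layout here), the idea would be roughly:

# on fw2: see what UUID, if any, it still has on file for the old fw1
grep -r . /etc/glusterd/peers/

# on the rebuilt fw1: stop glusterd, put the original UUID back, and
# restart so it rejoins as the same peer (<old-fw1-uuid> is a placeholder)
service glusterd stop
echo "UUID=<old-fw1-uuid>" > /etc/glusterd/glusterd.info
service glusterd start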
This should be simple and maybe I am missing something. On the fw2
Gluster node, I want to remove all trace of the old fw1 and then set up
a new fw1 as a new replica. How do I get there from here? Also, once
this goes into production, I will not have the luxury of taking
everything offline and rebuilding it. What is the best way to recover
from a hard drive failure on either node?
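For the production case, I see there is a gluster volume replace-brick
command, so maybe the recovery looks something like the sketch below -
again just a guess, and I do not know whether "commit force" or the
find/stat self-heal trick apply to 3.2.5 (the new brick path and the
client mount point are placeholders):

# point the volume at a fresh brick directory on the rebuilt node
gluster volume replace-brick firewall-scripts 192.168.253.1:/gluster-fw1 192.168.253.1:/gluster-fw1-new commit force

# then walk the client mount so replicate re-reads every file and
# self-heals it from the surviving copy
find /mnt/firewall-scripts -noleaf -print0 | xargs --null stat > /dev/null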
Thanks
- Greg Scott