[Gluster-users] Remove a brick, rebuild it, put it back in

Sergei Gerasenko sgerasenko74 at gmail.com
Fri Oct 7 03:47:56 UTC 2016


Step 10 isn't really necessary. Instead, the heal's progress can simply be
monitored under the brick directory itself.
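
A rough sketch of what I mean by monitoring the brick directory (the brick
path and volume name here are just the ones from my test setup -- adjust
them to yours):

    # count the files landing in the brick as the heal proceeds
    watch -n 10 "find /data/brick_dir -type f | wc -l"

    # or watch the heal queue drain
    watch -n 10 "gluster volume heal gluster_volume info"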

On Thu, Oct 6, 2016 at 10:25 PM, Sergei Gerasenko <sgerasenko74 at gmail.com>
wrote:

> I've simulated the problem on 4 VMs in a distributed replicated setup with
> a replica factor of 2. In each of my tests I repeatedly tore down a VM and
> brought it back up from a snapshot.
>
> What has worked so far is this:
>
>
>    1. Make a copy of /var/lib/glusterd from the affected machine and save
>    it elsewhere.
>    2. Configure your new machine (in my case I reverted to a VM
>    snapshot). Assign the same IP address and hostname!
>    3. Install gluster.
>    4. Stop the daemons if they are running.
>    5. Nuke the /var/lib/glusterd directory and replace it with the copy
>    saved in step 1.
>    6. Create the brick directory.
>    7. Get the volume-id extended attribute from a healthy node like so:
>    getfattr -e base64 -n trusted.glusterfs.volume-id /data/brick_dir
>    8. Apply that volume-id extended attribute to the new brick like so:
>    setfattr -n trusted.glusterfs.volume-id -v 'the_value_you_got_in_7==' /data/brick_dir
>    9. Start the daemons.
>    10. FUSE mount the gluster volume through the daemons running
>    locally. So /etc/fstab would contain something like:
>    localhost:/gluster_volume /mnt/gluster  glusterfs _netdev,defaults  0 0
>    11. On the healthy partner machine, with another FUSE mount of the
>    same volume, do something like: find /mnt/fuse | xargs stat.
>    12. The mount in step 10 will make files appear under the mount point
>    on the new box, but the files are not going to be physically in the
>    brick directory -- yet. See step 13.
>    13. Run the heal command from the same host where you ran find. That
>    will finally sync the files to the brick. Run the heal info command
>    periodically and the number of files being healed should eventually go
>    down to 0. (A consolidated sketch of the commands for steps 7-13 follows
>    this list.)
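>
> For reference, a rough consolidated sketch of steps 7-13 in command form.
> The volume name (gluster_volume), brick path (/data/brick_dir) and the use
> of systemctl are just what I assumed in my tests -- adjust to your setup:
>
>    # step 7: on a healthy replica of the same brick, read the volume-id xattr
>    getfattr -e base64 -n trusted.glusterfs.volume-id /data/brick_dir
>
>    # step 8: on the rebuilt node, stamp the new, empty brick with that value
>    setfattr -n trusted.glusterfs.volume-id -v 'the_value_you_got_in_7==' /data/brick_dir
>
>    # steps 9-10: start the daemons and FUSE mount the volume locally
>    systemctl start glusterd
>    mount -t glusterfs localhost:/gluster_volume /mnt/gluster
>
>    # step 11: on the healthy partner, walk the volume through its FUSE mount
>    find /mnt/fuse | xargs stat
>
>    # step 13: trigger the heal, then check its progress periodically
>    gluster volume heal gluster_volume
>    gluster volume heal gluster_volume info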
>
> That's my experience with the VMs today.
>
> On Wed, Oct 5, 2016 at 4:46 PM, Joe Julian <joe at julianfamily.org> wrote:
>
>> What I always do is just shut it down, repair (or replace) the brick,
>> then start it up again with "... start $volname force".
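>>
>> Spelled out (with $volname set to your volume's name), that last step is:
>>
>>    gluster volume start $volname force
>>    gluster volume status $volname
>>
>> The force start brings any offline brick process back up, and status lets
>> you confirm the repaired brick is online again.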
>>
>> On October 5, 2016 11:27:36 PM GMT+02:00, Sergei Gerasenko <
>> sgerasenko74 at gmail.com> wrote:
>>>
>>> Hi, sorry if this has been asked before, but the documentation in various
>>> sources is a bit conflicting about what to do exactly.
>>>
>>> I have a 6-node, distributed replicated cluster with a replica factor
>>> of 2, so it's 3 pairs of servers. I need to remove a server from one of
>>> those replica sets, rebuild it, and put it back in.
>>>
>>> What's the tried and proven sequence of steps for this? Any pointers
>>> would be very useful.
>>>
>>> Thanks!
>>>   Sergei
>>>
>> --
>> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>>
>>
>
>