[Gluster-users] Replica 3 - how to replace failed node (peer)

Karthik Subrahmanya ksubrahm at redhat.com
Thu Apr 11 10:45:52 UTC 2019


On Thu, Apr 11, 2019 at 1:40 PM Strahil Nikolov <hunter86_bg at yahoo.com>
wrote:

> Hi Karthik,
>
> - the volume configuration you were using?
> I used the oVirt 4.2.6 Gluster Wizard, so I guess we need to involve the
> oVirt devs here.
> - why you wanted to replace your brick?
> I deployed the arbiter at another location because I thought I could deploy
> the Thin Arbiter (still waiting for the docs to be updated), but once I
> realized that GlusterD doesn't support Thin Arbiter, I had to build another
> machine for a local arbiter - thus a replacement was needed.
>
We are working on supporting Thin-arbiter with GlusterD. Once that is done, we
will post an update on the users list so that you can play with it and let us
know your experience.

> - which brick(s) you tried replacing?
> I was replacing the old arbiter with a new one
> - what problem(s) did you face?
> All oVirt VMs got paused due to I/O errors.
>
There could be many reasons for this. Without knowing the exact state of
the system at that time, I am hesitant to comment on it.

>
> In the end, I rebuilt the whole setup and never tried to replace
> the brick this way (I used only reset-brick, which didn't cause any issues).
>
> As I mentioned, that was on v3.12, which is not the default for oVirt
> 4.3.x - so my guess is that it is OK now (current is v5.5).
>
I don't remember anyone complaining about this recently. This should work
in the latest releases.
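For anyone landing on this thread later, a minimal sketch of what that would
look like on a recent release follows. The volume name "gv0imagestore" and the
new brick path are only illustrative placeholders (neither is confirmed
anywhere in this thread), so adapt them to your own setup:

  # swap the failed brick for a new, empty one (a different path or host)
  gluster volume replace-brick gv0imagestore \
      node2.san:/tank/gluster/gv0imagestore/brick1 \
      node2.san:/tank/gluster/newraid/brick1 \
      commit force

  # trigger self-heal onto the new brick and monitor it until it completes
  gluster volume heal gv0imagestore full
  gluster volume heal gv0imagestore info

With commit force the brick is swapped immediately and the data is then healed
onto it from the remaining good replicas.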

>
> Just sharing my experience.
>
Highly appreciated.

Regards,
Karthik

>
> Best Regards,
> Strahil Nikolov
>
> On Thursday, 11 April 2019 at 00:53:52 GMT-4, Karthik Subrahmanya <
> ksubrahm at redhat.com> wrote:
>
>
> Hi Strahil,
>
> Can you give us some more insights on
> - the volume configuration you were using?
> - why you wanted to replace your brick?
> - which brick(s) you tried replacing?
> - what problem(s) did you face?
>
> Regards,
> Karthik
>
> On Thu, Apr 11, 2019 at 10:14 AM Strahil <hunter86_bg at yahoo.com> wrote:
>
> Hi Karthik,
> I used the brick replace function only once, when I wanted to change my
> Arbiter (v3.12.15 in oVirt 4.2.7), and it was a complete disaster.
> Most probably I should have stopped the source arbiter before doing that,
> but the docs didn't mention it.
>
> Thus I always use reset-brick, as it has never let me down.
>
> Best Regards,
> Strahil Nikolov
> On Apr 11, 2019 07:34, Karthik Subrahmanya <ksubrahm at redhat.com> wrote:
>
> Hi Strahil,
>
> Thank you for sharing your experience with the reset-brick option.
> Since he is using gluster version 3.7.6, the reset-brick [1] option is not
> available there; it was introduced in 3.9.0. He has to go with replace-brick
> with the force option if he wants to use the same path & name for the new
> brick.
> Yes, it is recommended that the new brick be of the same size as the other
> bricks.
>
> [1]
> https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command
>
> Regards,
> Karthik
>
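(For completeness, the reset-brick flow introduced in 3.9.0 and referenced in
[1] looks roughly like the sketch below. The volume name is again just a
placeholder; the point of reset-brick is that the same host and brick path can
be reused once the underlying disk has been replaced or wiped:)

  # take the failed brick offline
  gluster volume reset-brick gv0imagestore \
      node2.san:/tank/gluster/gv0imagestore/brick1 start

  # bring back a brick with the same host:path; commit force re-adds it
  # even though its contents no longer match the rest of the replica set
  gluster volume reset-brick gv0imagestore \
      node2.san:/tank/gluster/gv0imagestore/brick1 \
      node2.san:/tank/gluster/gv0imagestore/brick1 \
      commit force
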
> On Wed, Apr 10, 2019 at 10:31 PM Strahil <hunter86_bg at yahoo.com> wrote:
>
> I have used reset-brick - but I only changed the brick layout.
> You may give it a try, but I guess you need your new brick to have the same
> amount of space (or more).
>
> Maybe someone more experienced should share a more sound solution.
>
> Best Regards,
> Strahil Nikolov
> On Apr 10, 2019 12:42, Martin Toth <snowmailer at gmail.com>
> wrote:
> >
> > Hi all,
> >
> > I am running replica 3 gluster with 3 bricks. One of my servers failed -
> > all disks are showing errors and the RAID is in a fault state.
> >
> > Type: Replicate
> > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
> > Status: Started
> > Number of Bricks: 1 x 3 = 3
> > Transport-type: tcp
> > Bricks:
> > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
> > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is down
> > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
> >
> > So one of my bricks has totally failed (node2). It went down and all its
> > data is lost (failed RAID on node2). Now I am running only two bricks on 2
> > servers out of 3.
> > This is a really critical problem for us; we could lose all data. I want to
> > add new disks to node2, create a new RAID array on them, and try to replace
> > the failed brick on this node.
> >
> > What is the procedure for replacing Brick2 on node2, can someone advise?
> > I can't find anything relevant in the documentation.
> >
> > Thanks in advance,
> > Martin
>
>