[heketi-devel] Remove Device: Used to distribute all the bricks from device to other devices

Luis Pabon lpabon at chrysalix.org
Thu Feb 16 20:19:32 UTC 2017


FYI, barring some miracle, there is no way this feature will be in by
Sunday.  This feature is one of the hardest parts of Heketi, which is why
https://github.com/heketi/heketi/issues/161 has taken so long.

The brick set is the heart of this change.  A brick set is how Heketi sets
up the replicas in a ring.  For example, in a distributed replicated 2x3
volume, brick A would need A1 and A2 as replicas; therefore A, A1, A2 form
a set.  The same applies to B, B1, B2.

Replacing a device which contains B1 (for example) would need a
replacement brick that satisfies B and B2 for the set to be complete.  The
same applies to EC, where the set is A, A1, ..., A(n).
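As a rough illustration of that constraint, here is a minimal Go sketch
(hypothetical types, not Heketi's actual schema) of a brick set and the check
that a replacement must land in a zone not already used by the remaining
members of the set:

    // A minimal sketch, assuming hypothetical Brick/BrickSet types rather
    // than Heketi's real database entries.
    package main

    import "fmt"

    type Brick struct {
        ID   string
        Zone int
    }

    // BrickSet groups the replicas (A, A1, ..., An) that must live in
    // distinct failure domains (zones).
    type BrickSet struct {
        Bricks []Brick
    }

    // canReplace reports whether a candidate zone is acceptable for the
    // brick being replaced, i.e. no surviving member already uses it.
    func (bs BrickSet) canReplace(replacing string, candidateZone int) bool {
        for _, b := range bs.Bricks {
            if b.ID != replacing && b.Zone == candidateZone {
                return false
            }
        }
        return true
    }

    func main() {
        set := BrickSet{Bricks: []Brick{
            {ID: "B", Zone: 1}, {ID: "B1", Zone: 2}, {ID: "B2", Zone: 3},
        }}
        fmt.Println(set.canReplace("B1", 2)) // true: zone 2 is freed by B1
        fmt.Println(set.canReplace("B1", 1)) // false: zone 1 already holds B
    }

The same check extends to EC sets, since they are just larger sets.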

This is a big change, which requires a good algorithm, execution, and
testing.

- Luis

On Thu, Feb 16, 2017 at 2:25 PM, Mohamed Ashiq Liyazudeen <
mliyazud at redhat.com> wrote:

> Hi Luis,
>
> I agree on adding the VolumeId to the db for bricks. I didn't quite get
> what you mean by brick peers, though.
>
> I wanted to understand the allocator's behavior better based on the number
> of zones. If you look at our example topology file, it has 4 nodes with
> multiple devices, but 2 nodes are associated to each zone. There are only
> two zones, so while creating a replica-three volume, how does the allocator
> create the ring of devices? Mainly, in this case we cannot ignore both zones.
>
> I also wanted to know how we are approaching volume expansion. I thought it
> would use something similar: give the allocator the state of the existing
> volume (where the present bricks are), and the allocator would give back a
> ring without those zones or nodes. But I think (correct me if I am wrong)
> the volume is changed by adding the appropriate bricks, in the sense that a
> replica 3 (3x1) volume has bricks added and becomes a distributed replica 3
> (3x2). I agree this is the way to go; I am just trying to understand the
> allocator better.
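A tiny sketch of that expansion point, under my assumption (hypothetical types,
not Heketi's real VolumeEntry) that expansion keeps the replica count and only
appends whole brick sets, so a replica 3 (3x1) layout becomes 3x2 by adding one
more set of three bricks:

    // Hypothetical model of a replicated volume's layout.
    type Volume struct {
        Replica   int
        BrickSets [][]string // each inner slice is one replica set of size Replica
    }

    // expand appends a complete replica set; the replica count never changes,
    // only the distribute count (len(BrickSets)) grows, e.g. 3x1 -> 3x2.
    func expand(v *Volume, newSet []string) {
        v.BrickSets = append(v.BrickSets, newSet)
    }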
>
> We need this feature to be in by Sunday. I will mostly be working on it,
> and I will definitely mail, but is there any place to chat with you in case
> of doubts and quick questions?
>
> First thing tomorrow I will add the VolumeId and the brick peers (not sure
> what that is exactly).
>
> --
> Ashiq
>
> ----- Original Message -----
> From: "Luis Pabon" <lpabon at chrysalix.org>
> To: "Mohamed Ashiq Liyazudeen" <mliyazud at redhat.com>
> Cc: heketi-devel at gluster.org
> Sent: Thursday, February 16, 2017 11:32:55 PM
> Subject: Re: [heketi-devel] Remove Device: Used to distribute all the
> bricks from device to other devices
>
> After we agree on the algorithm, the first PR would be to add the necessary
> framework to the DB to support #676.
>
> - Luis
>
> On Thu, Feb 16, 2017 at 1:00 PM, Luis Pabon <lpabon at chrysalix.org> wrote:
>
> > Great summary.  Yes, the next step should be to figure out how to enhance
> > the ring to return a brick from another zone.  It could be as simple as:
> >
> > If current bricks in set are in different zones:
> >     Get a ring
> >     Remove disks from the ring in zones already used
> >     Return devices until one is found with the appropriate size
> > else:
> >     Get a ring
> >     Return devices until one is found with the appropriate size
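For illustration, here is a rough Go rendering of that selection logic, using
a hypothetical Device type and ring representation rather than Heketi's actual
allocator API:

    // Device is a hypothetical, simplified view of a device in the ring.
    type Device struct {
        ID   string
        Zone int
        Free uint64
    }

    // pickDevice walks the ring and returns the first device with enough free
    // space for the brick, skipping zones already used by the brick set when
    // the existing members of the set span different zones.
    func pickDevice(ring []Device, usedZones map[int]bool, brickSize uint64,
        enforceZones bool) (Device, bool) {
        for _, d := range ring {
            if enforceZones && usedZones[d.Zone] {
                continue // zone already holds a member of this brick set
            }
            if d.Free >= brickSize {
                return d, true
            }
        }
        return Device{}, false // no suitable device; the caller should fail
    }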
> >
> > Also, the order of the disks may matter.  This part I am not sure of, but
> > we may need to make sure of the order in which the bricks were added to
> > the volume during 'create'.  This may be necessary to determine which of
> > the bricks in the brick set are in different zones.
> >
> > We may have to add new DB entries to the Brick Entry, for example: brick
> > peers and volume ID.
> >
> > - Luis
> >
> > On Wed, Feb 15, 2017 at 2:17 PM, Mohamed Ashiq Liyazudeen <
> > mliyazud at redhat.com> wrote:
> >
> >> Hi,
> >>
> >> This mail is about PR [1].
> >>
> >> Let me start off with what is planned to be done here.
> >>
> >> We only support this feature for Replicate and Distributed Replicate
> >> volumes.
> >> Refer: https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Managing%20Volumes/#replace-brick
> >>
> >> The operation removes all the bricks from the device and starts those
> >> bricks on other devices, based on the allocator. A heal is triggered
> >> automatically for replicate volumes on replace-brick. We allocate and
> >> create a new brick as the replacement, stop the brick to be replaced if
> >> it is not already down (kill the brick process), and then run gluster
> >> replace-brick, which replaces the brick with the new one and also starts
> >> the heals.
> >>
> >> If the other nodes do not have sufficient storage, then this command
> >> should fail.
> >>
> >> 1) If there are no bricks, then tell the user it is clean to remove the
> >> device.
> >> 2) If there are bricks on the device, then find the volume each one
> >> belongs to from the list of volumes; the BrickEntry does not store the
> >> volume name it is associated with.
> >> 3) Move the bricks to other devices by calling the allocator for
> >> replacement devices.
> >> 4) Eliminate the device being removed and all the nodes already
> >> associated with the volume.
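Roughly, the flow in those steps could look like the following Go sketch. The
helper signatures (allocate, replaceBrick) are hypothetical and passed in as
functions; Heketi's real DB and executor APIs are different:

    // Brick is a hypothetical, simplified record; Heketi's real BrickEntry differs.
    type Brick struct{ ID, VolumeID, DeviceID string }

    // removeDeviceBricks sketches steps 1-4 above: find the bricks on the
    // device, resolve each brick's volume, ask the allocator for a replacement
    // brick that avoids the removed device (and the nodes the volume already
    // uses), then hand off to `gluster volume replace-brick ... commit force`.
    func removeDeviceBricks(deviceID string, allBricks []Brick,
        allocate func(volumeID, excludeDeviceID string) (newBrickID string, err error),
        replaceBrick func(volumeID, oldBrickID, newBrickID string) error) error {

        var onDevice []Brick
        for _, b := range allBricks {
            if b.DeviceID == deviceID {
                onDevice = append(onDevice, b)
            }
        }
        if len(onDevice) == 0 {
            return nil // step 1: no bricks here, the device is clean to remove
        }
        for _, b := range onDevice {
            newID, err := allocate(b.VolumeID, deviceID) // steps 3 and 4
            if err != nil {
                return err // e.g. no other device has sufficient free space
            }
            if err := replaceBrick(b.VolumeID, b.ID, newID); err != nil {
                return err
            }
        }
        return nil
    }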
> >>
> >> We missed the zone handling part. If there is a way to give the
> >> allocator the zones and nodes already used by the volume, then the
> >> allocator can return devices from nodes in different zones. I think
> >> steps 2, 3, and 4 will handle the case where there is only one zone.
> >> Let us know if there are any other risks or better ways to use the
> >> allocator.
> >>
> >> [1] https://github.com/heketi/heketi/pull/676
> >>
> >> --
> >> Regards,
> >> Mohamed Ashiq.L
> >>
> >>
> >
> >
>
> --
> Regards,
> Mohamed Ashiq.L
>
> _______________________________________________
> heketi-devel mailing list
> heketi-devel at gluster.org
> http://lists.gluster.org/mailman/listinfo/heketi-devel
>