[heketi-devel] Remove Device: Used to distribute all the bricks from device to other devices

Raghavendra Talur rtalur at redhat.com
Thu Mar 9 00:59:39 UTC 2017


Hi Luis,

Please have a look at PR 710, which has the changes that you requested.

To create the new PR, I have followed the revert-of-a-revert model for
merge commits, as suggested by Linus in
https://raw.githubusercontent.com/git/git/master/Documentation/howto/revert-a-faulty-merge.txt

If you would prefer it done in any other way, please let us know.

Also, these changes do not include the API+Async changes or the refactored
allocator code; I will send those in a few hours. Meanwhile I wanted to put
the simpler stuff out for review.

Thanks,
Raghavendra Talur

On Wed, Feb 22, 2017 at 2:01 PM, Mohamed Ashiq Liyazudeen
<mliyazud at redhat.com> wrote:
> Hi,
>
> The new commit addresses all the comments. Please review and comment on the PR.
>
> Prerequisites, done:
> We have added a VolumeId field to BrickEntry and a VolumeInfo executor call
> that returns the whole volume information from gluster itself (instead of
> saving the brick peers (the brick set), we generate the brick peers from
> this information).
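>
> As a rough sketch (in Go, with illustrative names rather than the actual
> heketi code), generating the brick peers from the ordered brick list
> returned by VolumeInfo could look like this:
>
>     // bricksForSet groups the volume's bricks, in creation order, into
>     // sets of replicaCount bricks and returns the set containing brick.
>     // It returns nil if the brick is not part of the volume.
>     func bricksForSet(bricks []string, replicaCount int, brick string) []string {
>         for i := 0; i+replicaCount <= len(bricks); i += replicaCount {
>             set := bricks[i : i+replicaCount]
>             for _, b := range set {
>                 if b == brick {
>                     return set
>                 }
>             }
>         }
>         return nil
>     }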
>
>
> How does this work:
>
> For a device to be removed:
> First, if the device is empty, then return OK to remove.
> Otherwise, get the list of bricks on the device to be removed and the
> corresponding volume entries for those bricks, and call replace-brick on
> each volume with the brick id (a rough sketch follows below).
>
> In the replace-brick logic:
> 1) First we find the brick set that the brick to be replaced belongs to
> (for example, in a 2x3 distribute-replicate volume [A(A1,A2,A3),
> B(B1,B2,B3)], B2 is in set B).
> We need to find this because the replacement brick must not be placed with
> another brick of the same set (quorum would be lost if that node went down,
> and it is not a good design anyway).
> 2) Call the allocator to give out devices from the same cluster.
> 3) Ignore a device if:
> a) it is the device being removed, or
> b) it belongs to the same node where one of the other bricks in the set is
> present (see the sketch after this list).
> 4) With the above logic we can still use the SimpleAllocator ring to decide
> the brick placement, with a single zone or with multiple zones.
> 5) On failure we return an error, and in the case of NoSpaceError we
> respond with ReplacementNotFound.
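>
> A rough sketch of the device filtering in steps 3) to 5); nodeOfDevice,
> freeSpace and ErrNoReplacement are placeholders, not real heketi
> identifiers:
>
>     // pickReplacement walks the allocator ring and skips devices that are
>     // the device being removed or that sit on a node which already holds
>     // one of the other bricks in the set.
>     func pickReplacement(ring []string, removedDevice string, setNodes map[string]bool, size uint64) (string, error) {
>         for _, device := range ring {
>             if device == removedDevice {
>                 continue
>             }
>             if setNodes[nodeOfDevice(device)] {
>                 continue
>             }
>             if freeSpace(device) >= size {
>                 return device, nil
>             }
>         }
>         // nothing fit: report that no replacement was found
>         return "", ErrNoReplacement
>     }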
>
>
> Note:
> A few basic tests were added for the new VolumeId in BrickEntry, and all
> the failures caused by the change from executor.VolumeInfo to
> executor.SimpleVolumeInfo have been fixed.
> Device remove is kept modular so that it can also be used for node remove.
>
>
> To Be Done:
> Tests to be Added.
>
>
> [1] https://github.com/heketi/heketi/pull/676
>
> -- Ashiq,Talur
> ________________________________
> From: "Luis Pabon" <lpabon at chrysalix.org>
> To: "Mohamed Ashiq Liyazudeen" <mliyazud at redhat.com>
> Cc: heketi-devel at gluster.org
> Sent: Friday, February 17, 2017 1:49:32 AM
>
> Subject: Re: [heketi-devel] Remove Device: Used to distribute all the bricks
> from device to other devices
>
> FYI, barring some miracle, there is no way this feature will be in by
> Sunday.  This feature is one of the hardest parts of Heketi, which is why
> https://github.com/heketi/heketi/issues/161 has taken so long.
>
> The brick set is the heart of this change.  A brick set is how Heketi sets
> up the replicas in a ring.  For example: in a distributed replicated 2x3,
> brick A would need A1 and A2 as replicas.  Therefore, A,A1,A2 are a set.
> Same applies for B,B1,B2.
>
> Replacing a device which contains B1 (for example) would need a replacement
> brick that satisfies B and B2 for the set to be complete.  The same thing
> applies for EC, where a set is A,A1...A(n).
>
> This is a big change, which requires a good algorithm, execution, and
> testing.
>
> - Luis
>
> On Thu, Feb 16, 2017 at 2:25 PM, Mohamed Ashiq Liyazudeen
> <mliyazud at redhat.com> wrote:
>>
>> Hi Luis,
>>
>> I agree on adding the VolumeId to the db for bricks. I did not quite get
>> what you mean by brick peers, though.
>>
>> I wanted to understand the allocator's behaviour better with respect to the
>> number of zones. Our example topology file has 4 nodes with multiple
>> devices, but two nodes are assigned to each zone. So there are only two
>> zones; while creating a replica-three volume, how does the allocator build
>> the ring of devices? In particular, in this case we cannot exclude both
>> zones that are already in use.
>>
>> I also wanted to know how we approach volume expansion. I thought it would
>> use something similar: give the allocator the state of the existing volume
>> (where the present bricks are) and have the allocator return a ring without
>> those zones or nodes. But I think (correct me if I am wrong) the volume is
>> expanded by adding the appropriate bricks, in the sense that a replica 3
>> (3x1) volume gets bricks added and becomes distribute-replicate 3 (3x2). I
>> agree this is the way to go; I am just trying to understand the allocator
>> better.
>>
>> We need this feature to be in by Sunday. I will mostly be working on it and
>> will definitely mail, but is there any place to chat with you in case of
>> doubts and quick answers?
>>
>> First thing tomorrow I will add the VolumeId and brick peers (though I am
>> not sure what the latter is exactly).
>>
>> --
>> Ashiq
>>
>> ----- Original Message -----
>> From: "Luis Pabon" <lpabon at chrysalix.org>
>> To: "Mohamed Ashiq Liyazudeen" <mliyazud at redhat.com>
>> Cc: heketi-devel at gluster.org
>> Sent: Thursday, February 16, 2017 11:32:55 PM
>> Subject: Re: [heketi-devel] Remove Device: Used to distribute all the
>> bricks from device to other devices
>>
>> After we agree on the algorithm, the first PR would be to add the
>> necessary
>> framework to the DB to support #676.
>>
>> - Luis
>>
>> On Thu, Feb 16, 2017 at 1:00 PM, Luis Pabon <lpabon at chrysalix.org> wrote:
>>
>> > Great summary.  Yes, the next step should be to figure out how to
>> > enhance
>> > the ring to return a brick for another zone.  It could be as simple as:
>> >
>> > If current bricks in set are in different zones:
>> >     Get a ring
>> >     Remove disks from the ring in zones already used
>> >     Return devices until one is found with the appropriate size
>> > else:
>> >     Get a ring
>> >     Return devices until one is found with the appropriate size
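>> >
>> > In Go that could look roughly like the sketch below; ringFor, zoneOf and
>> > freeSpace stand in for whatever the allocator actually exposes:
>> >
>> >     // nextDevice returns the first device in the ring with enough space,
>> >     // skipping zones the brick set already occupies when the set spans
>> >     // more than one zone.
>> >     func nextDevice(clusterId string, usedZones map[int]bool, spansZones bool, size uint64) (string, bool) {
>> >         for _, device := range ringFor(clusterId) {
>> >             if spansZones && usedZones[zoneOf(device)] {
>> >                 continue // stay out of zones already used by the set
>> >             }
>> >             if freeSpace(device) >= size {
>> >                 return device, true
>> >             }
>> >         }
>> >         return "", false
>> >     }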
>> >
>> > Also, the order of the disks may matter.  I am not sure of this part, but
>> > we may need to keep track of the order in which the bricks were added to
>> > the volume during 'create'.  This may be necessary to determine which of
>> > the bricks in the brick set are in different zones.
>> >
>> > We may have to add new fields to the Brick Entry in the DB, for example
>> > the brick peers and the volume ID.
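>> >
>> > Something along these lines, purely as a sketch (the field names are
>> > illustrative and the real BrickEntry layout may differ):
>> >
>> >     // Hypothetical shape of the brick record in the DB after the change.
>> >     type BrickEntry struct {
>> >         Id       string
>> >         DeviceId string
>> >         NodeId   string
>> >         Size     uint64
>> >         VolumeId string   // volume this brick belongs to
>> >         Peers    []string // ids of the other bricks in the same set
>> >     }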
>> >
>> > - Luis
>> >
>> > On Wed, Feb 15, 2017 at 2:17 PM, Mohamed Ashiq Liyazudeen <
>> > mliyazud at redhat.com> wrote:
>> >
>> >> Hi,
>> >>
>> >> This mail is about the PR [1].
>> >>
>> >> Let me start off with what is planned here.
>> >>
>> >> We only support this feature for replicate and distribute-replicate
>> >> volumes.
>> >> Refer: https://gluster.readthedocs.io/en/latest/Administrator%20Gui
>> >> de/Managing%20Volumes/#replace-brick
>> >>
>> >> Device remove moves all the bricks off the device and starts these bricks
>> >> on other devices chosen by the allocator. Heal is triggered automatically
>> >> for replicate volumes on replace-brick. We allocate and create a new brick
>> >> as the replacement, stop the brick to be replaced if it is not already
>> >> down (kill the brick process), and then run gluster replace-brick, which
>> >> replaces the brick with the new one and also starts the heals.
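>> >>
>> >> For reference, the gluster replace-brick step could be driven roughly as
>> >> in the sketch below (using os/exec locally; the actual heketi executor
>> >> abstracts how commands are run, so this is only illustrative):
>> >>
>> >>     import (
>> >>         "fmt"
>> >>         "os/exec"
>> >>     )
>> >>
>> >>     // replaceBrickCmd runs the gluster CLI step that swaps oldBrick for
>> >>     // newBrick and kicks off the self-heal on replicate volumes.
>> >>     func replaceBrickCmd(volume, oldBrick, newBrick string) error {
>> >>         cmd := exec.Command("gluster", "volume", "replace-brick",
>> >>             volume, oldBrick, newBrick, "commit", "force")
>> >>         if out, err := cmd.CombinedOutput(); err != nil {
>> >>             return fmt.Errorf("replace-brick failed: %v: %s", err, out)
>> >>         }
>> >>         return nil
>> >>     }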
>> >>
>> >> If the other nodes do not have sufficient storage, then this command
>> >> should fail.
>> >>
>> >> 1) If there are no bricks, tell the user it is clean to remove the
>> >> device.
>> >> 2) If there are bricks on the device, find the volumes they belong to
>> >> from the list of volumes (BrickEntry does not store the name of the
>> >> volume it is associated with).
>> >> 3) Move the bricks to other devices by calling the allocator for the
>> >> devices.
>> >> 4) Exclude the device to be removed and all the nodes that are already
>> >> associated with the volume.
>> >>
>> >> We missed the zone handling part. If there is a way to give the allocator
>> >> the zones and nodes already used by the volume, then the allocator can
>> >> return devices from nodes in different zones. I think steps 2, 3 and 4
>> >> will handle the case where there is only one zone. Let us know if there
>> >> are any other risks or better ways to use the allocator.
>> >>
>> >> [1] https://github.com/heketi/heketi/pull/676
>> >>
>> >> --
>> >> Regards,
>> >> Mohamed Ashiq.L
>> >>
>> >>
>> >
>> >
>>
>> --
>> Regards,
>> Mohamed Ashiq.L
>>
>
>
>
>
>
> --
> Regards,
> Mohamed Ashiq.L
>
>
>