[heketi-devel] Remove Device: Used to distribute all the bricks from device to other devices

Mohamed Ashiq Liyazudeen mliyazud at redhat.com
Wed Feb 22 08:31:37 UTC 2017


The new commit addresses all the comments. Please review and comment on the PR [1].

Prerequisites (done):
We have added a VolumeId field to BrickEntry, and a VolumeInfo executor call that returns the complete information about a volume from gluster itself. Instead of saving the brick peers (the brick set), we generate them from this information.
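
As an illustration of generating the brick peers from the volume information, here is a minimal sketch. It assumes the bricks are reported in the order gluster stores them, where each consecutive group of "replica count" bricks forms one replica set. The function name brickSet and the plain string brick names are hypothetical and are not taken from the PR.

```go
package main

import "fmt"

// brickSet returns the replica set that contains target.
// bricks is the ordered brick list of the volume and replica is the
// replica count. Hypothetical helper for illustration, not Heketi's API.
func brickSet(bricks []string, replica int, target string) ([]string, error) {
	if replica <= 0 || len(bricks)%replica != 0 {
		return nil, fmt.Errorf("brick count %d is not a multiple of replica count %d",
			len(bricks), replica)
	}
	for i, b := range bricks {
		if b == target {
			// Consecutive groups of `replica` bricks form one set.
			start := (i / replica) * replica
			return bricks[start : start+replica], nil
		}
	}
	return nil, fmt.Errorf("brick %q not found in volume", target)
}

func main() {
	// Distribute-replicate 2x3: sets A(A1,A2,A3) and B(B1,B2,B3).
	bricks := []string{"A1", "A2", "A3", "B1", "B2", "B3"}
	set, err := brickSet(bricks, 3, "B2")
	if err != nil {
		panic(err)
	}
	fmt.Println(set) // prints [B1 B2 B3]
}
```

The peers of the brick to be replaced are then every member of the returned set except the brick itself.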

How does this work: 

For a device to be removed:
First, if the device is empty, return OK; it is safe to remove.
Get the list of bricks on the device to be removed, and the corresponding volume entry for each brick.
Call replace brick on the volume with the brick ID.

In the replace brick logic:
1) First we find the brick set to which the brick to be replaced belongs. For example, in a distribute-replicate 2x3 volume [A(A1,A2,A3), B(B1,B2,B3)], B2 is in set B.
We need this because the new brick must not be placed alongside another brick of the same set: two bricks of one set on the same node would cause quorum to be lost if that node went down, and it is not a good design anyway.
2) Call the allocator to give out devices from the same cluster.
3) Ignore a device if:
a) it is the device being removed, or
b) it belongs to a node where one of the other bricks in the set is present.
4) With the above logic we can still use the simpleAllocator ring to decide brick placement, with a single zone or with multiple zones.
5) On failure, return the error; in the case of NoSpaceError, respond with Replacementnotfound.
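
Steps 2, 3 and 5 above can be sketched roughly as follows. This is only an illustration under assumptions: the Device type, the pickReplacement function, the ErrReplacementNotFound value and the free-space check are hypothetical stand-ins, not the code in the PR. The candidate list is assumed to be in the order the allocator ring returned it.

```go
package main

import (
	"errors"
	"fmt"
)

// Device is a simplified, hypothetical stand-in for a device entry.
type Device struct {
	ID     string
	NodeID string
	Free   uint64 // free space, in the same unit as brickSize
}

// ErrReplacementNotFound is a hypothetical error meaning no device qualified.
var ErrReplacementNotFound = errors.New("replacement not found")

// pickReplacement walks the candidates in allocator order and returns the
// first device that is not the one being removed, is not on a node that
// already holds another brick of the set, and has enough free space.
func pickReplacement(candidates []Device, removeID string,
	peerNodes map[string]bool, brickSize uint64) (Device, error) {
	for _, d := range candidates {
		if d.ID == removeID {
			continue // 3a: same device that is being removed
		}
		if peerNodes[d.NodeID] {
			continue // 3b: node already hosts a brick of this set
		}
		if d.Free < brickSize {
			continue // assumed size check: not enough room for the brick
		}
		return d, nil
	}
	return Device{}, ErrReplacementNotFound
}

func main() {
	candidates := []Device{
		{ID: "d1", NodeID: "n1", Free: 100}, // device being removed
		{ID: "d2", NodeID: "n2", Free: 100}, // node holds a peer brick
		{ID: "d3", NodeID: "n4", Free: 10},  // too small
		{ID: "d4", NodeID: "n4", Free: 100}, // acceptable
	}
	peerNodes := map[string]bool{"n2": true, "n3": true}
	d, err := pickReplacement(candidates, "d1", peerNodes, 50)
	if err != nil {
		panic(err)
	}
	fmt.Println(d.ID) // prints d4
}
```

Because the filter only skips entries and never reorders them, whatever zone ordering the ring produced is preserved.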

A few basic tests have been added for the new VolumeId in BrickEntry, and all the failures caused by the change from executor.VolumeInfo to executor.SimpleVolumeInfo have been fixed.
Device remove has been kept modular so that it can be reused for node remove.

To Be Done: 
Tests to be Added. 

[1] https://github.com/heketi/heketi/pull/676 

-- Ashiq,Talur 
----- Original Message -----

From: "Luis Pabon" <lpabon at chrysalix.org> 
To: "Mohamed Ashiq Liyazudeen" <mliyazud at redhat.com> 
Cc: heketi-devel at gluster.org 
Sent: Friday, February 17, 2017 1:49:32 AM 
Subject: Re: [heketi-devel] Remove Device: Used to distribute all the bricks from device to other devices 

FYI, barring some miracle, there is no way this feature will be in by Sunday. This feature is one of the hardest parts of Heketi, which is why https://github.com/heketi/heketi/issues/161 has taken so long.

The brick set is the heart of this change. A brick set is how Heketi sets up the replicas in a ring. For example: in a distributed replicated 2x3, brick A would need A1 and A2 as replicas. Therefore, A,A1,A2 are a set. Same applies for B,B1,B2. 

Replacing a device which contains B1 (for example), would need a replacement brick which satisfies B and B2 for the set to be complete. Same thing applies for EC where it is A,A1...A(n). 

This is a big change, which requires a good algorithm, execution, and testing. 

- Luis 

On Thu, Feb 16, 2017 at 2:25 PM, Mohamed Ashiq Liyazudeen < mliyazud at redhat.com > wrote: 

Hi Luis, 

I agree on adding the VolumeId to the DB for bricks. I didn't get what you mean by brick peers, though.

I wanted to understand the allocator's behavior better with respect to the number of zones. Our example topology file has 4 nodes with multiple devices each, and 2 nodes are associated with each zone. So there are only two zones now; when creating a replica 3 volume, how does the allocator create the ring of devices? In this case we cannot ignore both zones.

I also wanted to know how we approach volume expand. I thought it would do something similar: give the state of the existing volume (where the present bricks are) to the allocator, and the allocator would give back a ring without those zones or nodes. But I think (correct me if I am wrong) the volume is changed by adding the appropriate bricks, in the sense that bricks are added to a replica 3 (3x1) volume to make it distribute-replicate 3 (3x2). I agree this is the way to go; I am just trying to understand the allocator better.

We need this feature to be in by Sunday. I will be the one mostly working on it. I will definitely mail, but is there any place to chat with you for doubts and quick answers?

First thing tomorrow I will add the VolumeId and the brick peers (not sure what they are exactly).


----- Original Message ----- 
From: "Luis Pabon" < lpabon at chrysalix.org > 
To: "Mohamed Ashiq Liyazudeen" < mliyazud at redhat.com > 
Cc: heketi-devel at gluster.org 
Sent: Thursday, February 16, 2017 11:32:55 PM 
Subject: Re: [heketi-devel] Remove Device: Used to distribute all the bricks from device to other devices 

After we agree on the algorithm, the first PR would be to add the necessary 
framework to the DB to support #676. 

- Luis 

On Thu, Feb 16, 2017 at 1:00 PM, Luis Pabon < lpabon at chrysalix.org > wrote: 

> Great summary. Yes, the next step should be to figure out how to enhance 
> the ring to return a brick for another zone. It could be as simple as: 
>     If current bricks in set are in different zones:
>         Get a ring
>         Remove disks from the ring in zones already used
>         Return devices until one is found with the appropriate size
>     else:
>         Get a ring
>         Return devices until one is found with the appropriate size
> Also, the order of the disks may matter. I am not sure about this part, but we
> may need to keep track of the order in which the bricks were added to the
> volume during 'create'. This may be necessary to determine which of the
> bricks in the brick set are in different zones.
> We may have to add new DB fields to the Brick Entry. For example: brick
> peers and volume ID.
> - Luis 
> On Wed, Feb 15, 2017 at 2:17 PM, Mohamed Ashiq Liyazudeen < 
> mliyazud at redhat.com > wrote: 
>> Hi, 
>> This mail talks about the PR[1] 
>> Let me start off with what we plan to do in it.
>> We only support this feature for Replicate and Distribute Replicate
>> volumes.
>> Refer: https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Managing%20Volumes/#replace-brick
>> This removes all the bricks from the device and starts them on other
>> devices chosen by the allocator. Heal is triggered automatically for
>> replicate volumes on replace brick. First a new brick is allocated and
>> created as the replacement. The brick to be replaced is then stopped if it
>> is not already down (the brick process is killed). Then gluster replace
>> brick replaces the brick with the new one and also starts the heal.
>> If the other nodes do not have sufficient storage, this command should
>> fail.
>> 1) If there are no bricks, tell the user that the device is clean to
>> remove.
>> 2) If there are bricks on the device, find the volumes they belong to
>> from the list of volumes. BrickEntry does not have the name of the volume
>> it is associated with.
>> 3) Move the bricks to other devices by calling the allocator for
>> devices.
>> 4) Eliminate the device to be removed and all the nodes which are
>> already associated with the volume.
>> We missed the zone handling part. If there were a way to give the
>> allocator the zones and nodes already used by the volume, the allocator
>> could return devices from a node in a different zone. I think 2, 3 and 4
>> will handle the case where there is only one zone. Let us know if there
>> are any other risks or better ways to use the allocator.
>> [1] https://github.com/heketi/heketi/pull/676 
>> -- 
>> Regards, 
>> Mohamed Ashiq.L 
>> _______________________________________________ 
>> heketi-devel mailing list 
>> heketi-devel at gluster.org 
>> http://lists.gluster.org/mailman/listinfo/heketi-devel 

