[Gluster-devel] geo-rep regression because of node-uuid change
Pranith Kumar Karampuri
pkarampu at redhat.com
Tue Jun 20 09:05:42 UTC 2017
Adding more people to get a consensus about this.
On Tue, Jun 20, 2017 at 1:49 PM, Aravinda <avishwan at redhat.com> wrote:
>
> regards
> Aravinda VK
>
>
> On 06/20/2017 01:26 PM, Xavier Hernandez wrote:
>
>> Hi Pranith,
>>
>> adding gluster-devel, Kotresh and Aravinda,
>>
>> On 20/06/17 09:45, Pranith Kumar Karampuri wrote:
>>
>>>
>>>
>>> On Tue, Jun 20, 2017 at 1:12 PM, Xavier Hernandez
>>> <xhernandez at datalab.es> wrote:
>>>
>>> On 20/06/17 09:31, Pranith Kumar Karampuri wrote:
>>>
>>> The way geo-replication works is:
>>> On each machine, it does a getxattr of node-uuid and checks whether its
>>> own uuid is present in the list. If it is present, the machine considers
>>> itself active; otherwise it considers itself passive. With this change we
>>> are returning all uuids instead of only that of the first-up subvolume,
>>> so all machines think they are ACTIVE, which is apparently bad. That is
>>> the reason. Even I felt bad that we are making this change.
>>>
>>>
>>> And what about changing the content of node-uuid to include some
>>> sort of hierarchy?
>>>
>>> for example:
>>>
>>> a single brick:
>>>
>>> NODE(<guid>)
>>>
>>> AFR/EC:
>>>
>>> AFR[2](NODE(<guid>), NODE(<guid>))
>>> EC[3,1](NODE(<guid>), NODE(<guid>), NODE(<guid>))
>>>
>>> DHT:
>>>
>>> DHT[2](AFR[2](NODE(<guid>), NODE(<guid>)), AFR[2](NODE(<guid>),
>>> NODE(<guid>)))
>>>
>>> This gives a lot of information that can be used to take the
>>> appropriate decisions.
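
As a rough illustration of the hierarchical format proposed above, the sketch below builds such a string from the bottom up, with each layer wrapping the values of its children. The helper names (node, afr, ec, dht) and the placeholder GUIDs are hypothetical and exist only for this example. Treating the second number in EC[3,1] as the redundancy count is an assumption; the thread does not say what it means.

```python
from typing import List


def node(guid: str) -> str:
    """A single brick reports only its own node GUID."""
    return f"NODE({guid})"


def afr(children: List[str]) -> str:
    """A replica set wraps its children and records how many there are."""
    return f"AFR[{len(children)}]({', '.join(children)})"


def ec(children: List[str], redundancy: int) -> str:
    """A disperse set; 'redundancy' is an assumed meaning of the 2nd number."""
    return f"EC[{len(children)},{redundancy}]({', '.join(children)})"


def dht(children: List[str]) -> str:
    """The distribute layer wraps one entry per subvolume."""
    return f"DHT[{len(children)}]({', '.join(children)})"


# Placeholder GUIDs g1..g4 stand in for real node uuids.
value = dht([
    afr([node("g1"), node("g2")]),
    afr([node("g3"), node("g4")]),
])
print(value)
# DHT[2](AFR[2](NODE(g1), NODE(g2)), AFR[2](NODE(g3), NODE(g4)))
```

A consumer such as geo-rep or rebalance could then choose how deep to look: one entry per AFR/EC group, or every NODE leaf.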
>>>
>>>
>>> I guess that is not backward compatible. Shall I CC gluster-devel and
>>> Kotresh/Aravinda?
>>>
>>
>> Is the change we made backward compatible? If we only require the first
>> field to be a GUID to support backward compatibility, we can use something
>> like this:
>>
> No. But the necessary change can be made to the geo-rep code as well if the
> format is changed, since all of these are built and shipped together.
>
> Geo-rep uses node-id as follows:
>
> list = listxattr(node-uuid)
> active_node_uuids = list.split(SPACE)
> active_node_flag = True if self.node_id exists in active_node_uuids else False
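
The pseudocode above can be turned into a small runnable sketch that shows why every worker turns ACTIVE once the xattr carries all uuids. This is not the actual geo-rep code: the function name is_active and the uuid strings are made up for illustration, and the xattr value is passed in as a plain string instead of being read from a mount.

```python
from typing import List


def is_active(node_uuid_value: str, my_uuid: str) -> bool:
    """Return True if my_uuid is one of the space-separated uuids."""
    active_node_uuids = node_uuid_value.split()
    return my_uuid in active_node_uuids


def active_workers(node_uuid_value: str, workers: List[str]) -> List[str]:
    """List the workers that would consider themselves ACTIVE."""
    return [w for w in workers if is_active(node_uuid_value, w)]


# Three hypothetical nodes hosting the bricks of one AFR/EC subvolume.
workers = ["uuid-a", "uuid-b", "uuid-c"]

# Old behavior: only the uuid of the first-up subvolume is returned.
print(active_workers("uuid-a", workers))
# ['uuid-a']

# New behavior: all uuids are returned, so every worker becomes ACTIVE.
print(active_workers("uuid-a uuid-b uuid-c", workers))
# ['uuid-a', 'uuid-b', 'uuid-c']
```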
>
>
>
>> Bricks:
>>
>> <guid>
>>
>> AFR/EC:
>> <guid>(<guid>, <guid>)
>>
>> DHT:
>> <guid>(<guid>(<guid>, ...), <guid>(<guid>, ...))
>>
>> In this case, AFR and EC would return the same <guid> they returned
>> before the patch, but between '(' and ')' they would put the full list of
>> GUIDs of all nodes. The first <guid> can be used by geo-replication. The
>> list after the first <guid> can be used for rebalance.
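
To make the two uses concrete, here is a minimal parser sketch for the <guid>(<guid>, ...) layout described above. It assumes the grammar is exactly "a guid, optionally followed by a parenthesised, comma-separated list of the same thing"; the function names and the g1..g4 placeholders are hypothetical, and no existing Gluster code is implied.

```python
import re
from typing import List, Tuple

# A parsed entry is (guid, children); a brick has no children.
Node = Tuple[str, list]


def parse(value: str) -> Node:
    """Parse '<guid>' or '<guid>(<entry>, <entry>, ...)' into a tree."""
    tokens = re.findall(r"[(),]|[^\s(),]+", value)
    node, pos = _parse_node(tokens, 0)
    if pos != len(tokens):
        raise ValueError("unexpected trailing text")
    return node


def _parse_node(tokens: List[str], pos: int) -> Tuple[Node, int]:
    if pos >= len(tokens) or tokens[pos] in {"(", ")", ","}:
        raise ValueError("expected a guid")
    guid = tokens[pos]
    pos += 1
    children = []
    if pos < len(tokens) and tokens[pos] == "(":
        pos += 1
        while True:
            child, pos = _parse_node(tokens, pos)
            children.append(child)
            if pos < len(tokens) and tokens[pos] == ",":
                pos += 1
                continue
            break
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("expected ')'")
        pos += 1
    return (guid, children), pos


def leaf_guids(node: Node) -> List[str]:
    """Collect the guids of all bricks, e.g. for rebalance."""
    guid, children = node
    if not children:
        return [guid]
    return [g for child in children for g in leaf_guids(child)]


tree = parse("g1(g1(g1, g2), g3(g3, g4))")
print(tree[0])           # first guid, the value geo-replication would use
print(leaf_guids(tree))  # full list of node guids
```

The first field stays a single guid, and the parenthesised part is only read by callers that want the full list.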
>>
>> Not sure if there's any user of node-uuid above DHT.
>>
>> Xavi
>>
>>
>>>
>>>
>>> Xavi
>>>
>>>
>>> On Tue, Jun 20, 2017 at 12:46 PM, Xavier Hernandez
>>> <xhernandez at datalab.es> wrote:
>>>
>>> Hi Pranith,
>>>
>>> On 20/06/17 07:53, Pranith Kumar Karampuri wrote:
>>>
>>> Hi Xavi,
>>> We all made the mistake of not sending out a note about changing the
>>> behavior of the node-uuid xattr so that rebalance can use multiple
>>> nodes for doing rebalance. Because of this, on geo-rep all the workers
>>> are becoming active instead of one per EC/AFR subvolume. So we are
>>> frantically trying to restore the functionality of node-uuid and
>>> introduce a new xattr for the new behavior. Sunil will be sending out
>>> a patch for this.
>>>
>>>
>>> Wouldn't it be better to change geo-rep behavior to use the new data?
>>> I think it's better as it is now, since it gives more information to
>>> upper layers so that they can make more accurate decisions.
>>>
>>> Xavi
>>>
>>>
>>> --
>>> Pranith
--
Pranith