[Gluster-devel] geo-rep regression because of node-uuid change

Tue Jun 20 08:19:38 UTC 2017

regards
Aravinda VK

On 06/20/2017 01:26 PM, Xavier Hernandez wrote:
> Hi Pranith,
>
> adding gluster-devel, Kotresh and Aravinda,
>
> On 20/06/17 09:45, Pranith Kumar Karampuri wrote:
>>
>>
>> On Tue, Jun 20, 2017 at 1:12 PM, Xavier Hernandez
>> <xhernandez at datalab.es> wrote:
>>
>>     On 20/06/17 09:31, Pranith Kumar Karampuri wrote:
>>
>>         The way geo-replication works is:
>>         On each machine, it does a getxattr of node-uuid and checks
>>         whether its own uuid is present in the list. If it is present,
>>         the node considers itself active; otherwise it considers itself
>>         passive. With this change we are giving all uuids instead of
>>         only that of the first-up subvolume, so all machines think they
>>         are ACTIVE, which is apparently bad. So that is the reason.
>>         Even I felt bad that we are doing this change.
>>
>>
>>     And what about changing the content of node-uuid to include some
>>     sort of hierarchy ?
>>
>>     for example:
>>
>>     a single brick:
>>
>>     NODE(<guid>)
>>
>>     AFR/EC:
>>
>>     AFR[2](NODE(<guid>), NODE(<guid>))
>>     EC[3,1](NODE(<guid>), NODE(<guid>), NODE(<guid>))
>>
>>     DHT:
>>
>>     DHT[2](AFR[2](NODE(<guid>), NODE(<guid>)), AFR[2](NODE(<guid>),
>>     NODE(<guid>)))
>>
>>     This gives a lot of information that can be used to take the
>>     appropriate decisions.
>>
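To make the hierarchical proposal above concrete, here is a minimal Python sketch of how a consumer might parse such a value. Everything in it is an assumption for illustration: the grammar is inferred only from the three examples quoted above, the names `Node`, `parse`, `all_uuids` and `first_per_subvolume` are hypothetical, and the short ids stand in for real GUIDs. Note that "first" here means first listed; the proposed format does not say which bricks are up.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# KIND, optional [params], then an opening parenthesis, e.g. "EC[3,1]("
HEADER = re.compile(r"\s*([A-Z]+)(?:\[([0-9,]*)\])?\(")
SEPARATOR = re.compile(r"\s*([,)])")


@dataclass
class Node:
    kind: str                                   # "NODE", "AFR", "EC", "DHT"
    params: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)
    uuid: Optional[str] = None                  # only set for NODE leaves


def _parse(text: str, pos: int) -> Tuple[Node, int]:
    m = HEADER.match(text, pos)
    if m is None:
        raise ValueError("expected KIND[...]( at offset %d" % pos)
    kind = m.group(1)
    params = [int(p) for p in m.group(2).split(",")] if m.group(2) else []
    pos = m.end()
    if kind == "NODE":
        end = text.index(")", pos)              # leaf: everything up to ')'
        return Node(kind, params, [], text[pos:end].strip()), end + 1
    children = []
    while True:
        child, pos = _parse(text, pos)
        children.append(child)
        sep = SEPARATOR.match(text, pos)
        if sep is None:
            raise ValueError("expected ',' or ')' at offset %d" % pos)
        pos = sep.end()
        if sep.group(1) == ")":
            return Node(kind, params, children), pos


def parse(text: str) -> Node:
    node, pos = _parse(text, 0)
    if text[pos:].strip():
        raise ValueError("trailing data after offset %d" % pos)
    return node


def all_uuids(node: Node) -> List[str]:
    """Every leaf id, e.g. for spreading rebalance across nodes."""
    if node.kind == "NODE":
        return [node.uuid]
    return [u for c in node.children for u in all_uuids(c)]


def first_per_subvolume(node: Node) -> List[str]:
    """First listed id of each AFR/EC subvolume (or of a bare brick)."""
    if node.kind == "NODE":
        return [node.uuid]
    if node.kind in ("AFR", "EC"):
        return all_uuids(node)[:1]
    return [u for c in node.children for u in first_per_subvolume(c)]


# Hypothetical ids in place of real GUIDs.
value = "DHT[2](AFR[2](NODE(a1), NODE(a2)), AFR[2](NODE(b1), NODE(b2)))"
tree = parse(value)
rebalance_nodes = all_uuids(tree)
georep_candidates = first_per_subvolume(tree)
```

With this shape, rebalance could use `rebalance_nodes` while geo-rep could pick one entry per subvolume from `georep_candidates`, but both would need code changes to understand the new value.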
>>
>> I guess that is not backward compatible. Shall I CC gluster-devel and
>> Kotresh/Aravinda?
>
> Is the change we did backward compatible? If we only require the
> first field to be a GUID to support backward compatibility, we can use
> something like this:
No. But the necessary change can be made to the Geo-rep code as well if
the format is changed, since all of these are built and shipped together.

Geo-rep uses the node-uuid xattr as follows:

node_uuids = getxattr(node-uuid)
active_node_uuids = node_uuids.split(SPACE)
active_node_flag = self.node_id in active_node_uuids
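
As a rough, runnable Python sketch of the check above and of why the new xattr content makes every worker active: the function name `is_active` and the UUIDs are made up for illustration, and the xattr value is passed in as a plain string rather than read from a brick.

```python
def is_active(node_uuid_xattr: str, my_uuid: str) -> bool:
    """Return True if my_uuid is one of the space-separated UUIDs."""
    return my_uuid in node_uuid_xattr.split()


# Hypothetical UUIDs for the two bricks of one replica pair.
UUID_A = "11111111-1111-1111-1111-111111111111"
UUID_B = "22222222-2222-2222-2222-222222222222"

# Before the change: only the UUID of the first up subvolume is returned.
old_value = UUID_A
# After the change: the UUIDs of all nodes are returned.
new_value = UUID_A + " " + UUID_B

# Evaluate the check as each of the two nodes would.
old_active = [is_active(old_value, u) for u in (UUID_A, UUID_B)]
new_active = [is_active(new_value, u) for u in (UUID_A, UUID_B)]
```

With the old value only the first node sees itself as active; with the new value both do, which is the regression being discussed.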

>
> Bricks:
>
> <guid>
>
> AFR/EC:
> <guid>(<guid>, <guid>)
>
> DHT:
> <guid>(<guid>(<guid>, ...), <guid>(<guid>, ...))
>
> In this case, AFR and EC would return the same <guid> they returned
> before the patch, but between '(' and ')' they would put the full list
> of GUIDs of all nodes. The first <guid> can be used by geo-replication.
> The list after the first <guid> can be used for rebalance.
>
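A small Python sketch of how the two consumers might read that format, assuming the values look exactly like the examples quoted above. The helper names `legacy_uuid` and `listed_uuids` and the UUIDs are hypothetical.

```python
import re

# Canonical 8-4-4-4-12 hexadecimal UUID text form.
UUID_RE = re.compile(r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def legacy_uuid(value: str) -> str:
    """First field: what AFR/EC returned before the patch (for geo-rep)."""
    return value.split("(", 1)[0].strip()


def listed_uuids(value: str) -> list:
    """All UUIDs after the first '(' (for rebalance); [] for a bare brick."""
    start = value.find("(")
    if start == -1:
        return []
    return UUID_RE.findall(value, start)


# Hypothetical UUIDs for one replica pair.
UUID_A = "11111111-1111-1111-1111-111111111111"
UUID_B = "22222222-2222-2222-2222-222222222222"

afr_value = "%s(%s, %s)" % (UUID_A, UUID_A, UUID_B)
first = legacy_uuid(afr_value)
members = listed_uuids(afr_value)
brick_members = listed_uuids(UUID_A)
```

A reader that only splits the value on spaces would get `<guid>(<guid>,` as its first token, so geo-rep would still need the small change mentioned above to take just the first field.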
> Not sure if there's any user of node-uuid above DHT.
>
> Xavi
>
>>
>>
>>
>>     Xavi
>>
>>
>>         On Tue, Jun 20, 2017 at 12:46 PM, Xavier Hernandez
>>         <xhernandez at datalab.es> wrote:
>>
>>             Hi Pranith,
>>
>>             On 20/06/17 07:53, Pranith Kumar Karampuri wrote:
>>
>>                 hi Xavi,
>>                        We all made the mistake of not sending a mail
>>                 about changing the behavior of the node-uuid xattr so
>>                 that rebalance can use multiple nodes for doing
>>                 rebalance. Because of this, on geo-rep all the workers
>>                 are becoming active instead of one per EC/AFR subvolume.
>>                 So we are frantically trying to restore the
>>                 functionality of node-uuid and introduce a new xattr for
>>                 the new behavior. Sunil will be sending out a patch for
>>                 this.
>>
>>
>>             Wouldn't it be better to change geo-rep behavior to use the
>>             new data? I think it's better as it is now, since it gives
>>             more information to upper layers so that they can take more
>>             accurate decisions.
>>
>>             Xavi
>>
>>
>>                 --
>>                 Pranith
>>
>>
>>
>>
>>
>>         --
>>         Pranith
>>
>>
>>
>>
>>
>> -- 
>> Pranith
>