[Bugs] [Bug 1366817] AFR returns the node uuid of the same node for every file in the replica

Sat May 13 03:41:01 UTC 2017

https://bugzilla.redhat.com/show_bug.cgi?id=1366817

Pranith Kumar K <pkarampu at redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(pkarampu at redhat.c |
                   |om)                         |

--- Comment #16 from Pranith Kumar K <pkarampu at redhat.com> ---
(In reply to Nithya Balachandran from comment #14)
> Before answering Xavi's questions, here is an overview of the new approach.
> 
> The current DHT rebalance design is as follows:
> 
> 1. DHT rebalance runs on all nodes, each of which has a node-uuid
> 2. On startup, each rebalance process determines the list of local subvols
> for the node (this is so each rebalance process reduces network reads by
> migrating only those files on the same node) by getting node-uuids for the
> brick root and comparing it against its own. As only the node uuid of the
> first brick is returned, only the node on which the first brick exists will
> actually have any local subvols.
> 3. Each rebalance process lists files on its local nodes and migrates them
> if required.
> 
> 
> 
> The new approach is as follows:
> 
> 1. DHT rebalance runs on all nodes, each of which has a node-uuid
> 2. On startup, each rebalance process determines the list of local subvols
> for the node (this is so each rebalance process reduces network reads by
> migrating only those files on the same node) by getting node-uuids for the
> brick root and comparing it against its own. As all node uuids of every
> brick in the replica set are returned, all the nodes will now have local
> subvols. The node-uuids for each local subvol are saved in an array.
> 3. Each rebalance process:
> - lists files on its local nodes,
> - hashes the gfid of each file,
> - computes hash % (count of node-uuids) to get an index into the node-uuid
> array,
> - migrates the file if the node-uuid at that index matches that of the
> current process.
> 
> 
> This requires that all node-uuids be saved in the same order in DHT on all
> nodes; otherwise there is no guarantee that a particular file will always be
> assigned to the same node (we do not want multiple nodes trying to migrate
> the same file).
> 
> It makes it easier if AFR/EC can return the uuids in the same order as that
> of the bricks so it can be extended to use notifications to determine when a
> brick goes down.
> 
> If a brick is down but the node is up, I would say return the node-uuid if
> available, as the other node can still migrate files. I am not sure how
> easy it will be to determine if a node is up.

At the moment we just return an all-zero uuid for a brick if the getxattr on
that brick fails or if the brick has been down from the beginning.
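
To make the effect of that concrete, here is a minimal Python sketch of the
selection step described in comment #14. It is not the DHT/AFR code: the
function names, the use of crc32 as the hash, and ZERO_UUID as the stand-in
for the all-zero uuid are all assumptions made for illustration.

```python
import uuid
import zlib

# Stand-in for the all-zero uuid returned for a brick that is down
# or whose getxattr failed (assumption for this sketch).
ZERO_UUID = uuid.UUID(int=0)


def owner_index(gfid, node_uuids):
    """Index into node_uuids for this file: hash(gfid) % count."""
    # crc32 is only a placeholder; the hash actually used is not stated here.
    return zlib.crc32(gfid.bytes) % len(node_uuids)


def should_migrate(gfid, node_uuids, my_uuid):
    """True if the rebalance process on my_uuid is the one to migrate gfid."""
    return node_uuids[owner_index(gfid, node_uuids)] == my_uuid


def count_owners(gfid, node_uuids):
    """How many distinct nodes in the list would claim this file."""
    nodes = {u for u in node_uuids if u != ZERO_UUID}
    return sum(should_migrate(gfid, node_uuids, n) for n in nodes)
```

In this sketch, when every process holds the list in the same order,
count_owners() is 1 for every gfid. If one slot holds the all-zero uuid,
count_owners() is 0 for the files that hash to that slot, so no process would
pick them up. If two processes hold the list in different orders, both can
match the same index and claim the same file.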

> 
> A node-uuid must be returned for bad bricks in order to keep the selection
> algo consistent and allow that node to migrate files.

Thanks for the detailed explanation Nithya, I don't think I could have given
such a good response :-).

Xavi,
    We are trying to get this in by 17th May, so one of us will pick this up
for implementation. I am pretty sure you can send this patch in an hour, so if
you have the time to send the patch, just update the bz to say that you are
working on it and we will be happy to get this in.

Pranith
