[Bugs] [Bug 1366817] AFR returns the node uuid of the same node for every file in the replica

Fri May 12 10:16:57 UTC 2017

https://bugzilla.redhat.com/show_bug.cgi?id=1366817

--- Comment #14 from Nithya Balachandran <nbalacha at redhat.com> ---
Before answering Xavi's questions, here is an overview of the new approach.

The current DHT rebalance design is as follows:

1. DHT rebalance runs on all nodes, each of which has a node-uuid
2. On startup, each rebalance process determines the list of local subvols for
the node by getting the node-uuids for the brick root and comparing them against
its own. (This lets each rebalance process reduce network reads by migrating
only files that live on the same node.) As only the node uuid of the first
brick is returned, only the node on which the first brick exists will actually
have any local subvols.
3. Each rebalance process lists files on its local subvols and migrates them if
required.
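
To illustrate step 2, here is a minimal Python sketch (not the DHT code) of how
a process could pick its local subvols. The function name, the subvol names and
the uuids are made up for the example; the only thing taken from the design is
"compare the returned node-uuids against my own".

```python
def local_subvols(reported, my_uuid):
    """Return the subvols that count as local for this rebalance process.

    reported maps a subvol name to the list of node-uuids returned for its
    brick root (hypothetical shape, chosen for the example).
    """
    return [name for name, uuids in reported.items() if my_uuid in uuids]


# Current behaviour: only the node-uuid of the first brick is returned,
# even though replica-0 has bricks on both node-a and node-b.
current = {"replica-0": ["node-a"]}

on_node_a = local_subvols(current, "node-a")  # node-a sees a local subvol
on_node_b = local_subvols(current, "node-b")  # node-b sees none
```

With this input the process on node-b ends up with no local subvols and so
migrates nothing, which is the problem described above.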

The new approach is as follows:

1. DHT rebalance runs on all nodes, each of which has a node-uuid
2. On startup, each rebalance process determines the list of local subvols for
the node by getting the node-uuids for the brick root and comparing them against
its own. (This lets each rebalance process reduce network reads by migrating
only files that live on the same node.) As the node uuids of all bricks in the
replica set are returned, all the nodes will now have local subvols. The
node-uuids for each local subvol are saved in an array.
3. Each rebalance process:
- lists files on its local subvols,
- hashes the gfid of each file,
- computes hash % (count of node-uuids) to get an index into the node-uuid
array,
- migrates the file if the node-uuid at that index matches that of the current
process.
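
The per-file selection above can be sketched in Python as follows. This is only
an illustration, not the actual implementation: the function names are invented,
CRC32 is a stand-in for whatever hash DHT uses on the gfid, and the node-uuids
are placeholder strings.

```python
import uuid
import zlib


def gfid_hash(gfid):
    """Hash the 16 raw bytes of a gfid (CRC32 is an assumption, used as a stand-in)."""
    return zlib.crc32(gfid.bytes)


def should_migrate(gfid, node_uuids, my_uuid):
    """True if the process whose node-uuid is my_uuid is the one that migrates the file."""
    if not node_uuids:
        return False
    index = gfid_hash(gfid) % len(node_uuids)
    return node_uuids[index] == my_uuid


# Hypothetical node-uuids, held in the same order by every process.
nodes = ["node-a", "node-b", "node-c"]
gfid = uuid.UUID("00000000-0000-0000-0000-000000000001")
owners = [n for n in nodes if should_migrate(gfid, nodes, n)]
```

As long as every process holds the same array, `owners` contains exactly one
node for any gfid.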

This requires that all node-uuids be saved in the same order in DHT on all
nodes; otherwise there is no guarantee that a particular file will always be
assigned to the same node (we do not want multiple nodes trying to migrate the
same file).
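
A small Python sketch of why the order matters. Again this is an illustration
under assumptions: CRC32 stands in for the real hash, and the two-node setup and
names are invented. Each node applies the selection to its own copy of the
array, and we count how many nodes claim a given file.

```python
import uuid
import zlib


def owner(gfid, node_uuids):
    """Node-uuid selected for this gfid from one copy of the array."""
    return node_uuids[zlib.crc32(gfid.bytes) % len(node_uuids)]


def claimants(gfid, view_by_node):
    """Nodes that would migrate the file, each one consulting its own array."""
    return [node for node, view in view_by_node.items()
            if owner(gfid, view) == node]


# Same order on both nodes: every file has exactly one claimant.
consistent = {"node-a": ["node-a", "node-b"],
              "node-b": ["node-a", "node-b"]}

# node-b holds the array in the opposite order: a file is claimed either by
# both nodes (index 0) or by neither (index 1), never by exactly one.
mixed = {"node-a": ["node-a", "node-b"],
         "node-b": ["node-b", "node-a"]}

gfid = uuid.UUID(int=42)
ok = claimants(gfid, consistent)
bad = claimants(gfid, mixed)
```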

It would be easier if AFR/EC returned the uuids in the same order as the
bricks, so that this can later be extended to use notifications to determine
when a brick goes down.

If a brick is down but the node is up, I would say return the node-uuid if
available, as the other node can still migrate files. I am not sure how easy
it will be to determine whether a node is up.

A node-uuid must be returned for bad bricks in order to keep the selection algo
consistent and allow that node to migrate files.
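
One way to read the consistency point, sketched in Python with invented names
and uuids: if the entry for a bad brick were dropped, the array length (and so
the modulus) would change, and a process working from the shorter array would
pick a different node for the same hash.

```python
from dataclasses import dataclass


@dataclass
class Brick:
    node_uuid: str   # placeholder strings, not real uuids
    healthy: bool


def node_uuid_array(bricks):
    """One entry per brick, in brick order, bad bricks included."""
    return [b.node_uuid for b in bricks]


def pick(file_hash, node_uuids):
    """Node-uuid selected for a file whose gfid hashes to file_hash."""
    return node_uuids[file_hash % len(node_uuids)]


bricks = [Brick("node-a", True), Brick("node-b", False), Brick("node-c", True)]

full = node_uuid_array(bricks)                       # keeps node-b's slot
filtered = [b.node_uuid for b in bricks if b.healthy]  # drops the bad brick

with_bad = pick(2, full)         # 2 % 3 == 2 -> "node-c"
without_bad = pick(2, filtered)  # 2 % 2 == 0 -> "node-a"
```

Keeping the slot for node-b also means files that hash to its index are still
assigned to it, so that node can migrate them.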
