[Gluster-devel] Geo-rep: Solving changelog ordering problem!
Kotresh Hiremath Ravishankar
khiremat at redhat.com
Thu Sep 3 06:55:24 UTC 2015
Hi DHT Team and Others,
Changelog is a server-side translator that sits above POSIX and records FOPs.
Hence, the order of operations is valid only within a single brick, and the
order of operations across bricks is lost.
e.g., (f1 hashes to brick1 and f2 to brick2)
RENAME f1, f2
>>>>> Re-balance happens, which is very common with Tiering in place<<<<
RENAME f2, f3
The moment re-balance happens, the changelogs related to the same entry are distributed
across bricks, and since geo-rep syncs the changes from each brick independently, it is quite
possible that it processes them in the wrong order and ends up in an inconsistent state on the slave.
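The failure mode above can be illustrated with a small toy model. This is only a sketch,
not geo-rep code: the slave is modelled as a plain set of file names, each brick's changelog
as a list of (src, dst) renames, and a rename whose source is missing on the slave is assumed
to be skipped. The names apply_rename and replay are hypothetical.

```python
def apply_rename(namespace, src, dst):
    """Apply RENAME src -> dst to a set of names; skip it if src is absent (assumption)."""
    if src in namespace:
        namespace.discard(src)
        namespace.add(dst)
        return True
    return False


def replay(brick_logs, brick_order, initial):
    """Replay each brick's changelog in full, one brick after another."""
    slave = set(initial)
    for brick in brick_order:
        for src, dst in brick_logs[brick]:
            apply_rename(slave, src, dst)
    return slave


# RENAME f1, f2 is recorded on brick1; after re-balance, RENAME f2, f3 lands on brick2.
brick_logs = {
    "brick1": [("f1", "f2")],
    "brick2": [("f2", "f3")],
}

in_order = replay(brick_logs, ["brick1", "brick2"], {"f1"})
out_of_order = replay(brick_logs, ["brick2", "brick1"], {"f1"})

print(in_order)      # {'f3'}, matches the master
print(out_of_order)  # {'f2'}, the slave diverges from the master
```

Within each brick the order is intact; the divergence comes purely from which brick's
changelog happens to be processed first.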
Possible solutions:
1. Capture re-balance traffic as well and work out all combinations of FOPs to end
up in the correct state. Though we started thinking along these lines, one corner case
or another always remains, and we still end up syncing out of order.
2. The changes related to an 'entry' (file) should always be captured on the brick
where the entry was first recorded, no matter where the file moves because of re-balance.
This implicitly retains the ordering for an entry, and yet geo-rep can still sync in a
distributed manner from each brick, keeping the performance up.
DHT needs to maintain state for each entry recording where it was first cached (to be precise,
on which brick it was first recorded in the changelog) and always notify the changelog on that
brick of the FOP.
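A minimal sketch of the second solution, under stated assumptions: entries are identified by
a GFID-like key, the per-entry "home brick" state is kept in an in-memory dict (where DHT would
really persist it is left open here), and HomeBrickRecorder and its record method are
hypothetical names, not existing DHT or changelog APIs.

```python
from collections import defaultdict


class HomeBrickRecorder:
    """Routes every FOP for an entry to the changelog of the brick that first recorded it."""

    def __init__(self):
        self.home = {}                       # entry id -> brick that first recorded it
        self.changelogs = defaultdict(list)  # brick -> ordered list of (entry id, FOP)

    def record(self, entry_id, fop, current_brick):
        # The first FOP seen for an entry pins its home brick; later FOPs reuse it,
        # even if re-balance has since moved the file to another brick.
        home = self.home.setdefault(entry_id, current_brick)
        self.changelogs[home].append((entry_id, fop))
        return home


rec = HomeBrickRecorder()
rec.record("gfid-1", ("RENAME", "f1", "f2"), "brick1")
# Re-balance moves the file to brick2, but the FOP is still logged on brick1.
rec.record("gfid-1", ("RENAME", "f2", "f3"), "brick2")

for brick, log in rec.changelogs.items():
    print(brick, log)
```

Both renames end up in brick1's changelog in the order they happened, so syncing each brick
independently can no longer reorder operations on the same entry.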
I think if we can achieve the second solution, it would solve geo-rep's out-of-order syncing
problem for good.
Let me know your comments and suggestions on this!
Thanks and Regards,
Kotresh H R