[Gluster-devel] RENAME issues in Geo-replication
Aravinda
avishwan at redhat.com
Fri Sep 19 08:39:31 UTC 2014
Hi All,
I have summarized the RENAME issues we have in geo-replication; feel free
to add any that I missed :)
GlusterFS changelogs are stored on each brick and record the changes that
happened on that brick. Georep runs on all the nodes of the master, and
each node processes its changelogs independently. Changelogs are processed
at brick level, but all the fops are replayed on the mount.
Internal fops are not recorded in the changelog. In the RENAME case, only
the RENAME is recorded, in the hashed brick's changelog (DHT's internal
fops, such as creating the linkto file and the unlink, are not recorded).
We need to start working on fixing these issues to stabilize
Geo-replication. Comments and suggestions are welcome.
Renamed file falls into other brick
-----------------------------------
Two bricks(distribute)
CREATE f1
RENAME f1 f2 -> f2 falls in other brick
Now race between b1 and b2
In b1 CREATE f1
In b2 RENAME f1 f2
Issue: Actually not an issue. Georep sends the stat along with RENAME entry
ops; if the source itself is not there on the slave, Georep creates the
target file using the stat.
We have a problem only when the RENAME falls in the other brick and the
file is unlinked on the master.
Possible fix: ?
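
A minimal sketch of the fallback described above, as a toy in-memory
model and not the actual Georep code. The slave is modelled as a dict
from file name to a stat-like dict; replay_rename, the dict layout and
the return strings are hypothetical names used only for illustration.
It covers only the RENAME step; how a late CREATE f1 is handled is not
modelled here.

```python
# Hypothetical model: the slave is a dict mapping file name -> stat-like dict.
def replay_rename(slave, src, dst, stat):
    """Replay a RENAME entry op; `stat` is the stat sent along with it."""
    if src in slave:
        # Normal case: the source exists on the slave, so rename it.
        slave[dst] = slave.pop(src)
        return "renamed"
    # Source is missing on the slave (its CREATE has not been replayed yet):
    # create the target directly from the stat carried by the RENAME.
    slave[dst] = dict(stat)
    return "created-from-stat"

stat = {"gfid": "gfid-1", "mode": 0o100644}

# b1's CREATE f1 was replayed before b2's RENAME.
in_order = {"f1": dict(stat)}
how_in_order = replay_rename(in_order, "f1", "f2", stat)

# b2's RENAME was replayed before b1's CREATE.
raced = {}
how_raced = replay_rename(raced, "f1", "f2", stat)
```

In both orderings the slave ends up with f2 carrying the file's GFID,
which is why the race on its own is harmless.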
Multiple Renames
----------------
CREATE f1
RENAME f1 f2
RENAME f2 f1
f1 falls in brick1 and f2 falls in brick2, so the changelogs are:
Brick1
CREATE f1
RENAME f2 f1
Brick2
RENAME f1 f2
Issue: If Brick1's changelogs are executed first and then Brick2's, the
slave will have f2, while the master has f1.
Possible fix: ?
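
The ordering problem can be reproduced with a small simulation. This is
only a sketch under assumptions: the slave is a dict from file name to
GFID, and replay() and run() are hypothetical helpers, not Georep
functions. The RENAME branch assumes the "create the target if the
source is missing" behaviour described in the previous section.

```python
# Hypothetical model: the slave is a dict mapping file name -> GFID.
def replay(slave, op, gfid, *names):
    if op == "CREATE":
        slave[names[0]] = gfid
    elif op == "RENAME":
        src, dst = names
        if src in slave:
            slave[dst] = slave.pop(src)
        else:
            # Assumed fallback: source missing on the slave, create the target.
            slave[dst] = gfid

GFID = "gfid-1"
# Per-brick changelogs exactly as listed above.
brick1 = [("CREATE", GFID, "f1"), ("RENAME", GFID, "f2", "f1")]
brick2 = [("RENAME", GFID, "f1", "f2")]

def run(order):
    """Replay whole changelogs one brick after another, in the given order."""
    slave = {}
    for changelog in order:
        for op, gfid, *names in changelog:
            replay(slave, op, gfid, *names)
    return slave

brick1_first = run([brick1, brick2])
```

With Brick1 replayed first, "RENAME f2 f1" finds no f2 and leaves f1 in
place, and Brick2's "RENAME f1 f2" then moves it, so brick1_first holds
only f2.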
Active Passive switch in georeplication
---------------------------------------
Setup: Distribute Replica
In any one of the replica sets,
a RENAME is recorded in the Passive brick while the Active brick was down.
When the Active brick comes back, it becomes Active immediately.
Passive Brick
RENAME
Active Brick
MKNOD (From self heal traffic)
Two issues:
1. If the MKNOD is for a sticky bit file, MKNOD will create the sticky bit
file on the slave (under the renamed name), and the old-named file will
still be there. Two files with the same GFID: one the old file and the
other the sticky bit file (target name).
2. If the MKNOD is for the actual file, MKNOD will create a new file on the
slave. The slave will have the old file as well as the new file with the
same GFID.
Possible fix: If a node failed previously, do not let it become Active
again; continue with the current Passive. (I don't know yet how to do
this; as of now we decide Active/Passive depending on node-uuid.)
RENAME repeat - If two replica bricks are active
------------------------------------------------
From one brick it processes,
CREATE f1
RENAME f1 f2
From other brick it processes same changelogs again,
CREATE f1
RENAME f1 f2
Issue: The slave will have both f1 and f2 with the same GFID.
Possible fix: modify MKNOD/CREATE to check the on-disk gfid first and only
then create the file, returning EEXIST when a file exists with the same
gfid but a different name.
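
A rough sketch of what such a check could look like, again on a toy
in-memory model (slave as a dict from file name to GFID).
create_checked is a hypothetical name, and the real check would have to
happen on the slave side against the on-disk gfid; this only
illustrates the intended behaviour.

```python
import errno

def create_checked(slave, name, gfid):
    """Create `name` unless the GFID already exists under another name."""
    for existing_name, existing_gfid in slave.items():
        if existing_gfid == gfid and existing_name != name:
            raise FileExistsError(
                errno.EEXIST,
                f"gfid {gfid} already present as {existing_name}",
            )
    slave[name] = gfid

slave = {}

# Changelog replayed from the first replica brick.
create_checked(slave, "f1", "gfid-1")
slave["f2"] = slave.pop("f1")          # RENAME f1 f2

# Same changelog replayed again from the other replica brick.
try:
    create_checked(slave, "f1", "gfid-1")
    duplicate_created = True
except FileExistsError:
    duplicate_created = False          # EEXIST: f1 is not re-created
```

With the check in place the repeated CREATE is rejected and the slave
keeps only f2.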
--
regards
Aravinda
http://aravindavk.in