[Bugs] [Bug 1141379] Geo-Replication - Fails to handle file renaming correctly between master and slave
bugzilla at redhat.com
Wed Oct 15 10:10:18 UTC 2014
https://bugzilla.redhat.com/show_bug.cgi?id=1141379
Kingsley Tart <gluster at gluster.dogwind.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |gluster at gluster.dogwind.com
--- Comment #6 from Kingsley Tart <gluster at gluster.dogwind.com> ---
I have tested geo-replication on 3.6.0 beta3 on CentOS 6.5 and have managed to
run into what appears to be the same issue.
I posted this to the gluster-users mailing list; my original email is below.
I should add that, ever since geo-replication broke, querying the
volume status (in this instance, on my test1 server) gives this:
test1# gluster volume status
Another transaction is in progress. Please try again after sometime.
It's still giving this error, 24 hours later.
Original message posted to list:
I have a small script to simulate file activity for an application we
have. When I run it, it breaks geo-replication within about 15-20
seconds.
This is on a small Gluster test environment of VMs running CentOS 6.5
and gluster 3.6.0 beta3. I have 6 VMs: test1, test2, test3, test4,
test5 and test6. test1, test2, test3 and test4 are gluster servers,
while test5 and test6 are the clients. test3 is not actually used in
this test.
Before the test, I had a single gluster volume as follows:
test1# gluster volume status
Status of volume: gv0
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick test1:/data/brick/gv0                     49168   Y       12017
Brick test2:/data/brick/gv0                     49168   Y       11835
NFS Server on localhost                         2049    Y       12032
Self-heal Daemon on localhost                   N/A     Y       12039
NFS Server on test4                             2049    Y       7934
Self-heal Daemon on test4                       N/A     Y       7939
NFS Server on test3                             2049    Y       11768
Self-heal Daemon on test3                       N/A     Y       11775
NFS Server on test2                             2049    Y       11849
Self-heal Daemon on test2                       N/A     Y       11855

Task Status of Volume gv0
------------------------------------------------------------------------------
There are no active volume tasks
I created a new volume and set up geo-replication as follows (as these
are test machines I only have one file system on each, hence using
"force" to create the bricks in the root FS):
test4# date ; gluster volume create gv0-slave test4:/data/brick/gv0-slave force; date
Mon Oct 13 15:03:14 BST 2014
volume create: gv0-slave: success: please start the volume to access data
Mon Oct 13 15:03:15 BST 2014
test4# date ; gluster volume start gv0-slave; date
Mon Oct 13 15:03:36 BST 2014
volume start: gv0-slave: success
Mon Oct 13 15:03:39 BST 2014
test4# date ; gluster volume geo-replication gv0 test4::gv0-slave create push-pem force ; date
Mon Oct 13 15:05:59 BST 2014
Creating geo-replication session between gv0 & test4::gv0-slave has been successful
Mon Oct 13 15:06:11 BST 2014
I then mounted volume gv0 on one of the client machines. I could create
files within the gv0 volume and see the changes being replicated to
the gv0-slave volume, so I know that geo-replication was working at the
start.
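For anyone wanting to repeat that check without eyeballing directory
listings, the comparison can be sketched in Python as below. This is only
an illustration, not something used in the original test: it assumes both
the master and the slave volume are mounted locally, and the function
names and any mount paths passed in are hypothetical. Geo-replication is
asynchronous, so a short-lived difference is expected; a difference that
never clears is the symptom of interest.

```python
import os


def list_tree(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def compare_mounts(master_root, slave_root):
    """Return (missing_on_slave, extra_on_slave) as sorted lists.

    master_root and slave_root are assumed to be local mount points of
    the master and slave volumes (hypothetical paths chosen by the caller).
    """
    master = list_tree(master_root)
    slave = list_tree(slave_root)
    return sorted(master - slave), sorted(slave - master)
```

Calling compare_mounts() a few times, some seconds apart, and seeing the
same non-empty lists each time would suggest the slave has stopped
catching up.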
When I run my script (which quickly creates, deletes and renames files),
geo-replication breaks within a very short time. The test script output
is in
http://gluster.dogwind.com/files/georep20141013/test6_script-output.log
(I interrupted the script once I saw that geo-replication was broken).
Note that when it deletes a file, it renames every later-numbered file so
that the file numbering remains sequential with no gaps; this simulates
a real-world application that we use.
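The delete-then-renumber pattern described above can be approximated as
follows. This is a minimal Python sketch and not the actual test.pl: the
"N.msg" file naming, the function names and the 60/40 create/delete mix
are all assumptions made for illustration, and the directory is assumed
to contain nothing but these numbered files.

```python
import os
import random


def numbered(directory):
    """Return the file numbers currently present, sorted ascending."""
    return sorted(int(name.split(".")[0]) for name in os.listdir(directory))


def create_file(directory):
    """Create the next file in the sequence (1.msg, 2.msg, ...)."""
    next_number = len(os.listdir(directory)) + 1
    path = os.path.join(directory, f"{next_number}.msg")
    with open(path, "w") as handle:
        handle.write(f"message {next_number}\n")
    return next_number


def delete_and_renumber(directory, victim):
    """Delete file `victim`, then rename every later file down by one."""
    last = len(os.listdir(directory))
    os.remove(os.path.join(directory, f"{victim}.msg"))
    for number in range(victim + 1, last + 1):
        os.rename(
            os.path.join(directory, f"{number}.msg"),
            os.path.join(directory, f"{number - 1}.msg"),
        )


def run_workload(directory, steps, seed=0):
    """Randomly interleave creates with delete-plus-rename operations."""
    rng = random.Random(seed)
    for _ in range(steps):
        count = len(os.listdir(directory))
        if count == 0 or rng.random() < 0.6:
            create_file(directory)
        else:
            delete_and_renumber(directory, rng.randint(1, count))
```

Pointing run_workload() at an empty directory on the mounted master
volume produces the same kind of burst: each delete is followed
immediately by a chain of renames onto names that existed moments
earlier.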
If you want a copy of the test script, it's here:
http://gluster.dogwind.com/files/georep20141013/test_script.tar.gz
The various gluster log files can be downloaded from here:
http://gluster.dogwind.com/files/georep20141013/ - each log file has the
actual log file path at the top of the file.
If you want to run the test script on your own system, edit test.pl so
that @mailstores contains a directory path to a gluster volume.
My systems' timezone is BST (GMT+1 / UTC+1) so any timestamps outside of
gluster logs are in this timezone.
Let me know if you need any more info.