[Gluster-users] geo-replication breaks on CentOS 6.5 + gluster 3.6.0 beta3
Kingsley
gluster at gluster.dogwind.com
Tue Oct 14 13:27:13 UTC 2014
It's worth adding that since geo-replication broke, querying the volume
status (in this instance, on test1) gives this:
test1# gluster volume status
Another transaction is in progress. Please try again after sometime.
It's still giving this error, 24 hours later.
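
For what it's worth, the way I intend to chase the stuck transaction lock
(untested as yet - the log path and service name below are just what I'd
expect on these CentOS 6.5 boxes, not something confirmed from the logs) is
roughly:

test1# grep -i lock /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | tail
test1# service glusterd restart   # on whichever node still holds the cluster-wide lock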
Cheers,
Kingsley.
On Mon, 2014-10-13 at 16:51 +0100, Kingsley wrote:
> Hi,
>
> I have a small script to simulate file activity for an application we
> have. It breaks geo-replication within about 15 - 20 seconds when I try
> it.
>
> This is on a small Gluster test environment of VMs running CentOS 6.5
> and gluster 3.6.0 beta3. I have 6 VMs - test1, test2, test3, test4,
> test5 and test6. test1, test2, test3 and test4 are gluster servers
> while test5 and test6 are the clients. test3 is not actually used in
> this test.
>
>
> Before the test, I had a single gluster volume as follows:
>
> test1# gluster volume status
> Status of volume: gv0
> Gluster process                               Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick test1:/data/brick/gv0                   49168   Y       12017
> Brick test2:/data/brick/gv0                   49168   Y       11835
> NFS Server on localhost                       2049    Y       12032
> Self-heal Daemon on localhost                 N/A     Y       12039
> NFS Server on test4                           2049    Y       7934
> Self-heal Daemon on test4                     N/A     Y       7939
> NFS Server on test3                           2049    Y       11768
> Self-heal Daemon on test3                     N/A     Y       11775
> NFS Server on test2                           2049    Y       11849
> Self-heal Daemon on test2                     N/A     Y       11855
>
> Task Status of Volume gv0
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
>
> I created a new volume and set up geo-replication as follows (as these
> are test machines I only have one file system on each, hence using
> "force" to create the bricks in the root FS):
>
> test4# date ; gluster volume create gv0-slave test4:/data/brick/gv0-slave force; date
> Mon Oct 13 15:03:14 BST 2014
> volume create: gv0-slave: success: please start the volume to access data
> Mon Oct 13 15:03:15 BST 2014
>
> test4# date ; gluster volume start gv0-slave; date
> Mon Oct 13 15:03:36 BST 2014
> volume start: gv0-slave: success
> Mon Oct 13 15:03:39 BST 2014
>
> test4# date ; gluster volume geo-replication gv0 test4::gv0-slave create push-pem force ; date
> Mon Oct 13 15:05:59 BST 2014
> Creating geo-replication session between gv0 & test4::gv0-slave has been successful
> Mon Oct 13 15:06:11 BST 2014
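>
> (For completeness: I then started the session and checked it with
> commands along these lines - I didn't keep the exact transcript, so
> treat this as illustrative rather than verbatim:)
>
> test4# gluster volume geo-replication gv0 test4::gv0-slave start
> test4# gluster volume geo-replication gv0 test4::gv0-slave status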
>
>
> I then mount volume gv0 on one of the client machines. I can create
> files within the gv0 volume and can see the changes being replicated to
> the gv0-slave volume, so I know that geo-replication is working at the
> start.
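>
> (The mount and the quick sanity check were along these lines - the
> /mnt/gv0 mount point is just an example, not necessarily the exact
> path I used:)
>
> test5# mount -t glusterfs test1:/gv0 /mnt/gv0
> test5# echo hello > /mnt/gv0/georep-test.txt
> test4# ls -l /data/brick/gv0-slave/    # the file shows up here shortly afterwards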
>
> When I run my script (which quickly creates, deletes and renames files),
> geo-replication breaks within a very short time. The test script output
> is in
> http://gluster.dogwind.com/files/georep20141013/test6_script-output.log
> (I interrupted the script once I saw that geo-replication was broken).
> Note that when it deletes a file, it renames any later-numbered file so
> that the file numbering remains sequential with no gaps; this simulates
> a real-world application that we use.
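>
> (If you just want the gist of the workload without downloading the
> tarball, it boils down to something like the loop below - a simplified
> shell sketch of what test.pl does, not the script itself, and
> /mnt/gv0/mailstore is only an example path:)
>
> #!/bin/sh
> dir=/mnt/gv0/mailstore      # directory on the mounted gv0 volume
> mkdir -p "$dir"
> n=0
> while true; do
>     n=$((n + 1))
>     echo "message $n" > "$dir/msg.$n"   # create a new numbered file
>     if [ "$n" -ge 5 ]; then
>         rm -f "$dir/msg.1"              # delete the lowest-numbered file...
>         i=2
>         while [ "$i" -le "$n" ]; do     # ...and rename the rest down one
>             mv "$dir/msg.$i" "$dir/msg.$((i - 1))"
>             i=$((i + 1))
>         done
>         n=$((n - 1))
>     fi
> done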
>
> If you want a copy of the test script, it's here:
> http://gluster.dogwind.com/files/georep20141013/test_script.tar.gz
>
>
> The various gluster log files can be downloaded from here:
> http://gluster.dogwind.com/files/georep20141013/ - each log file has the
> actual log file path at the top of the file.
>
> If you want to run the test script on your own system, edit test.pl so
> that @mailstores contains a directory path to a gluster volume.
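>
> (In other words, something like this, where /mnt/gv0/mailstore is just
> an example path on a client mount of gv0:)
>
> test6# tar xzf test_script.tar.gz
> test6# vi test.pl      # set @mailstores to e.g. '/mnt/gv0/mailstore'
> test6# perl test.pl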
>
> My systems' timezone is BST (GMT+1 / UTC+1) so any timestamps outside of
> gluster logs are in this timezone.
>
> Let me know if you need any more info.
>