[Gluster-users] geo-replication breaks on CentOS 6.5 + gluster 3.6.0 beta3
Kingsley
gluster at gluster.dogwind.com
Tue Oct 14 13:27:13 UTC 2014
It's worth adding that since geo-replication broke, querying the volume
status (in this instance, on test1) gives this:
test1# gluster volume status
Another transaction is in progress. Please try again after sometime.
It's still giving this error, 24 hours later.
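
For what it's worth, the way I intend to chase the stuck transaction lock
(untested as yet - the log path and service name below are just what I'd
expect on these CentOS 6.5 boxes, not something confirmed from the logs) is
roughly:

test1# grep -i lock /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | tail
test1# service glusterd restart   # on whichever node still holds the cluster-wide lock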
Cheers,
Kingsley.
On Mon, 2014-10-13 at 16:51 +0100, Kingsley wrote:
> Hi,
>
> I have a small script to simulate file activity for an application we
> have. It breaks geo-replication within about 15 - 20 seconds when I try
> it.
>
> This is on a small Gluster test environment of VMs running CentOS 6.5
> and gluster 3.6.0 beta3. I have 6 VMs - test1, test2, test3, test4,
> test5 and test6. test1, test2, test3 and test4 are gluster servers
> while test5 and test6 are the clients. test3 is not actually used in
> this test.
>
>
> Before the test, I had a single gluster volume as follows:
>
> test1# gluster volume status
> Status of volume: gv0
> Gluster process                               Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick test1:/data/brick/gv0                   49168   Y       12017
> Brick test2:/data/brick/gv0                   49168   Y       11835
> NFS Server on localhost                       2049    Y       12032
> Self-heal Daemon on localhost                 N/A     Y       12039
> NFS Server on test4                           2049    Y       7934
> Self-heal Daemon on test4                     N/A     Y       7939
> NFS Server on test3                           2049    Y       11768
> Self-heal Daemon on test3                     N/A     Y       11775
> NFS Server on test2                           2049    Y       11849
> Self-heal Daemon on test2                     N/A     Y       11855
>
> Task Status of Volume gv0
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
>
> I created a new volume and set up geo-replication as follows (as these
> are test machines I only have one file system on each, hence using
> "force" to create the bricks in the root FS):
>
> test4# date ; gluster volume create gv0-slave test4:/data/brick/gv0-slave force; date
> Mon Oct 13 15:03:14 BST 2014
> volume create: gv0-slave: success: please start the volume to access data
> Mon Oct 13 15:03:15 BST 2014
>
> test4# date ; gluster volume start gv0-slave; date
> Mon Oct 13 15:03:36 BST 2014
> volume start: gv0-slave: success
> Mon Oct 13 15:03:39 BST 2014
>
> test4# date ; gluster volume geo-replication gv0 test4::gv0-slave create push-pem force ; date
> Mon Oct 13 15:05:59 BST 2014
> Creating geo-replication session between gv0 & test4::gv0-slave has been successful
> Mon Oct 13 15:06:11 BST 2014
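>
> (For completeness: I then started the session and checked it with
> commands along these lines - I didn't keep the exact transcript, so
> treat this as illustrative rather than verbatim:)
>
> test4# gluster volume geo-replication gv0 test4::gv0-slave start
> test4# gluster volume geo-replication gv0 test4::gv0-slave status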
>
>
> I then mount volume gv0 on one of the client machines. I can create
> files within the gv0 volume and can see the changes being replicated to
> the gv0-slave volume, so I know that geo-replication is working at the
> start.
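>
> (The mount and the quick sanity check were along these lines - the
> /mnt/gv0 mount point is just an example, not necessarily the exact
> path I used:)
>
> test5# mount -t glusterfs test1:/gv0 /mnt/gv0
> test5# echo hello > /mnt/gv0/georep-test.txt
> test4# ls -l /data/brick/gv0-slave/    # the file shows up here shortly afterwards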
>
> When I run my script (which quickly creates, deletes and renames files),
> geo-replication breaks within a very short time. The test script output
> is in
> http://gluster.dogwind.com/files/georep20141013/test6_script-output.log
> (I interrupted the script once I saw that geo-replication was broken).
> Note that when it deletes a file, it renames any later-numbered file so
> that the file numbering remains sequential with no gaps; this simulates
> a real-world application that we use.
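>
> (If you just want the gist of the workload without downloading the
> tarball, it boils down to something like the loop below - a simplified
> shell sketch of what test.pl does, not the script itself, and
> /mnt/gv0/mailstore is only an example path:)
>
> #!/bin/sh
> dir=/mnt/gv0/mailstore      # directory on the mounted gv0 volume
> mkdir -p "$dir"
> n=0
> while true; do
>     n=$((n + 1))
>     echo "message $n" > "$dir/msg.$n"   # create a new numbered file
>     if [ "$n" -ge 5 ]; then
>         rm -f "$dir/msg.1"              # delete the lowest-numbered file...
>         i=2
>         while [ "$i" -le "$n" ]; do     # ...and rename the rest down one
>             mv "$dir/msg.$i" "$dir/msg.$((i - 1))"
>             i=$((i + 1))
>         done
>         n=$((n - 1))
>     fi
> done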
>
> If you want a copy of the test script, it's here:
> http://gluster.dogwind.com/files/georep20141013/test_script.tar.gz
>
>
> The various gluster log files can be downloaded from here:
> http://gluster.dogwind.com/files/georep20141013/ - each log file has the
> actual log file path at the top of the file.
>
> If you want to run the test script on your own system, edit test.pl so
> that @mailstores contains a directory path to a gluster volume.
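>
> (In other words, something like this, where /mnt/gv0/mailstore is just
> an example path on a client mount of gv0:)
>
> test6# tar xzf test_script.tar.gz
> test6# vi test.pl      # set @mailstores to e.g. '/mnt/gv0/mailstore'
> test6# perl test.pl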
>
> My systems' timezone is BST (GMT+1 / UTC+1) so any timestamps outside of
> gluster logs are in this timezone.
>
> Let me know if you need any more info.
>