[Gluster-users] Possible to preload data on a georeplication target? First sync taking forever...

Tony Maro tonym at evrichart.com
Mon Jul 8 18:04:37 UTC 2013


I have about 4 TB of data in a Gluster mirror configuration on top of ZFS,
mostly consisting of 20KB files.

I've added a geo-replication target and the sync started OK.  The target is
an SSH destination.  It ran pretty quickly for a while, but it has taken
over two weeks to sync just under 1 TB of data to the target server, and it
appears to be getting slower.
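
For reference, I set the session up with the standard commands; something
like this, with the volume name and slave path swapped for placeholders:

    gluster volume geo-replication myvol ssh://root@geo-target:/data/geo-rep start
    gluster volume geo-replication myvol ssh://root@geo-target:/data/geo-rep status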

The two servers are connected to the same switch on a private segment with
Gigabit Ethernet, so the bottleneck is not the network.  I haven't
physically moved the geo-replication target to the other end of the WAN yet.
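
If it helps rule things out, a raw throughput test between the two boxes
along these lines (assuming iperf is installed on both) is what I'd use to
double-check the link:

    # on the geo-replication target
    iperf -s

    # on the source server
    iperf -c geo-target -t 30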

I really don't want to wait another six weeks (or worse) for the initial
full sync to finish before shipping the server out.  Is it possible to
rsync the data over manually to give the sync its starting position?

If so, what steps should I take?  In other words: break replication, delete
the index, are there special rsync flags I should use if I copy the data
over myself, etc.?
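
To be concrete, the rough sequence I had in mind is below.  The volume
name, host, and paths are placeholders, and the rsync flags are my guess at
what's needed to carry over permissions, ACLs, extended attributes, and
hard links intact:

    # stop the geo-replication session first
    gluster volume geo-replication myvol ssh://root@geo-target:/data/geo-rep stop

    # copy straight from the brick, preserving ACLs (-A),
    # xattrs (-X), and hard links (-H) on top of -a
    rsync -aAXH --numeric-ids /data/brick1/ root@geo-target:/data/geo-rep/

    # restart the session and let it pick up from there
    gluster volume geo-replication myvol ssh://root@geo-target:/data/geo-rep start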

For reference, before anyone asks, the source brick that's running the
geo-replication is reporting the following:

top - 14:01:55 up  3:55,  1 user,  load average: 0.31, 0.74, 0.85
Tasks: 221 total,   1 running, 220 sleeping,   0 stopped,   0 zombie
Cpu(s):  3.6%us,  2.9%sy,  0.0%ni, 83.2%id, 10.2%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:  12297148k total, 12147752k used,   149396k free,    11684k buffers
Swap:    93180k total,        0k used,    93180k free,  3201636k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1711 root      20   0  835m  28m 2484 S  155  0.2  38:25.90 glusterfsd

CPU usage for glusterfsd bounces between around 20% and 160%.
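
I'm eyeballing that from top; assuming sysstat is installed, something like
this would give cleaner per-interval numbers for the PID above:

    # sample glusterfsd CPU usage every 5 seconds
    pidstat -u -p 1711 5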

Thanks,
Tony