[Gluster-users] Is it possible to start geo-replication between two volumes with data already present in the slave?
Aravinda
avishwan at redhat.com
Mon Dec 15 08:33:03 UTC 2014
When you sync files directly with rsync, rsync creates each file on the slave
if it does not already exist. Those files therefore end up with GFIDs that
differ from the ones on the master, and that mismatch prevents geo-rep from
continuing to sync to the slave.
As a prerequisite before starting geo-replication, do the following to fix
the GFID mismatches.
Run this on a master node (it assumes you have the glusterfs source tree
downloaded):
cd $GLUSTER_SRC/extras/geo-rep
sh generate-gfid-file.sh localhost:<MASTER VOL NAME> $PWD/get-gfid.sh /tmp/master-gfid-values.txt
Copy the generated file to the slave:
scp /tmp/master-gfid-values.txt root@slavehost:/tmp/
Run this script on the slave:
cd $GLUSTER_SRC/extras/geo-rep
sh slave-upgrade.sh localhost:<SLAVE VOL NAME> /tmp/master-gfid-values.txt $PWD/gsync-sync-gfid
Once all these steps are complete, the GFIDs in the master volume will match
those in the slave.
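If you want to spot-check this, compare the trusted.gfid xattr of the same
file on a master brick and a slave brick; the values should be identical.
A minimal sketch follows (gfid_check.py is a hypothetical helper, not part of
the Gluster tree; it needs root and Python 3.3+ for os.getxattr):

#!/usr/bin/env python3
# gfid_check.py -- hypothetical helper: print the GFID stored in the
# trusted.gfid xattr of files on a brick backend. Run it as root against the
# same path on a master brick and a slave brick and compare the output.
import os
import sys
import uuid

for path in sys.argv[1:]:
    raw = os.getxattr(path, "trusted.gfid")  # 16-byte binary GFID
    print("%s  %s" % (uuid.UUID(bytes=raw), path))

For example: sudo python3 gfid_check.py /exports/brick1/dir/file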
Now update the stime xattr on each brick root in the master volume. I have
attached a Python script (set_stime.py) that sets the stime of a brick root
to the current time; run it on each master node, once per brick (see the
sketch further below):
sudo python set_stime.py <MASTER VOLUME ID> <SLAVE VOLUME ID> <BRICK PATH>
For example,
sudo python set_stime.py f8c6276f-7ab5-4098-b41d-c82909940799 563681d7-a8fd-4cea-bf97-eca74203a0fe /exports/brick1
You can get the master and slave volume IDs using the gluster volume info
command:
gluster volume info <MASTER VOL> | grep -i "volume id"
gluster volume info <SLAVE VOL> | grep -i "volume id"
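In case the attached set_stime.py is not available, here is a minimal sketch
of what such a script might look like. It assumes the stime xattr on the
brick root is named trusted.glusterfs.<MASTER VOLUME ID>.<SLAVE VOLUME ID>.stime
and stores (seconds, nanoseconds) as two network-order unsigned 32-bit
integers, which is how gsyncd records stime; please verify against your
Gluster version and prefer the attached script.

#!/usr/bin/env python3
# set_stime.py (sketch) -- mark a brick root as "synced up to now" so that
# geo-rep only picks up changes made after this point.
# Assumptions: xattr name and packed format as described above.
import os
import sys
import struct
import time

if len(sys.argv) != 4:
    sys.exit("Usage: set_stime.py <MASTER VOLUME ID> <SLAVE VOLUME ID> <BRICK PATH>")

master_id, slave_id, brick = sys.argv[1:4]
key = "trusted.glusterfs.%s.%s.stime" % (master_id, slave_id)

now = time.time()
sec = int(now)
nsec = int((now - sec) * 1e9)

# Two network-order unsigned ints: seconds and nanoseconds (assumed format)
os.setxattr(brick, key, struct.pack("!II", sec, nsec))
print("Set %s on %s" % (key, brick))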
Once this is done, create the geo-rep session using the force option:
gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> create push-pem force
Start geo-replication:
gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> start force
From now on, geo-rep will pick up only new changes and sync them to the slave.
Let me know if you face any issues.
--
regards
Aravinda
http://aravindavk.in
On 12/14/2014 12:32 AM, Nathan Aldridge wrote:
>
> Hi,
>
> I have a large volume that I want to geo-replicate using Gluster (>
> 1Tb). I have the data rsynced on both servers and up to date. Can I
> start a geo-replication session without having to send the whole
> contents over the wire to the slave, since it’s already there? I’m
> running Gluster 3.6.1.
>
> I’ve read through all the various on-line documents I can find but
> nothing pops out that describes this scenario.
>
> Thanks in advance,
>
> Nathan Aldridge
>
>
>