[Gluster-users] Possible to pre-sync data before geo-rep?

James Le Cuirot chewi at aura-online.co.uk
Fri May 30 09:41:41 UTC 2014


Hi Venky,

On Thu, 29 May 2014 14:09:38 +0530
Venky Shankar <yknev.shankar at gmail.com> wrote:

> There are a couple of things here:
> 
> 1. With 3.5, geo-replication also takes care to maintain the GFIDs
> of files in sync. Syncing data using rsync this way would have
> mangled the GFIDs. This is very similar to an upgrade scenario to
> 3.5, where data synced by geo-rep pre-3.5 would not have the GFIDs
> in sync. So you would need to follow the upgrade steps here[1].

I synced the data before creating the volumes from it. Even so, I knew
to exclude the .glusterfs directory, and I did so when I ran a dry-run
rsync to check whether the two sides had somehow diverged after the
replication started. They hadn't.
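
For reference, the dry-run check was roughly along these lines (the
paths and hostname here are illustrative rather than the exact command
I ran):

  # compare the two bricks, ignoring Gluster's internal metadata
  rsync -avn --delete --exclude='.glusterfs' /mnt/brick/gv0/ slavehost:/mnt/brick/gv0/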

I did see a very long string of errors like this in the log, though:

[2014-05-27 10:26:49.764568] W [master(/mnt/brick/gv0):250:regjob] <top>: Rsync: .gfid/d2fd40a5-06e6-45b6-bf23-152c39c89f79 [errcode: 23]

I don't know what it means. I can't find a .gfid directory in the brick
on either side.
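
My best guess (and it is only a guess) is that .gfid/<uuid> is a
virtual path on the auxiliary GFID mount that gsyncd uses rather than
a real directory on the brick, and that 23 is rsync's "partial
transfer" exit code. Something like the following looked like it
should expose that virtual directory, though I haven't dug any further
(hostname and mount point are made up):

  # mount the volume with the gfid-access virtual .gfid directory exposed
  glusterfs --aux-gfid-mount --volfile-server=localhost --volfile-id=gv0 /mnt/gfid-view
  # a file should then be reachable by its GFID, e.g.
  ls -l /mnt/gfid-view/.gfid/d2fd40a5-06e6-45b6-bf23-152c39c89f79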

> 2. Regarding geo-rep replicating already replicated data, as of now
> there is no *easy* way to _tell_ gsyncd to skip the hybrid crawl and
> start processing live changes (a.k.a. *changelog* mode). Maybe we
> could generate the metadata but not replicate anything, but then
> only if we're fully sure the data is in sync after a session
> restart.

What's the hard way, then? ;) I just thought it was odd given that it
takes rsync less than a minute to determine that the two sides are in
sync. It would be a very useful feature to have. We have three data
centres spread across the country, and sometimes we copy the data
locally before taking the disks out into the field to be placed in
other machines.
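
For what it's worth, the closest knob I could find in the docs is the
change_detector setting. I'm only guessing that it is relevant, and I
haven't tried flipping it on a live session, but is it something along
these lines? (The volume and hostname are just placeholders.)

  # untested guess: tell gsyncd to use the changelog crawl instead of
  # the initial hybrid/xsync crawl for this session
  gluster volume geo-replication gv0 slavehost::gv0 config change_detector changelog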

Regards,
James


