[Gluster-users] Geo-Rep. 3.5.3, Missing Files, Incorrect "Files Pending"
David Gibbons
david.c.gibbons at gmail.com
Tue May 5 01:28:13 UTC 2015
I am having an issue with geo-replication. There were a number of
complications when I upgraded to 3.5.3, but geo-replication was (I
think) working at some point after the upgrade. The volume is accessed
via Samba using vfs_glusterfs.
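For context, the share is exported through Samba's glusterfs VFS
module, roughly like this (a minimal smb.conf sketch; the share name
and option values are illustrative, not my exact config):

    [shares]
        path = /
        read only = no
        vfs objects = glusterfs
        glusterfs:volume = shares
        glusterfs:volfile_server = localhost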
The main issue is that geo-replication has not been sending updated
copies of existing files to the slave. In the scenario where a file is
created -> time passes -> the file is modified -> the file is saved,
the new version is never replicated.
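To make that concrete, here is a minimal reproduction from a client
mount (paths and the wait time are illustrative):

    # on a client mount of the master volume (CIFS or FUSE)
    echo "version 1" > /mnt/shares/test.txt
    # ... wait long enough for geo-rep to ship the initial copy ...
    echo "version 2" > /mnt/shares/test.txt
    # on the slave volume, test.txt still contains "version 1",
    # no matter how long I wait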
Is it possible that one brick is having a geo-rep issue while the
others are not? Consider the status output below.
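It was gathered with the detail form of the status command, something
like this (volume and slave names are those of the session shown in
the table):

    gluster volume geo-replication shares gfs-a-bkp::bkpshares status detail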
MASTER NODE  MASTER VOL  MASTER BRICK                   SLAVE                 STATUS   CHECKPOINT STATUS  CRAWL STATUS  FILES SYNCD  FILES PENDING  BYTES PENDING  DELETES PENDING  FILES SKIPPED
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
gfs-a-1      shares      /mnt/a-1-shares-brick-1/brick  gfs-a-bkp::bkpshares  Active   N/A                Hybrid Crawl  2309456      0              0              0                0
gfs-a-1      shares      /mnt/a-1-shares-brick-2/brick  gfs-a-bkp::bkpshares  Active   N/A                Hybrid Crawl  2315557      0              0              0                0
gfs-a-1      shares      /mnt/a-1-shares-brick-3/brick  gfs-a-bkp::bkpshares  Active   N/A                Hybrid Crawl  2362884      0              0              0                0
gfs-a-1      shares      /mnt/a-1-shares-brick-4/brick  gfs-a-bkp::bkpshares  Active   N/A                Hybrid Crawl  2407600      0              0              0                0
gfs-a-2      shares      /mnt/a-2-shares-brick-1/brick  gfs-a-bkp::bkpshares  Active   N/A                Hybrid Crawl  2409430      0              0              0                0
gfs-a-2      shares      /mnt/a-2-shares-brick-2/brick  gfs-a-bkp::bkpshares  Active   N/A                Hybrid Crawl  2308969      0              0              0                0
gfs-a-2      shares      /mnt/a-2-shares-brick-3/brick  gfs-a-bkp::bkpshares  Active   N/A                Hybrid Crawl  2079576      8191           0              0                0
gfs-a-2      shares      /mnt/a-2-shares-brick-4/brick  gfs-a-bkp::bkpshares  Active   N/A                Hybrid Crawl  2340597      0              0              0                0
gfs-a-3      shares      /mnt/a-3-shares-brick-1/brick  gfs-a-bkp::bkpshares  Passive  N/A                N/A           0            0              0              0                0
gfs-a-3      shares      /mnt/a-3-shares-brick-2/brick  gfs-a-bkp::bkpshares  Passive  N/A                N/A           0            0              0              0                0
gfs-a-3      shares      /mnt/a-3-shares-brick-3/brick  gfs-a-bkp::bkpshares  Passive  N/A                N/A           0            0              0              0                0
gfs-a-3      shares      /mnt/a-3-shares-brick-4/brick  gfs-a-bkp::bkpshares  Passive  N/A                N/A           0            0              0              0                0
gfs-a-4      shares      /mnt/a-4-shares-brick-1/brick  gfs-a-bkp::bkpshares  Passive  N/A                N/A           0            0              0              0                0
gfs-a-4      shares      /mnt/a-4-shares-brick-2/brick  gfs-a-bkp::bkpshares  Passive  N/A                N/A           0            0              0              0                0
gfs-a-4      shares      /mnt/a-4-shares-brick-3/brick  gfs-a-bkp::bkpshares  Passive  N/A                N/A           0            0              0              0                0
gfs-a-4      shares      /mnt/a-4-shares-brick-4/brick  gfs-a-bkp::bkpshares  Passive  N/A                N/A           0            0              0              0                0
This seems to show 8191 FILES PENDING on just one brick, with the
others up to date. I am suspicious of the 8191 number because it looks
like we're at a bucket-size boundary on the backend (8191 is 2^13 - 1,
which smells more like a saturated counter than a real file count).
I've tried stopping and re-starting the geo-rep session, and I've also
tried changing the change_detector from xsync to changelog. Neither
seems to have had an effect.
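For reference, those attempts were along these lines (a sketch, using
the session names from the status output above):

    # stop and restart the geo-rep session
    gluster volume geo-replication shares gfs-a-bkp::bkpshares stop
    gluster volume geo-replication shares gfs-a-bkp::bkpshares start

    # switch the change detector from xsync to changelog
    gluster volume geo-replication shares gfs-a-bkp::bkpshares config change_detector changelog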
It seems like geo-replication is quite wonky in 3.5.x. Is there light
at the end of the tunnel, or should I look for another replication
solution?
Cheers,
Dave