[Gluster-users] Geo-Rep. 3.5.3, Missing Files, Incorrect "Files Pending"

Tue May 5 02:31:26 UTC 2015

Since every status is good, either Active or Passive (none Faulty), we
hope that everything is in sync (except for the wrong numbers in the
status output).

Running find . | wc -l on both the Master and Slave mounts should help in
determining the number of files in sync.
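
A minimal sketch of that comparison, assuming both volumes are mounted on
the machine where it runs. The function names are made up for illustration,
and the mount points in the usage note are hypothetical placeholders.
Besides the raw counts, it lists the paths that exist on the Master mount
but not on the Slave mount. It does not handle file names containing
newlines.

```shell
# Sketch only: compare the trees below two mount points.

# Print the number of entries (files and directories, including ".")
# below the directory given as $1.
count_entries() {
    (cd "$1" && find . | wc -l | tr -d ' ')
}

# Print a sorted list of relative paths below the directory given as $1.
list_entries() {
    (cd "$1" && find . | LC_ALL=C sort)
}

# Print the paths present below $1 (master) but missing below $2 (slave).
missing_on_slave() {
    a=$(mktemp) && b=$(mktemp) || return 1
    list_entries "$1" > "$a"
    list_entries "$2" > "$b"
    LC_ALL=C comm -23 "$a" "$b"
    rm -f "$a" "$b"
}
```

For example, with the hypothetical mount points /mnt/master and /mnt/slave:
count_entries /mnt/master; count_entries /mnt/slave;
missing_on_slave /mnt/master /mnt/slave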

On the Master nodes, check the log messages in
/var/log/glusterfs/geo-replication/ and let us know if anything in them
looks like a problem. On the Slave nodes, look in the
/var/log/glusterfs/geo-replication-slaves directory.
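
One way to narrow the logs down, as a sketch: it assumes the usual Gluster
log line layout, where a bracketed timestamp is followed by a one-letter
severity (E for error, W for warning) and a bracketed source location.
Please verify that against your own logs first. The function name is
hypothetical.

```shell
# Sketch only: print error (E) and warning (W) lines, prefixed with the
# file name, from every *.log file below the directory given as $1.
# Assumed line layout: [timestamp] <severity letter> [source] message
scan_georep_logs() {
    find "$1" -type f -name '*.log' \
        -exec grep -H -E '^\[[^]]*] [EW] \[' {} +
}
```

On a Master node this would be run as
scan_georep_logs /var/log/glusterfs/geo-replication
and on a Slave node as
scan_georep_logs /var/log/glusterfs/geo-replication-slaves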

I see that the workers are still in the Hybrid Crawl state. Please provide
the output of:

gluster volume geo-replication <MASTER> <SLAVEHOST>::<SLAVEVOL> config change_detector

Ideally, after the initial crawl, geo-rep should switch to Changelog crawl.

Why is it hard to show the exact number of files in sync?
--------------------------------------------------------------------------
Geo-rep doesn't have a persistent store of all path names and their sync
status. When geo-rep gets the list of files to be synced, it adds the
number of files to the counter. But if the same files are modified again,
the counter is incremented again, so the numbers in the Status output will
not match the number of files on disk.
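
A toy model of this effect (not geo-rep's actual code; the function names
and batch files are hypothetical): each crawl produces a list of paths, and
a running counter simply adds up the list sizes, so a file that shows up in
two crawls is counted twice.

```shell
# Toy model only. Each argument is a text file holding the paths picked
# up by one crawl, one path per line.

# What a running counter reports: the sum of all batch sizes.
counter_total() {
    cat "$@" | wc -l | tr -d ' '
}

# What is actually on disk: the number of distinct paths.
distinct_total() {
    cat "$@" | LC_ALL=C sort -u | wc -l | tr -d ' '
}
```

If the first crawl lists a.txt and b.txt and a later crawl lists a.txt
again because it was modified, counter_total reports 3 while
distinct_total reports 2.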

In the future we can improve this by maintaining a db/persistent store to
record this information. For now, this is a known limitation.

--
regards
Aravinda

On 05/05/2015 07:49 AM, David Gibbons wrote:
> So I should do a compare out-of-band from Gluster and see what is
> actually in-sync vs out of sync? Is there any easy way just to start
> it over? I am assuming removing and re-adding geo-rep is the easiest
> way. Is that correct?
>
> Thanks,
> Dave
>
> On Mon, May 4, 2015 at 10:09 PM, Aravinda <avishwan at redhat.com> wrote:
>> The Status output has an issue showing the exact number of files in sync.
>> Please check the numbers on disk and let us know if a difference exists
>> between the Master and Slave Volumes.
>>
>> --
>> regards
>> Aravinda
>>
>> On 05/05/2015 06:58 AM, David Gibbons wrote:
>>> I am having an issue with geo-replication. There were a number of
>>> complications when I upgraded to 3.5.3, but geo-replication was (I
>>> think) working at some point. The volume is accessed via samba using
>>> vfs_glusterfs.
>>>
>>> The main issue is that geo-replication has not been sending updated
>>> copies of old files to the replicated server. So in the scenario where
>>> file created -> time passes -> file is modified -> file is saved, the
>>> new version is not replicated.
>>>
>>> Is it possible that one brick is having a geo-rep issue and the others
>>> are not? Consider this output:
>>>
>>> MASTER NODE  MASTER VOL  MASTER BRICK                   SLAVE                 STATUS   CHECKPOINT STATUS  CRAWL STATUS  FILES SYNCD  FILES PENDING  BYTES PENDING  DELETES PENDING  FILES SKIPPED
>>> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>> gfs-a-1      shares      /mnt/a-1-shares-brick-1/brick  gfs-a-bkp::bkpshares  Active   N/A                Hybrid Crawl  2309456      0              0              0                0
>>> gfs-a-1      shares      /mnt/a-1-shares-brick-2/brick  gfs-a-bkp::bkpshares  Active   N/A                Hybrid Crawl  2315557      0              0              0                0
>>> gfs-a-1      shares      /mnt/a-1-shares-brick-3/brick  gfs-a-bkp::bkpshares  Active   N/A                Hybrid Crawl  2362884      0              0              0                0
>>> gfs-a-1      shares      /mnt/a-1-shares-brick-4/brick  gfs-a-bkp::bkpshares  Active   N/A                Hybrid Crawl  2407600      0              0              0                0
>>> gfs-a-2      shares      /mnt/a-2-shares-brick-1/brick  gfs-a-bkp::bkpshares  Active   N/A                Hybrid Crawl  2409430      0              0              0                0
>>> gfs-a-2      shares      /mnt/a-2-shares-brick-2/brick  gfs-a-bkp::bkpshares  Active   N/A                Hybrid Crawl  2308969      0              0              0                0
>>> gfs-a-2      shares      /mnt/a-2-shares-brick-3/brick  gfs-a-bkp::bkpshares  Active   N/A                Hybrid Crawl  2079576      8191           0              0                0
>>> gfs-a-2      shares      /mnt/a-2-shares-brick-4/brick  gfs-a-bkp::bkpshares  Active   N/A                Hybrid Crawl  2340597      0              0              0                0
>>> gfs-a-3      shares      /mnt/a-3-shares-brick-1/brick  gfs-a-bkp::bkpshares  Passive  N/A                N/A           0            0              0              0                0
>>> gfs-a-3      shares      /mnt/a-3-shares-brick-2/brick  gfs-a-bkp::bkpshares  Passive  N/A                N/A           0            0              0              0                0
>>> gfs-a-3      shares      /mnt/a-3-shares-brick-3/brick  gfs-a-bkp::bkpshares  Passive  N/A                N/A           0            0              0              0                0
>>> gfs-a-3      shares      /mnt/a-3-shares-brick-4/brick  gfs-a-bkp::bkpshares  Passive  N/A                N/A           0            0              0              0                0
>>> gfs-a-4      shares      /mnt/a-4-shares-brick-1/brick  gfs-a-bkp::bkpshares  Passive  N/A                N/A           0            0              0              0                0
>>> gfs-a-4      shares      /mnt/a-4-shares-brick-2/brick  gfs-a-bkp::bkpshares  Passive  N/A                N/A           0            0              0              0                0
>>> gfs-a-4      shares      /mnt/a-4-shares-brick-3/brick  gfs-a-bkp::bkpshares  Passive  N/A                N/A           0            0              0              0                0
>>> gfs-a-4      shares      /mnt/a-4-shares-brick-4/brick  gfs-a-bkp::bkpshares  Passive  N/A                N/A           0            0              0              0                0
>>>
>>> This seems to show that there are 8191 files_pending on just one
>>> brick, and the others are up to date. I am suspicious of the 8191
>>> number because it looks like we're at a bucket-size boundary on the
>>> backend. I've tried stopping and re-starting the rep session. I've
>>> also tried changing the change_detector from xsync to changelog.
>>> Neither seems to have had an effect.
>>>
>>> It seems like geo-replication is quite wonky in 3.5.x. Is there light
>>> at the end of the tunnel, or should I find another solution to
>>> replicate?
>>>
>>> Cheers,
>>> Dave
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>