[Gluster-users] Geo-replication: Entry not present on master. Fixing gfid mismatch in slave

David Cunningham dcunningham at voisonics.com
Fri May 29 22:10:56 UTC 2020


Hello,

We're having an issue where a geo-replication worker has unusually high CPU
use and is logging "Entry not present on master. Fixing gfid mismatch in
slave" errors. Can anyone help with this?

We have 3 GlusterFS replica nodes (which we'll call the master), which also
push data to a remote server (the slave) using geo-replication. This has been
running fine for a couple of months, but yesterday one of the master nodes
started having unusually high CPU use. It's this process:

root@cafs30:/var/log/glusterfs# ps aux | grep 32048
root     32048 68.7  0.6 1843140 845756 ?      Rl   02:51 493:51 python2
/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/gsyncd.py worker
gvol0 nvfs10::gvol0 --feedback-fd 15 --local-path
/nodirectwritedata/gluster/gvol0 --local-node cafs30 --local-node-id
b7521445-ee93-4fed-8ced-6a609fa8c7d4 --slave-id
cdcdb210-839c-4306-a4dc-e696b165ed17 --rpc-fd 12,11,9,13 --subvol-num 1
--resource-remote nvfs30 --resource-remote-id
1e698ccd-aeec-4ec4-96fe-383da8fc3b78
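
For reference, the geo-replication session this worker belongs to can be
checked with the standard CLI (the volume and slave names below match our
setup, so adjust as needed):

    gluster volume geo-replication gvol0 nvfs10::gvol0 status detail

That shows each worker's status, the crawl it is in, and the pending
entry/data/meta counts, which may help correlate with the CPU spike.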

Here's what is being logged in
/var/log/glusterfs/geo-replication/gvol0_nvfs10_gvol0/gsyncd.log:

[2020-05-29 21:57:18.843524] I [master(worker
/nodirectwritedata/gluster/gvol0):1470:crawl] _GMaster: slave's time
 stime=(1590789408, 0)
[2020-05-29 21:57:30.626172] I [master(worker
/nodirectwritedata/gluster/gvol0):813:fix_possible_entry_failures]
_GMaster: Entry not present on master. Fixing gfid mismatch in slave.
Deleting the entry    retry_count=1   entry=({u'uid': 108, u'gfid':
u'7c0b75e5-d8b7-454f-8010-112d613c599e', u'gid': 117, u'mode': 33204,
u'entry': u'.gfid/c5422396-1578-4b50-a29d-315be2a9c5d8/00a859f7xxxx.cfg',
u'op': u'CREATE'}, 17, {u'slave_isdir': False, u'gfid_mismatch': True,
u'slave_name': None, u'slave_gfid':
u'ec4b0ace-2ec4-4ea5-adbc-9f519b81917c', u'name_mismatch': False, u'dst':
False})
[2020-05-29 21:57:30.627893] I [master(worker
/nodirectwritedata/gluster/gvol0):813:fix_possible_entry_failures]
_GMaster: Entry not present on master. Fixing gfid mismatch in slave.
Deleting the entry    retry_count=1   entry=({u'uid': 108, u'gfid':
u'a4d52e40-2e2f-4885-be5f-65fe95a8ebd7', u'gid': 117, u'mode': 33204,
u'entry':
u'.gfid/f857c42e-22f1-4ce4-8f2e-13bdadedde45/polycom_00a859f7xxxx.cfg',
u'op': u'CREATE'}, 17, {u'slave_isdir': False, u'gfid_mismatch': True,
u'slave_name': None, u'slave_gfid':
u'ece8da77-b5ea-45a7-9af7-7d4d8f55f74a', u'name_mismatch': False, u'dst':
False})
[2020-05-29 21:57:30.629532] I [master(worker
/nodirectwritedata/gluster/gvol0):813:fix_possible_entry_failures]
_GMaster: Entry not present on master. Fixing gfid mismatch in slave.
Deleting the entry    retry_count=1   entry=({u'uid': 108, u'gfid':
u'3c525ad8-aeb2-46b6-9c41-7fb4987916f8', u'gid': 117, u'mode': 33204,
u'entry':
u'.gfid/f857c42e-22f1-4ce4-8f2e-13bdadedde45/00a859f7xxxx-directory.xml',
u'op': u'CREATE'}, 17, {u'slave_isdir': False, u'gfid_mismatch': True,
u'slave_name': None, u'slave_gfid':
u'06717b5a-d842-495d-bd25-aab9cd454490', u'name_mismatch': False, u'dst':
False})
[2020-05-29 21:57:30.659123] I [master(worker
/nodirectwritedata/gluster/gvol0):942:handle_entry_failures] _GMaster:
Sucessfully fixed entry ops with gfid mismatch     retry_count=1
[2020-05-29 21:57:30.659343] I [master(worker
/nodirectwritedata/gluster/gvol0):1194:process_change] _GMaster: Retry
original entries. count = 1
[2020-05-29 21:57:30.725810] I [master(worker
/nodirectwritedata/gluster/gvol0):1197:process_change] _GMaster:
Sucessfully fixed all entry ops with gfid mismatch
[2020-05-29 21:57:31.747319] I [master(worker
/nodirectwritedata/gluster/gvol0):1954:syncjob] Syncer: Sync Time Taken
duration=0.7409 num_files=18    job=1   return_code=0

We've verified that the files referred to in the errors, such as
polycom_00a859f7xxxx.cfg, do exist on both the master nodes and the slave.
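
If it's useful for diagnosis, the gfids can be compared directly by reading
the trusted.gfid xattr from the brick path on a master node and on the slave
(the path below is only an example, not the file's real location):

    getfattr -n trusted.gfid -e hex /nodirectwritedata/gluster/gvol0/path/to/polycom_00a859f7xxxx.cfg

We would expect the hex values to differ between master and slave if there
really is a gfid mismatch for that entry.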

We found this bug fix:
https://bugzilla.redhat.com/show_bug.cgi?id=1642865

However, that fix went into 5.1, and we're running 5.12 on the master nodes
and the slave, so it should already be included. A couple of GlusterFS
clients connected to the master nodes are running 5.13.
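
(For completeness, the installed version can be confirmed on each node with
glusterfs --version, and the cluster op-version with
gluster volume get all cluster.op-version, in case the mixed 5.12/5.13 setup
is relevant here.)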

Would anyone have any suggestions? Thank you in advance.

-- 
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782