[Gluster-users] Gluster7 GeoReplication Operation not permitted and incomplete sync
Matthew Benstead
matthewb at uvic.ca
Thu Oct 1 21:08:04 UTC 2020
Hello,
I'm looking for some help with a GeoReplication Error in my Gluster
7/CentOS 7 setup. Replication progress has basically stopped, and the
status of the replication keeps switching.
The gsyncd log has errors like "Operation not permitted", "incomplete
sync", etc... help? I'm not sure how to proceed in troubleshooting this.
The log is here, it basically just repeats - from:
/var/log/glusterfs/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.log
[2020-10-01 20:52:15.291923] I [master(worker
/data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken
duration=32.8466 num_files=1749 job=3 return_code=23
[2020-10-01 20:52:18.700062] I [master(worker
/data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken
duration=43.1210 num_files=3167 job=6 return_code=23
[2020-10-01 20:52:23.383234] W [master(worker
/data/storage_c/storage):1393:process] _GMaster: incomplete sync,
retrying changelogs files=['XSYNC-CHANGELOG.1601585397']
[2020-10-01 20:52:28.537657] E [repce(worker
/data/storage_b/storage):213:__call__] RepceClient: call failed
call=258187:140538843596608:1601585515.63 method=entry_ops
error=OSError
[2020-10-01 20:52:28.538064] E [syncdutils(worker
/data/storage_b/storage):339:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 332,
in main
func(args)
File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 86,
in subcmd_worker
local.service_loop(remote)
File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line
1308, in service_loop
g1.crawlwrap(oneshot=True, register_time=register_time)
File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 602,
in crawlwrap
self.crawl()
File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1682,
in crawl
self.process([item[1]], 0)
File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1327,
in process
self.process_change(change, done, retry)
File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1221,
in process_change
failures = self.slave.server.entry_ops(entries)
File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 232, in
__call__
return self.ins(self.meth, *a)
File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 214, in
__call__
raise res
OSError: [Errno 1] Operation not permitted
[2020-10-01 20:52:28.570316] I [repce(agent
/data/storage_b/storage):96:service_loop] RepceServer: terminating on
reaching EOF.
[2020-10-01 20:52:28.613603] I
[gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker
Status Change status=Faulty
[2020-10-01 20:52:29.619797] I [master(worker
/data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken
duration=5.6458 num_files=455 job=3 return_code=23
[2020-10-01 20:52:38.286245] I [master(worker
/data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken
duration=14.1824 num_files=1333 job=2 return_code=23
[2020-10-01 20:52:38.628156] I
[gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker
Status Change status=Initializing...
[2020-10-01 20:52:38.628325] I [monitor(monitor):159:monitor] Monitor:
starting gsyncd worker brick=/data/storage_b/storage
slave_node=10.0.231.82
[2020-10-01 20:52:38.684736] I [gsyncd(agent
/data/storage_b/storage):318:main] <top>: Using session config
file
path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2020-10-01 20:52:38.687213] I [gsyncd(worker
/data/storage_b/storage):318:main] <top>: Using session config
file
path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2020-10-01 20:52:38.687401] I [changelogagent(agent
/data/storage_b/storage):72:__init__] ChangelogAgent: Agent listining...
[2020-10-01 20:52:38.703295] I [resource(worker
/data/storage_b/storage):1386:connect_remote] SSH: Initializing SSH
connection between master and slave...
[2020-10-01 20:52:40.388372] I [resource(worker
/data/storage_b/storage):1435:connect_remote] SSH: SSH connection
between master and slave established. duration=1.6849
[2020-10-01 20:52:40.388582] I [resource(worker
/data/storage_b/storage):1105:connect] GLUSTER: Mounting gluster volume
locally...
[2020-10-01 20:52:41.501105] I [resource(worker
/data/storage_b/storage):1128:connect] GLUSTER: Mounted gluster volume
duration=1.1123
[2020-10-01 20:52:41.501405] I [subcmds(worker
/data/storage_b/storage):84:subcmd_worker] <top>: Worker spawn
successful. Acknowledging back to monitor
[2020-10-01 20:52:43.531146] I [master(worker
/data/storage_b/storage):1640:register] _GMaster: Working dir
path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage
[2020-10-01 20:52:43.533953] I [resource(worker
/data/storage_b/storage):1291:service_loop] GLUSTER: Register time
time=1601585563
[2020-10-01 20:52:43.547092] I [gsyncdstatus(worker
/data/storage_b/storage):281:set_active] GeorepStatus: Worker Status
Change status=Active
[2020-10-01 20:52:43.561920] I [gsyncdstatus(worker
/data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus:
Crawl Status Change status=History Crawl
[2020-10-01 20:52:43.562184] I [master(worker
/data/storage_b/storage):1554:crawl] _GMaster: starting history
crawl turns=1 stime=None entry_stime=None etime=1601585563
[2020-10-01 20:52:43.562269] I [resource(worker
/data/storage_b/storage):1307:service_loop] GLUSTER: No stime available,
using xsync crawl
[2020-10-01 20:52:43.569799] I [master(worker
/data/storage_b/storage):1670:crawl] _GMaster: starting hybrid
crawl stime=None
[2020-10-01 20:52:43.573528] I [gsyncdstatus(worker
/data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus:
Crawl Status Change status=Hybrid Crawl
[2020-10-01 20:52:44.370985] I [master(worker
/data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken
duration=20.4307 num_files=2609 job=5 return_code=23
[2020-10-01 20:52:49.431854] W [master(worker
/data/storage_c/storage):1393:process] _GMaster: incomplete sync,
retrying changelogs files=['XSYNC-CHANGELOG.1601585397']
[2020-10-01 20:52:54.801500] I [master(worker
/data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken
duration=72.7492 num_files=4227 job=2 return_code=23
[2020-10-01 20:52:56.766547] I [master(worker
/data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken
duration=74.3569 num_files=4674 job=5 return_code=23
[2020-10-01 20:53:18.853333] I [master(worker
/data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken
duration=28.7125 num_files=4397 job=3 return_code=23
[2020-10-01 20:53:21.224921] W [master(worker
/data/storage_a/storage):1393:process] _GMaster: incomplete sync,
retrying changelogs files=['CHANGELOG.1601044033',
'CHANGELOG.1601044048', 'CHANGELOG.1601044063', 'CHANGELOG.1601044078',
'CHANGELOG.1601044093', 'CHANGELOG.1601044108', 'CHANGELOG.1601044123']
[2020-10-01 20:53:22.134536] I [master(worker
/data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken
duration=0.2159 num_files=3 job=3 return_code=23
[2020-10-01 20:53:25.615712] I [master(worker
/data/storage_b/storage):1681:crawl] _GMaster: processing xsync
changelog
path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage/xsync/XSYNC-CHANGELOG.1601585563
[2020-10-01 20:53:25.634970] W [master(worker
/data/storage_c/storage):1393:process] _GMaster: incomplete sync,
retrying changelogs files=['XSYNC-CHANGELOG.1601585397']
GeoReplication status - see it change from Active to Faulty:
[root at storage01 ~]# gluster volume geo-replication status
MASTER NODE MASTER VOL MASTER BRICK SLAVE USER
SLAVE SLAVE NODE STATUS
CRAWL STATUS LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.0.231.91 storage /data/storage_a/storage geoaccount
ssh://geoaccount@10.0.231.81::pcic-backup 10.0.231.81 Active
Changelog Crawl 2020-09-25 07:26:57
10.0.231.91 storage /data/storage_c/storage geoaccount
ssh://geoaccount@10.0.231.81::pcic-backup 10.0.231.82 Active
Hybrid Crawl N/A
10.0.231.91 storage /data/storage_b/storage geoaccount
ssh://geoaccount@10.0.231.81::pcic-backup 10.0.231.82 Active
Hybrid Crawl N/A
10.0.231.92 storage /data/storage_b/storage geoaccount
ssh://geoaccount@10.0.231.81::pcic-backup 10.0.231.82 Active
History Crawl 2020-09-23 01:56:05
10.0.231.92 storage /data/storage_a/storage geoaccount
ssh://geoaccount@10.0.231.81::pcic-backup 10.0.231.82 Active
Hybrid Crawl N/A
10.0.231.92 storage /data/storage_c/storage geoaccount
ssh://geoaccount@10.0.231.81::pcic-backup 10.0.231.81 Active
Hybrid Crawl N/A
10.0.231.93 storage /data/storage_c/storage geoaccount
ssh://geoaccount@10.0.231.81::pcic-backup 10.0.231.81 Active
Changelog Crawl 2020-09-25 06:55:57
10.0.231.93 storage /data/storage_b/storage geoaccount
ssh://geoaccount@10.0.231.81::pcic-backup 10.0.231.81 Active
Hybrid Crawl N/A
10.0.231.93 storage /data/storage_a/storage geoaccount
ssh://geoaccount@10.0.231.81::pcic-backup 10.0.231.81 Active
Hybrid Crawl N/A
[root at storage01 ~]# gluster volume geo-replication status
MASTER NODE MASTER VOL MASTER BRICK SLAVE USER
SLAVE SLAVE NODE STATUS
CRAWL STATUS LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.0.231.91 storage /data/storage_a/storage geoaccount
ssh://geoaccount@10.0.231.81::pcic-backup 10.0.231.81 Active
Changelog Crawl 2020-09-25 07:26:57
10.0.231.91 storage /data/storage_c/storage geoaccount
ssh://geoaccount@10.0.231.81::pcic-backup 10.0.231.82 Active
Hybrid Crawl N/A
10.0.231.91 storage /data/storage_b/storage geoaccount
ssh://geoaccount@10.0.231.81::pcic-backup N/A Faulty
N/A N/A
10.0.231.92 storage /data/storage_b/storage geoaccount
ssh://geoaccount@10.0.231.81::pcic-backup 10.0.231.82 Active
History Crawl 2020-09-23 01:58:05
10.0.231.92 storage /data/storage_a/storage geoaccount
ssh://geoaccount@10.0.231.81::pcic-backup 10.0.231.82 Active
Hybrid Crawl N/A
10.0.231.92 storage /data/storage_c/storage geoaccount
ssh://geoaccount@10.0.231.81::pcic-backup N/A Faulty
N/A N/A
10.0.231.93 storage /data/storage_c/storage geoaccount
ssh://geoaccount@10.0.231.81::pcic-backup 10.0.231.81 Active
Changelog Crawl 2020-09-25 06:58:56
10.0.231.93 storage /data/storage_b/storage geoaccount
ssh://geoaccount@10.0.231.81::pcic-backup 10.0.231.81 Active
Hybrid Crawl N/A
10.0.231.93 storage /data/storage_a/storage geoaccount
ssh://geoaccount@10.0.231.81::pcic-backup N/A Faulty
N/A N/A
Cluster information: (Note - disabled performance.open-behind to work
around https://github.com/gluster/glusterfs/issues/1440 )
[root at storage01 ~]# gluster --version | head -1; cat
/etc/centos-release; uname -r
glusterfs 7.7
CentOS Linux release 7.8.2003 (Core)
3.10.0-1127.10.1.el7.x86_64
[root at storage01 ~]# df -h /storage2/
Filesystem Size Used Avail Use% Mounted on
10.0.231.91:/storage 328T 228T 100T 70% /storage2
[root at storage01 ~]# gluster volume info
Volume Name: storage
Type: Distributed-Replicate
Volume ID: cf94a8f2-324b-40b3-bf72-c3766100ea99
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x (2 + 1) = 9
Transport-type: tcp
Bricks:
Brick1: 10.0.231.91:/data/storage_a/storage
Brick2: 10.0.231.92:/data/storage_b/storage
Brick3: 10.0.231.93:/data/storage_c/storage (arbiter)
Brick4: 10.0.231.92:/data/storage_a/storage
Brick5: 10.0.231.93:/data/storage_b/storage
Brick6: 10.0.231.91:/data/storage_c/storage (arbiter)
Brick7: 10.0.231.93:/data/storage_a/storage
Brick8: 10.0.231.91:/data/storage_b/storage
Brick9: 10.0.231.92:/data/storage_c/storage (arbiter)
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
network.ping-timeout: 10
features.inode-quota: on
features.quota: on
nfs.disable: on
features.quota-deem-statfs: on
storage.fips-mode-rchecksum: on
performance.readdir-ahead: on
performance.parallel-readdir: on
cluster.lookup-optimize: on
client.event-threads: 4
server.event-threads: 4
performance.cache-size: 256MB
performance.open-behind: off
Thanks,
-Matthew
More information about the Gluster-users
mailing list