[Gluster-users] Geo-replication keeps failing.

Alvin Starr alvin at netvel.net
Thu Sep 13 11:36:34 UTC 2018


We are running glusterfs-3.8.9-1.el7.x86_64 with geo-replication.

I have been having ongoing problems with the replication failing after 
some time.

Once it has failed, restarting it produces the attached log file snippet.


-- 
Alvin Starr                   ||   land:  (905)513-7688
Netvel Inc.                   ||   Cell:  (416)806-0133
alvin at netvel.net              ||

-------------- next part --------------
[2018-09-12 03:01:04.433048] I [monitor(monitor):267:monitor] Monitor: ------------------------------------------------------------
[2018-09-12 03:01:04.433470] I [monitor(monitor):268:monitor] Monitor: starting gsyncd worker
[2018-09-12 03:01:04.599227] D [gsyncd(agent):730:main_i] <top>: rpc_fd: '9,12,11,10'
[2018-09-12 03:01:04.600925] I [changelogagent(agent):73:__init__] ChangelogAgent: Agent listining...
[2018-09-12 03:01:04.625732] I [gsyncd(/bricks/ccto_us/data):736:main_i] <top>: syncing: gluster://localhost:CCTO-US-EDOCS -> ssh://root@archive2.vpn.sycle.net:gluster://localhost:arch-CCTO-US-EDOCS
[2018-09-12 03:01:04.675003] D [repce(/bricks/ccto_us/data):191:push] RepceClient: call 26412:139692706621248:1536721264.67 __repce_version__() ...
[2018-09-12 03:01:06.518789] D [repce(/bricks/ccto_us/data):209:__call__] RepceClient: call 26412:139692706621248:1536721264.67 __repce_version__ -> 1.0
[2018-09-12 03:01:06.519186] D [repce(/bricks/ccto_us/data):191:push] RepceClient: call 26412:139692706621248:1536721266.52 version() ...
[2018-09-12 03:01:06.522499] D [repce(/bricks/ccto_us/data):209:__call__] RepceClient: call 26412:139692706621248:1536721266.52 version -> 1.0
[2018-09-12 03:01:06.522882] D [repce(/bricks/ccto_us/data):191:push] RepceClient: call 26412:139692706621248:1536721266.52 pid() ...
[2018-09-12 03:01:06.525834] D [repce(/bricks/ccto_us/data):209:__call__] RepceClient: call 26412:139692706621248:1536721266.52 pid -> 2647
[2018-09-12 03:01:06.623212] D [resource(/bricks/ccto_us/data):1281:inhibit] DirectMounter: auxiliary glusterfs mount in place
[2018-09-12 03:01:07.678328] D [resource(/bricks/ccto_us/data):1336:inhibit] DirectMounter: auxiliary glusterfs mount prepared
[2018-09-12 03:01:07.679094] I [master(/bricks/ccto_us/data):83:gmaster_builder] <top>: setting up xsync change detection mode
[2018-09-12 03:01:07.679126] D [monitor(monitor):337:monitor] Monitor: worker(/bricks/ccto_us/data) connected
[2018-09-12 03:01:07.679547] I [master(/bricks/ccto_us/data):367:__init__] _GMaster: using 'rsync' as the sync engine
[2018-09-12 03:01:07.681130] I [master(/bricks/ccto_us/data):83:gmaster_builder] <top>: setting up changelog change detection mode
[2018-09-12 03:01:07.681557] I [master(/bricks/ccto_us/data):367:__init__] _GMaster: using 'rsync' as the sync engine
[2018-09-12 03:01:07.683561] I [master(/bricks/ccto_us/data):83:gmaster_builder] <top>: setting up changeloghistory change detection mode
[2018-09-12 03:01:07.683960] I [master(/bricks/ccto_us/data):367:__init__] _GMaster: using 'rsync' as the sync engine
[2018-09-12 03:01:07.688644] D [repce(/bricks/ccto_us/data):191:push] RepceClient: call 26412:139692706621248:1536721267.69 version() ...
[2018-09-12 03:01:07.689547] D [repce(/bricks/ccto_us/data):209:__call__] RepceClient: call 26412:139692706621248:1536721267.69 version -> 1.0
[2018-09-12 03:01:07.689709] D [master(/bricks/ccto_us/data):726:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6
[2018-09-12 03:01:07.689863] D [repce(/bricks/ccto_us/data):191:push] RepceClient: call 26412:139692706621248:1536721267.69 init() ...
[2018-09-12 03:01:07.706136] D [repce(/bricks/ccto_us/data):209:__call__] RepceClient: call 26412:139692706621248:1536721267.69 init -> None
[2018-09-12 03:01:07.706440] D [repce(/bricks/ccto_us/data):191:push] RepceClient: call 26412:139692706621248:1536721267.71 register('/bricks/ccto_us/data', '/var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6', '/var/log/glusterfs/geo-replication/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS.%2Fbricks%2Fccto_us%2Fdata-changes.log', 7, 5) ...
[2018-09-12 03:01:09.711715] D [repce(/bricks/ccto_us/data):209:__call__] RepceClient: call 26412:139692706621248:1536721267.71 register -> None
[2018-09-12 03:01:09.712357] D [master(/bricks/ccto_us/data):726:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6
[2018-09-12 03:01:09.712651] D [master(/bricks/ccto_us/data):726:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6
[2018-09-12 03:01:09.712901] D [master(/bricks/ccto_us/data):726:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6
[2018-09-12 03:01:09.713129] I [master(/bricks/ccto_us/data):1251:register] _GMaster: xsync temp directory: /var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6/xsync
[2018-09-12 03:01:09.713479] I [resource(/bricks/ccto_us/data):1533:service_loop] GLUSTER: Register time: 1536721269
[2018-09-12 03:01:09.714504] D [repce(/bricks/ccto_us/data):191:push] RepceClient: call 26412:139691772856064:1536721269.71 keep_alive(None,) ...
[2018-09-12 03:01:09.719439] I [master(/bricks/ccto_us/data):510:crawlwrap] _GMaster: primary master with volume id 900656fd-3f13-4ba2-bf04-90832508566e ...
[2018-09-12 03:01:09.723702] D [repce(/bricks/ccto_us/data):209:__call__] RepceClient: call 26412:139691772856064:1536721269.71 keep_alive -> 1
[2018-09-12 03:01:09.726247] I [master(/bricks/ccto_us/data):519:crawlwrap] _GMaster: crawl interval: 1 seconds
[2018-09-12 03:01:09.733443] I [master(/bricks/ccto_us/data):1165:crawl] _GMaster: starting history crawl... turns: 1, stime: (1536718883, 0), etime: 1536721269
[2018-09-12 03:01:09.733824] D [repce(/bricks/ccto_us/data):191:push] RepceClient: call 26412:139692706621248:1536721269.73 history('/bricks/ccto_us/data/.glusterfs/changelogs', 1536718883, 1536721269, 3) ...
[2018-09-12 03:01:09.735060] E [repce(agent):117:worker] <top>: call failed: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/changelogagent.py", line 54, in history
    num_parallel)
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 100, in cl_history_changelog
    cls.raise_changelog_err()
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 27, in raise_changelog_err
    raise ChangelogException(errn, os.strerror(errn))
ChangelogException: [Errno 2] No such file or directory
[2018-09-12 03:01:09.736625] E [repce(/bricks/ccto_us/data):207:__call__] RepceClient: call 26412:139692706621248:1536721269.73 (history) failed on peer with ChangelogException
[2018-09-12 03:01:09.736931] E [resource(/bricks/ccto_us/data):1551:service_loop] GLUSTER: Changelog History Crawl failed, [Errno 2] No such file or directory
[2018-09-12 03:01:09.737512] I [syncdutils(/bricks/ccto_us/data):220:finalize] <top>: exiting.
[2018-09-12 03:01:09.743961] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2018-09-12 03:01:09.744335] I [syncdutils(agent):220:finalize] <top>: exiting.
[2018-09-12 03:01:10.682538] I [monitor(monitor):344:monitor] Monitor: worker(/bricks/ccto_us/data) died in startup phase
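
To make a snippet like the one above easier to read, here is a minimal Python sketch that pulls out the E-level lines and converts the history crawl window to UTC. It is not part of gsyncd. The function names (error_lines, crawl_window) are hypothetical, the line layout is assumed from this snippet only, and treating stime/etime as Unix epoch seconds is an assumption (it agrees with "Register time: 1536721269" being logged at 03:01:09).

```python
import re
from datetime import datetime, timezone

# Layout assumed from the snippet above:
# [timestamp] LEVEL [module(context):line:function] Source: message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) \[(?P<where>[^\]]+)\] (?P<msg>.*)$"
)
# Matches the "starting history crawl" line; values assumed to be epoch seconds.
CRAWL_RE = re.compile(r"stime: \((\d+), \d+\), etime: (\d+)")


def error_lines(log_text):
    """Return (timestamp, location, message) for every E-level log line."""
    found = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if m and m.group("level") == "E":
            found.append((m.group("ts"), m.group("where"), m.group("msg")))
    return found


def crawl_window(log_text):
    """Return (start_utc, end_utc, seconds) of the first history crawl, or None."""
    m = CRAWL_RE.search(log_text)
    if m is None:
        return None
    stime, etime = int(m.group(1)), int(m.group(2))
    start = datetime.fromtimestamp(stime, tz=timezone.utc)
    end = datetime.fromtimestamp(etime, tz=timezone.utc)
    return start, end, etime - stime


# Three lines copied from the snippet above, used as sample input.
SAMPLE = """\
[2018-09-12 03:01:09.733443] I [master(/bricks/ccto_us/data):1165:crawl] _GMaster: starting history crawl... turns: 1, stime: (1536718883, 0), etime: 1536721269
[2018-09-12 03:01:09.736625] E [repce(/bricks/ccto_us/data):207:__call__] RepceClient: call 26412:139692706621248:1536721269.73 (history) failed on peer with ChangelogException
[2018-09-12 03:01:09.736931] E [resource(/bricks/ccto_us/data):1551:service_loop] GLUSTER: Changelog History Crawl failed, [Errno 2] No such file or directory
"""

if __name__ == "__main__":
    for ts, where, msg in error_lines(SAMPLE):
        print(ts, where, msg)
    print(crawl_window(SAMPLE))
```

Under those assumptions, the history crawl in this snippet asks for changelogs from 02:21:23 to 03:01:09 UTC on 2018-09-12, a window of 2386 seconds, and fails immediately with Errno 2.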

