<div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div>Hi all,</div><div><br></div><div>
I am testing a Gluster geo-replication setup with glusterfs version 3.12.14
on CentOS Linux release 7.5.1804, and I am getting a faulty session
due to rsync. It returns error 3. <br></div><div><br></div><div>After I start the session, its status goes from Initializing to Active and finally to Faulty.</div><div>Here is what I see in the logs.</div><div><br></div><div>cat /var/log/glusterfs/geo-replication/mastervol/ssh%3A%2F%2Fgeoaccount%4010.0.2.13%3Agluster%3A%2F%2F127.0.0.1%3Aslavevol.log<br><br>[2018-10-06 08:55:02.246958] I [monitor(monitor):280:monitor] Monitor: starting gsyncd worker brick=/bricks/brick-a1/brick slave_node=ssh://geoaccount@servere:gluster://localhost:slavevol<br>[2018-10-06 08:55:02.503489] I [resource(/bricks/brick-a1/brick):1780:connect_remote] SSH: Initializing SSH connection between master and slave...<br>[2018-10-06 08:55:02.515492] I [changelogagent(/bricks/brick-a1/brick):73:__init__] ChangelogAgent: Agent listining...<br>[2018-10-06 08:55:04.571449] I [resource(/bricks/brick-a1/brick):1787:connect_remote] SSH: SSH connection between master and slave established. duration=2.0676<br>[2018-10-06 08:55:04.571890] I [resource(/bricks/brick-a1/brick):1502:connect] GLUSTER: Mounting gluster volume locally...<br>[2018-10-06 08:55:05.693440] I [resource(/bricks/brick-a1/brick):1515:connect] GLUSTER: Mounted gluster volume duration=1.1212<br>[2018-10-06 08:55:05.693741] I [gsyncd(/bricks/brick-a1/brick):799:main_i] <top>: Closing feedback fd, waking up the monitor<br>[2018-10-06 08:55:07.711970] I [master(/bricks/brick-a1/brick):1518:register] _GMaster: Working dir path=/var/lib/misc/glusterfsd/mastervol/ssh%3A%2F%2Fgeoaccount%4010.0.2.13%3Agluster%3A%2F%2F127.0.0.1%3Aslavevol/9517ac67e25c7491f03ba5e2506505bd<br>[2018-10-06 08:55:07.712357] I [resource(/bricks/brick-a1/brick):1662:service_loop] GLUSTER: Register time time=1538816107<br>[2018-10-06 08:55:07.764151] I [master(/bricks/brick-a1/brick):490:mgmt_lock] _GMaster: Got lock Becoming ACTIVE brick=/bricks/brick-a1/brick<br>[2018-10-06 08:55:07.768949] I [gsyncdstatus(/bricks/brick-a1/brick):276:set_active] GeorepStatus: Worker 
Status Change status=Active<br>[2018-10-06 08:55:07.770529] I [gsyncdstatus(/bricks/brick-a1/brick):248:set_worker_crawl_status] GeorepStatus: Crawl Status Change status=History Crawl<br>[2018-10-06 08:55:07.770975] I [master(/bricks/brick-a1/brick):1432:crawl] _GMaster: starting history crawl turns=1 stime=(1538745843, 0) entry_stime=None etime=1538816107<br>[2018-10-06 08:55:08.773402] I [master(/bricks/brick-a1/brick):1461:crawl] _GMaster: slave's time stime=(1538745843, 0)<br>[2018-10-06 08:55:09.262964] I [master(/bricks/brick-a1/brick):1863:syncjob] Syncer: Sync Time Taken duration=0.0606 num_files=1 job=2 return_code=3<br>[2018-10-06 08:55:09.263253] E [resource(/bricks/brick-a1/brick):210:errlog] Popen: command returned error cmd=rsync -aR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs --existing --xattrs --acls --ignore-missing-args . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-wVbxGU/05b8d7b5dab75575689c0e1a2ec33b3f.sock --compress geoaccount@servere:/proc/12335/cwd error=3<br>[2018-10-06 08:55:09.275593] I [syncdutils(/bricks/brick-a1/brick):271:finalize] <top>: exiting.<br>[2018-10-06 08:55:09.279442] I [repce(/bricks/brick-a1/brick):92:service_loop] RepceServer: terminating on reaching EOF.<br>[2018-10-06 08:55:09.279936] I [syncdutils(/bricks/brick-a1/brick):271:finalize] <top>: exiting.<br>[2018-10-06 08:55:09.698153] I [monitor(monitor):363:monitor] Monitor: worker died in startup phase brick=/bricks/brick-a1/brick<br>[2018-10-06 08:55:09.707330] I [gsyncdstatus(monitor):243:set_worker_status] GeorepStatus: Worker Status Change status=Faulty<br>[2018-10-06 08:55:19.888017] I [monitor(monitor):280:monitor] Monitor: starting gsyncd worker brick=/bricks/brick-a1/brick slave_node=ssh://geoaccount@servere:gluster://localhost:slavevol<br>[2018-10-06 08:55:20.140819] I 
[resource(/bricks/brick-a1/brick):1780:connect_remote] SSH: Initializing SSH connection between master and slave...<br>[2018-10-06 08:55:20.141815] I [changelogagent(/bricks/brick-a1/brick):73:__init__] ChangelogAgent: Agent listining...<br>[2018-10-06 08:55:22.245625] I [resource(/bricks/brick-a1/brick):1787:connect_remote] SSH: SSH connection between master and slave established. duration=2.1046<br>[2018-10-06 08:55:22.246062] I [resource(/bricks/brick-a1/brick):1502:connect] GLUSTER: Mounting gluster volume locally...<br>[2018-10-06 08:55:23.370100] I [resource(/bricks/brick-a1/brick):1515:connect] GLUSTER: Mounted gluster volume duration=1.1238<br>[2018-10-06 08:55:23.370507] I [gsyncd(/bricks/brick-a1/brick):799:main_i] <top>: Closing feedback fd, waking up the monitor<br>[2018-10-06 08:55:25.388721] I [master(/bricks/brick-a1/brick):1518:register] _GMaster: Working dir path=/var/lib/misc/glusterfsd/mastervol/ssh%3A%2F%2Fgeoaccount%4010.0.2.13%3Agluster%3A%2F%2F127.0.0.1%3Aslavevol/9517ac67e25c7491f03ba5e2506505bd<br>[2018-10-06 08:55:25.388978] I [resource(/bricks/brick-a1/brick):1662:service_loop] GLUSTER: Register time time=1538816125<br>[2018-10-06 08:55:25.405546] I [master(/bricks/brick-a1/brick):490:mgmt_lock] _GMaster: Got lock Becoming ACTIVE brick=/bricks/brick-a1/brick<br>[2018-10-06 08:55:25.408958] I [gsyncdstatus(/bricks/brick-a1/brick):276:set_active] GeorepStatus: Worker Status Change status=Active<br>[2018-10-06 08:55:25.410522] I [gsyncdstatus(/bricks/brick-a1/brick):248:set_worker_crawl_status] GeorepStatus: Crawl Status Change status=History Crawl<br>[2018-10-06 08:55:25.411005] I [master(/bricks/brick-a1/brick):1432:crawl] _GMaster: starting history crawl turns=1 stime=(1538745843, 0) entry_stime=None etime=1538816125<br>[2018-10-06 08:55:26.413892] I [master(/bricks/brick-a1/brick):1461:crawl] _GMaster: slave's time stime=(1538745843, 0)<br>[2018-10-06 08:55:26.933149] I [master(/bricks/brick-a1/brick):1863:syncjob] Syncer: Sync Time Taken 
duration=0.0549 num_files=1 job=3 return_code=3<br>[2018-10-06 08:55:26.933419] E [resource(/bricks/brick-a1/brick):210:errlog] Popen: command returned error cmd=rsync -aR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs --existing --xattrs --acls --ignore-missing-args . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-Oq_aPL/05b8d7b5dab75575689c0e1a2ec33b3f.sock --compress geoaccount@servere:/proc/12489/cwd error=3<br>[2018-10-06 08:55:26.953044] I [syncdutils(/bricks/brick-a1/brick):271:finalize] <top>: exiting.<br>[2018-10-06 08:55:26.956691] I [repce(/bricks/brick-a1/brick):92:service_loop] RepceServer: terminating on reaching EOF.<br>[2018-10-06 08:55:26.957233] I [syncdutils(/bricks/brick-a1/brick):271:finalize] <top>: exiting.<br>[2018-10-06 08:55:27.378103] I [monitor(monitor):363:monitor] Monitor: worker died in startup phase brick=/bricks/brick-a1/brick<br>[2018-10-06 08:55:27.382554] I [gsyncdstatus(monitor):243:set_worker_status] GeorepStatus: Worker Status Change status=Faulty<br>[root@servera ~]#<br></div><div><br></div><div><br></div><div><br>[root@servera ~]# gluster volume info mastervol<br><br>Volume Name: mastervol<br>Type: Replicate<br>Volume ID: b7ec0647-b101-4240-9abf-32f24f2decec<br>Status: Started<br>Snapshot Count: 0<br>Number of Bricks: 1 x 2 = 2<br>Transport-type: tcp<br>Bricks:<br>Brick1: servera:/bricks/brick-a1/brick<br>Brick2: serverb:/bricks/brick-b1/brick<br>Options Reconfigured:<br>performance.client-io-threads: off<br>nfs.disable: on<br>transport.address-family: inet<br>geo-replication.indexing: on<br>geo-replication.ignore-pid-check: on<br>changelog.changelog: on<br>cluster.enable-shared-storage: enable<br></div><div><br></div><div>[root@servere ~]# gluster volume info slavevol<br><br>Volume Name: slavevol<br>Type: Replicate<br>Volume ID: 8b431b4e-5dc4-4db6-9608-3b82cce5024c<br>Status: 
Started<br>Snapshot Count: 0<br>Number of Bricks: 1 x 2 = 2<br>Transport-type: tcp<br>Bricks:<br>Brick1: servere:/bricks/brick-e1/brick<br>Brick2: servere:/bricks/brick-e2/brick<br>Options Reconfigured:<br>features.read-only: off<br>performance.client-io-threads: off<br>nfs.disable: on<br>transport.address-family: inet<br>performance.quick-read: off<br></div><div><br></div><div>Do you have any idea how I can solve this?</div><div><br></div><div>Many thanks!<br></div></div></div></div></div></div>