I am trying to set up geo-replication between two Gluster volumes.

I have set up two distributed replica 2 arbiter 1 volumes with 9 bricks each:

[root@gfs1 ~]# gluster volume info

Volume Name: gfsvol
Type: Distributed-Replicate
Volume ID: c2fb4365-480b-4d37-8c7d-c3046bca7306
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x (2 + 1) = 9
Transport-type: tcp
Bricks:
Brick1: gfs2:/gfs/brick1/gv0
Brick2: gfs3:/gfs/brick1/gv0
Brick3: gfs1:/gfs/arbiter/gv0 (arbiter)
Brick4: gfs1:/gfs/brick1/gv0
Brick5: gfs3:/gfs/brick2/gv0
Brick6: gfs2:/gfs/arbiter/gv0 (arbiter)
Brick7: gfs1:/gfs/brick2/gv0
Brick8: gfs2:/gfs/brick2/gv0
Brick9: gfs3:/gfs/arbiter/gv0 (arbiter)
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
geo-replication.indexing: on
geo-replication.ignore-pid-check: on
changelog.changelog: on

[root@gfs4 ~]# gluster volume info

Volume Name: gfsvol_rep
Type: Distributed-Replicate
Volume ID: 42bfa062-ad0d-4242-a813-63389be1c404
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x (2 + 1) = 9
Transport-type: tcp
Bricks:
Brick1: gfs5:/gfs/brick1/gv0
Brick2: gfs6:/gfs/brick1/gv0
Brick3: gfs4:/gfs/arbiter/gv0 (arbiter)
Brick4: gfs4:/gfs/brick1/gv0
Brick5: gfs6:/gfs/brick2/gv0
Brick6: gfs5:/gfs/arbiter/gv0 (arbiter)
Brick7: gfs4:/gfs/brick2/gv0
Brick8: gfs5:/gfs/brick2/gv0
Brick9: gfs6:/gfs/arbiter/gv0 (arbiter)
Options Reconfigured:
nfs.disable: on
transport.address-family: inet

I set up passwordless ssh login from all the master servers to all the slave servers, then created and started the geo-replication session.
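For reference, the session was created roughly along these lines (commands approximate; the non-root geo-rep-user / mountbroker setup on the slave side is not shown here):

    # on one master node, generate and collect the common pem keys
    gluster system:: execute gsec_create

    # create the geo-replication session against the slave volume and push the keys
    gluster volume geo-replication gfsvol geo-rep-user@gfs4::gfsvol_rep create push-pem

    # start the session
    gluster volume geo-replication gfsvol geo-rep-user@gfs4::gfsvol_rep start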
When I check the status (gluster volume geo-replication gfsvol geo-rep-user@gfs4::gfsvol_rep status), the workers flip between Active with History Crawl and Faulty with N/A every few seconds:

MASTER NODE    MASTER VOL    MASTER BRICK        SLAVE USER      SLAVE                            SLAVE NODE    STATUS    CRAWL STATUS     LAST_SYNCED
-------------------------------------------------------------------------------------------------------------------------------------------------------------
gfs1           gfsvol        /gfs/arbiter/gv0    geo-rep-user    geo-rep-user@gfs4::gfsvol_rep    N/A           Faulty    N/A              N/A
gfs1           gfsvol        /gfs/brick1/gv0     geo-rep-user    geo-rep-user@gfs4::gfsvol_rep    gfs6          Active    History Crawl    2017-09-28 23:30:19
gfs1           gfsvol        /gfs/brick2/gv0     geo-rep-user    geo-rep-user@gfs4::gfsvol_rep    N/A           Faulty    N/A              N/A
gfs3           gfsvol        /gfs/brick1/gv0     geo-rep-user    geo-rep-user@gfs4::gfsvol_rep    N/A           Faulty    N/A              N/A
gfs3           gfsvol        /gfs/brick2/gv0     geo-rep-user    geo-rep-user@gfs4::gfsvol_rep    N/A           Faulty    N/A              N/A
gfs3           gfsvol        /gfs/arbiter/gv0    geo-rep-user    geo-rep-user@gfs4::gfsvol_rep    N/A           Faulty    N/A              N/A
gfs2           gfsvol        /gfs/brick1/gv0     geo-rep-user    geo-rep-user@gfs4::gfsvol_rep    N/A           Faulty    N/A              N/A
gfs2           gfsvol        /gfs/arbiter/gv0    geo-rep-user    geo-rep-user@gfs4::gfsvol_rep    N/A           Faulty    N/A              N/A
gfs2           gfsvol        /gfs/brick2/gv0     geo-rep-user    geo-rep-user@gfs4::gfsvol_rep    N/A           Faulty    N/A              N/A

Here is the tail of the geo-replication log file:

[root@gfs1 ~]# tail -n 100 $(gluster volume geo-replication gfsvol geo-rep-user@gfs4::gfsvol_rep config log-file)
[2017-09-29 15:53:29.785386] I [master(/gfs/brick2/gv0):1860:syncjob] Syncer: Sync Time Taken    duration=0.0357    num_files=1    job=3    return_code=12
[2017-09-29 15:53:29.785615] E [resource(/gfs/brick2/gv0):208:errlog] Popen: command returned error    cmd=rsync -aR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs --existing --xattrs --acls . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-fdyDHm/78cf8b204207154de59d7ac32eee737f.sock --compress geo-rep-user@gfs6:/proc/17554/cwd    error=12
[2017-09-29 15:53:29.797259] I [syncdutils(/gfs/brick2/gv0):271:finalize] <top>: exiting.
[2017-09-29 15:53:29.799386] I [repce(/gfs/brick2/gv0):92:service_loop] RepceServer: terminating on reaching EOF.
[2017-09-29 15:53:29.799570] I [syncdutils(/gfs/brick2/gv0):271:finalize] <top>: exiting.
[2017-09-29 15:53:30.105407] I [monitor(monitor):280:monitor] Monitor: starting gsyncd worker    brick=/gfs/brick1/gv0    slave_node=ssh://geo-rep-user@gfs6:gluster://localhost:gfsvol_rep
[2017-09-29 15:53:30.232007] I [resource(/gfs/brick1/gv0):1772:connect_remote] SSH: Initializing SSH connection between master and slave...
[2017-09-29 15:53:30.232738] I [changelogagent(/gfs/brick1/gv0):73:__init__] ChangelogAgent: Agent listining...
[2017-09-29 15:53:30.248094] I [monitor(monitor):363:monitor] Monitor: worker died in startup phase    brick=/gfs/brick2/gv0
[2017-09-29 15:53:30.252793] I [gsyncdstatus(monitor):242:set_worker_status] GeorepStatus: Worker Status Change    status=Faulty
[2017-09-29 15:53:30.742058] I [master(/gfs/arbiter/gv0):1515:register] _GMaster: Working dir    path=/var/lib/misc/glusterfsd/gfsvol/ssh%3A%2F%2Fgeo-rep-user%4010.1.1.104%3Agluster%3A%2F%2F127.0.0.1%3Agfsvol_rep/40efd54bad1d5828a1221dd560de376f
[2017-09-29 15:53:30.742360] I [resource(/gfs/arbiter/gv0):1654:service_loop] GLUSTER: Register time    time=1506700410
[2017-09-29 15:53:30.754738] I [gsyncdstatus(/gfs/arbiter/gv0):275:set_active] GeorepStatus: Worker Status Change    status=Active
[2017-09-29 15:53:30.756040] I [gsyncdstatus(/gfs/arbiter/gv0):247:set_worker_crawl_status] GeorepStatus: Crawl Status Change    status=History Crawl
[2017-09-29 15:53:30.756280] I [master(/gfs/arbiter/gv0):1429:crawl] _GMaster: starting history crawl    turns=1    stime=(1506637819, 0)    entry_stime=None    etime=1506700410
[2017-09-29 15:53:31.758335] I [master(/gfs/arbiter/gv0):1458:crawl] _GMaster: slave's time    stime=(1506637819, 0)
[2017-09-29 15:53:31.939471] I [resource(/gfs/brick1/gv0):1779:connect_remote] SSH: SSH connection between master and slave established.    duration=1.7073
[2017-09-29 15:53:31.939665] I [resource(/gfs/brick1/gv0):1494:connect] GLUSTER: Mounting gluster volume locally...
[2017-09-29 15:53:32.284754] I [master(/gfs/arbiter/gv0):1860:syncjob] Syncer: Sync Time Taken    duration=0.0372    num_files=1    job=3    return_code=12
[2017-09-29 15:53:32.284996] E [resource(/gfs/arbiter/gv0):208:errlog] Popen: command returned error    cmd=rsync -aR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs --existing --xattrs --acls . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-i_wIMu/5f1d38555e12d0018fb6ed1e6bd63023.sock --compress geo-rep-user@gfs5:/proc/8334/cwd    error=12
[2017-09-29 15:53:32.300786] I [syncdutils(/gfs/arbiter/gv0):271:finalize] <top>: exiting.
[2017-09-29 15:53:32.303261] I [repce(/gfs/arbiter/gv0):92:service_loop] RepceServer: terminating on reaching EOF.
[2017-09-29 15:53:32.303452] I [syncdutils(/gfs/arbiter/gv0):271:finalize] <top>: exiting.
[2017-09-29 15:53:32.732858] I [monitor(monitor):363:monitor] Monitor: worker died in startup phase    brick=/gfs/arbiter/gv0
[2017-09-29 15:53:32.736538] I [gsyncdstatus(monitor):242:set_worker_status] GeorepStatus: Worker Status Change    status=Faulty
[2017-09-29 15:53:33.35219] I [resource(/gfs/brick1/gv0):1507:connect] GLUSTER: Mounted gluster volume    duration=1.0954
[2017-09-29 15:53:33.35403] I [gsyncd(/gfs/brick1/gv0):799:main_i] <top>: Closing feedback fd, waking up the monitor
[2017-09-29 15:53:35.50920] I [master(/gfs/brick1/gv0):1515:register] _GMaster: Working dir    path=/var/lib/misc/glusterfsd/gfsvol/ssh%3A%2F%2Fgeo-rep-user%4010.1.1.104%3Agluster%3A%2F%2F127.0.0.1%3Agfsvol_rep/f0393acbf9a1583960edbbd2f1dfb6b4
[2017-09-29 15:53:35.51227] I [resource(/gfs/brick1/gv0):1654:service_loop] GLUSTER: Register time    time=1506700415
[2017-09-29 15:53:35.64343] I [gsyncdstatus(/gfs/brick1/gv0):275:set_active] GeorepStatus: Worker Status Change    status=Active
[2017-09-29 15:53:35.65696] I [gsyncdstatus(/gfs/brick1/gv0):247:set_worker_crawl_status] GeorepStatus: Crawl Status Change    status=History Crawl
[2017-09-29 15:53:35.65915] I [master(/gfs/brick1/gv0):1429:crawl] _GMaster: starting history crawl    turns=1    stime=(1506637819, 0)    entry_stime=None    etime=1506700415
[2017-09-29 15:53:36.68135] I [master(/gfs/brick1/gv0):1458:crawl] _GMaster: slave's time    stime=(1506637819, 0)
[2017-09-29 15:53:36.578717] I [master(/gfs/brick1/gv0):1860:syncjob] Syncer: Sync Time Taken    duration=0.0376    num_files=1    job=1    return_code=12
[2017-09-29 15:53:36.578946] E [resource(/gfs/brick1/gv0):208:errlog] Popen: command returned error    cmd=rsync -aR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs --existing --xattrs --acls . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-2pGnVA/78cf8b204207154de59d7ac32eee737f.sock --compress geo-rep-user@gfs6:/proc/17648/cwd    error=12
[2017-09-29 15:53:36.590887] I [syncdutils(/gfs/brick1/gv0):271:finalize] <top>: exiting.
[2017-09-29 15:53:36.596421] I [repce(/gfs/brick1/gv0):92:service_loop] RepceServer: terminating on reaching EOF.
[2017-09-29 15:53:36.596635] I [syncdutils(/gfs/brick1/gv0):271:finalize] <top>: exiting.
[2017-09-29 15:53:37.41075] I [monitor(monitor):363:monitor] Monitor: worker died in startup phase    brick=/gfs/brick1/gv0
[2017-09-29 15:53:37.44637] I [gsyncdstatus(monitor):242:set_worker_status] GeorepStatus: Worker Status Change    status=Faulty
[2017-09-29 15:53:40.351263] I [monitor(monitor):280:monitor] Monitor: starting gsyncd worker    brick=/gfs/brick2/gv0    slave_node=ssh://geo-rep-user@gfs6:gluster://localhost:gfsvol_rep
[2017-09-29 15:53:40.484637] I [resource(/gfs/brick2/gv0):1772:connect_remote] SSH: Initializing SSH connection between master and slave...
[2017-09-29 15:53:40.497215] I [changelogagent(/gfs/brick2/gv0):73:__init__] ChangelogAgent: Agent listining...
[2017-09-29 15:53:42.278539] I [resource(/gfs/brick2/gv0):1779:connect_remote] SSH: SSH connection between master and slave established.    duration=1.7936
[2017-09-29 15:53:42.278747] I [resource(/gfs/brick2/gv0):1494:connect] GLUSTER: Mounting gluster volume locally...
[2017-09-29 15:53:42.851296] I [monitor(monitor):280:monitor] Monitor: starting gsyncd worker    brick=/gfs/arbiter/gv0    slave_node=ssh://geo-rep-user@gfs5:gluster://localhost:gfsvol_rep
[2017-09-29 15:53:42.985567] I [resource(/gfs/arbiter/gv0):1772:connect_remote] SSH: Initializing SSH connection between master and slave...
[2017-09-29 15:53:42.986390] I [changelogagent(/gfs/arbiter/gv0):73:__init__] ChangelogAgent: Agent listining...
[2017-09-29 15:53:43.377480] I [resource(/gfs/brick2/gv0):1507:connect] GLUSTER: Mounted gluster volume    duration=1.0986
[2017-09-29 15:53:43.377681] I [gsyncd(/gfs/brick2/gv0):799:main_i] <top>: Closing feedback fd, waking up the monitor
[2017-09-29 15:53:44.767873] I [resource(/gfs/arbiter/gv0):1779:connect_remote] SSH: SSH connection between master and slave established.    duration=1.7821
[2017-09-29 15:53:44.768059] I [resource(/gfs/arbiter/gv0):1494:connect] GLUSTER: Mounting gluster volume locally...
[2017-09-29 15:53:45.393150] I [master(/gfs/brick2/gv0):1515:register] _GMaster: Working dir    path=/var/lib/misc/glusterfsd/gfsvol/ssh%3A%2F%2Fgeo-rep-user%4010.1.1.104%3Agluster%3A%2F%2F127.0.0.1%3Agfsvol_rep/1eb15856c627f181513bf23f8bf2f9d0
[2017-09-29 15:53:45.393373] I [resource(/gfs/brick2/gv0):1654:service_loop] GLUSTER: Register time    time=1506700425
[2017-09-29 15:53:45.404992] I [gsyncdstatus(/gfs/brick2/gv0):275:set_active] GeorepStatus: Worker Status Change    status=Active
[2017-09-29 15:53:45.406404] I [gsyncdstatus(/gfs/brick2/gv0):247:set_worker_crawl_status] GeorepStatus: Crawl Status Change    status=History Crawl
[2017-09-29 15:53:45.406660] I [master(/gfs/brick2/gv0):1429:crawl] _GMaster: starting history crawl    turns=1    stime=(1506637819, 0)    entry_stime=None    etime=1506700425
[2017-09-29 15:53:45.863256] I [resource(/gfs/arbiter/gv0):1507:connect] GLUSTER: Mounted gluster volume    duration=1.0950
[2017-09-29 15:53:45.863430] I [gsyncd(/gfs/arbiter/gv0):799:main_i] <top>: Closing feedback fd, waking up the monitor
[2017-09-29 15:53:46.408814] I [master(/gfs/brick2/gv0):1458:crawl] _GMaster: slave's time    stime=(1506637819, 0)
[2017-09-29 15:53:46.920937] I [master(/gfs/brick2/gv0):1860:syncjob] Syncer: Sync Time Taken    duration=0.0363    num_files=1    job=3    return_code=12
[2017-09-29 15:53:46.921140] E [resource(/gfs/brick2/gv0):208:errlog] Popen: command returned error    cmd=rsync -aR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs --existing --xattrs --acls . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-DCruqU/78cf8b204207154de59d7ac32eee737f.sock --compress geo-rep-user@gfs6:/proc/17747/cwd    error=12
[2017-09-29 15:53:46.937288] I [syncdutils(/gfs/brick2/gv0):271:finalize] <top>: exiting.
[2017-09-29 15:53:46.940479] I [repce(/gfs/brick2/gv0):92:service_loop] RepceServer: terminating on reaching EOF.
[2017-09-29 15:53:46.940772] I [syncdutils(/gfs/brick2/gv0):271:finalize] <top>: exiting.
[2017-09-29 15:53:47.151477] I [monitor(monitor):280:monitor] Monitor: starting gsyncd worker    brick=/gfs/brick1/gv0    slave_node=ssh://geo-rep-user@gfs6:gluster://localhost:gfsvol_rep
[2017-09-29 15:53:47.303791] I [resource(/gfs/brick1/gv0):1772:connect_remote] SSH: Initializing SSH connection between master and slave...
[2017-09-29 15:53:47.316878] I [changelogagent(/gfs/brick1/gv0):73:__init__] ChangelogAgent: Agent listining...
[2017-09-29 15:53:47.382605] I [monitor(monitor):363:monitor] Monitor: worker died in startup phase    brick=/gfs/brick2/gv0
[2017-09-29 15:53:47.387926] I [gsyncdstatus(monitor):242:set_worker_status] GeorepStatus: Worker Status Change    status=Faulty
[2017-09-29 15:53:47.876825] I [master(/gfs/arbiter/gv0):1515:register] _GMaster: Working dir    path=/var/lib/misc/glusterfsd/gfsvol/ssh%3A%2F%2Fgeo-rep-user%4010.1.1.104%3Agluster%3A%2F%2F127.0.0.1%3Agfsvol_rep/40efd54bad1d5828a1221dd560de376f
[2017-09-29 15:53:47.877044] I [resource(/gfs/arbiter/gv0):1654:service_loop] GLUSTER: Register time    time=1506700427
[2017-09-29 15:53:47.888930] I [gsyncdstatus(/gfs/arbiter/gv0):275:set_active] GeorepStatus: Worker Status Change    status=Active
[2017-09-29 15:53:47.890043] I [gsyncdstatus(/gfs/arbiter/gv0):247:set_worker_crawl_status] GeorepStatus: Crawl Status Change    status=History Crawl
[2017-09-29 15:53:47.890285] I [master(/gfs/arbiter/gv0):1429:crawl] _GMaster: starting history crawl    turns=1    stime=(1506637819, 0)    entry_stime=None    etime=1506700427
[2017-09-29 15:53:48.891966] I [master(/gfs/arbiter/gv0):1458:crawl] _GMaster: slave's time    stime=(1506637819, 0)
[2017-09-29 15:53:48.998140] I [resource(/gfs/brick1/gv0):1779:connect_remote] SSH: SSH connection between master and slave established.    duration=1.6942
[2017-09-29 15:53:48.998330] I [resource(/gfs/brick1/gv0):1494:connect] GLUSTER: Mounting gluster volume locally...
[2017-09-29 15:53:49.406749] I [master(/gfs/arbiter/gv0):1860:syncjob] Syncer: Sync Time Taken    duration=0.0383    num_files=1    job=2    return_code=12
[2017-09-29 15:53:49.406999] E [resource(/gfs/arbiter/gv0):208:errlog] Popen: command returned error    cmd=rsync -aR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs --existing --xattrs --acls . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-5VeNKp/5f1d38555e12d0018fb6ed1e6bd63023.sock --compress geo-rep-user@gfs5:/proc/8448/cwd    error=12
[2017-09-29 15:53:49.426301] I [syncdutils(/gfs/arbiter/gv0):271:finalize] <top>: exiting.
[2017-09-29 15:53:49.428428] I [repce(/gfs/arbiter/gv0):92:service_loop] RepceServer: terminating on reaching EOF.
[2017-09-29 15:53:49.428618] I [syncdutils(/gfs/arbiter/gv0):271:finalize] <top>: exiting.
[2017-09-29 15:53:49.868974] I [monitor(monitor):363:monitor] Monitor: worker died in startup phase    brick=/gfs/arbiter/gv0
[2017-09-29 15:53:49.872705] I [gsyncdstatus(monitor):242:set_worker_status] GeorepStatus: Worker Status Change    status=Faulty
[2017-09-29 15:53:50.78377] I [resource(/gfs/brick1/gv0):1507:connect] GLUSTER: Mounted gluster volume    duration=1.0799
[2017-09-29 15:53:50.78643] I [gsyncd(/gfs/brick1/gv0):799:main_i] <top>: Closing feedback fd, waking up the monitor
[2017-09-29 15:53:52.93027] I [master(/gfs/brick1/gv0):1515:register] _GMaster: Working dir    path=/var/lib/misc/glusterfsd/gfsvol/ssh%3A%2F%2Fgeo-rep-user%4010.1.1.104%3Agluster%3A%2F%2F127.0.0.1%3Agfsvol_rep/f0393acbf9a1583960edbbd2f1dfb6b4
[2017-09-29 15:53:52.93331] I [resource(/gfs/brick1/gv0):1654:service_loop] GLUSTER: Register time    time=1506700432
[2017-09-29 15:53:52.107558] I [gsyncdstatus(/gfs/brick1/gv0):275:set_active] GeorepStatus: Worker Status Change    status=Active
[2017-09-29 15:53:52.108943] I [gsyncdstatus(/gfs/brick1/gv0):247:set_worker_crawl_status] GeorepStatus: Crawl Status Change    status=History Crawl
[2017-09-29 15:53:52.109178] I [master(/gfs/brick1/gv0):1429:crawl] _GMaster: starting history crawl    turns=1    stime=(1506637819, 0)    entry_stime=None    etime=1506700432
[2017-09-29 15:53:53.111017] I [master(/gfs/brick1/gv0):1458:crawl] _GMaster: slave's time    stime=(1506637819, 0)
[2017-09-29 15:53:53.622422] I [master(/gfs/brick1/gv0):1860:syncjob] Syncer: Sync Time Taken    duration=0.0369    num_files=1    job=2    return_code=12
[2017-09-29 15:53:53.622683] E [resource(/gfs/brick1/gv0):208:errlog] Popen: command returned error    cmd=rsync -aR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs --existing --xattrs --acls . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-DBB9pL/78cf8b204207154de59d7ac32eee737f.sock --compress geo-rep-user@gfs6:/proc/17837/cwd    error=12
[2017-09-29 15:53:53.635057] I [syncdutils(/gfs/brick1/gv0):271:finalize] <top>: exiting.
[2017-09-29 15:53:53.639909] I [repce(/gfs/brick1/gv0):92:service_loop] RepceServer: terminating on reaching EOF.
[2017-09-29 15:53:53.640172] I [syncdutils(/gfs/brick1/gv0):271:finalize] <top>: exiting.
[2017-09-29 15:53:54.85591] I [monitor(monitor):363:monitor] Monitor: worker died in startup phase    brick=/gfs/brick1/gv0
[2017-09-29 15:53:54.89509] I [gsyncdstatus(monitor):242:set_worker_status] GeorepStatus: Worker Status Change    status=Faulty

I think the error has to do with this part:

rsync -aR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs --existing --xattrs --acls . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-DBB9pL/78cf8b204207154de59d7ac32eee737f.sock --compress geo-rep-user@gfs6:/proc/17837/cwd

which exits with error=12 (which rsync documents as "error in rsync protocol data stream"), and especially the ssh part, since I notice a lot of failed login attempts while geo-replication is running. A manual sanity check of that ssh/rsync path is sketched at the end of this mail.

Please, can anybody advise what to do in this situation?
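In case it helps, the ssh/rsync path from the log can be sanity-checked manually with something like the following (host, user and key path taken from the log above; this is only a connectivity/rsync check, not the exact command gsyncd runs):

    # does the geo-rep key log in non-interactively, and is rsync present on the slave?
    ssh -i /var/lib/glusterd/geo-replication/secret.pem \
        -oPasswordAuthentication=no -oStrictHostKeyChecking=no \
        geo-rep-user@gfs6 rsync --version

    # does a trivial rsync over the same key succeed?
    echo test > /tmp/georep-test.txt
    rsync -av -e "ssh -i /var/lib/glusterd/geo-replication/secret.pem" \
        /tmp/georep-test.txt geo-rep-user@gfs6:/tmp/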