[Gluster-users] geo-replication status faulty

CJ Beck chris.beck at workday.com
Tue Apr 29 17:56:56 UTC 2014


I've also seen this behavior, and I fixed it the same way.

The strange part is that when testing with Vagrant I didn't see the same behavior; it only happens on bare-metal boxes in my case. I'm not sure why, since I'm using the same CentOS version, etc.

-CJ

From: Steve Dainard <sdainard at miovision.com>
Date: Tuesday, April 29, 2014 at 10:42 AM
To: "gluster-users at gluster.org List" <gluster-users at gluster.org>
Subject: Re: [Gluster-users] geo-replication status faulty

Fixed by editing the geo-rep volume's gsyncd.conf file, changing /nonexistent/gsyncd to /usr/libexec/glusterfs/gsyncd on both master nodes.
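For anyone hitting the same thing, I believe the equivalent change can also be made through the geo-rep config interface instead of hand-editing the file (a sketch, untested on my side; the option name remote-gsyncd is my assumption based on the entry in gsyncd.conf, and running "config" with the option name but no value should print the current setting):

# gluster volume geo-replication rep1 10.0.11.4::rep1 config remote-gsyncd /usr/libexec/glusterfs/gsyncd
# gluster volume geo-replication rep1 10.0.11.4::rep1 config remote-gsyncd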

Any reason why this is in the default template? Also, any reason why, when I stop glusterd, change the template on both master nodes, and start the gluster service again, it's overwritten?

Steve


On Tue, Apr 29, 2014 at 12:11 PM, Steve Dainard <sdainard at miovision.com> wrote:
Just set up geo-replication between two replica 2 pairs, Gluster version 3.5.0.2.

Following this guide: https://access.redhat.com/site/documentation/en-US/Red_Hat_Storage/2.1/html/Administration_Guide/chap-User_Guide-Geo_Rep-Preparation-Settingup_Environment.html
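The steps I ran from that guide were essentially the following (from memory, so treat this as a sketch; rep1 and 10.0.11.4 are the volume and slave host shown in the status output below):

# gluster system:: execute gsec_create
# gluster volume geo-replication rep1 10.0.11.4::rep1 create push-pem
# gluster volume geo-replication rep1 10.0.11.4::rep1 start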

Status is faulty/passive:

# gluster volume geo-replication rep1 10.0.11.4::rep1 status

MASTER NODE                MASTER VOL    MASTER BRICK                           SLAVE              STATUS     CHECKPOINT STATUS    CRAWL STATUS
------------------------------------------------------------------------------------------------------------------------------------------------
ovirt001.miovision.corp    rep1          /mnt/storage/lv-storage-domain/rep1    10.0.11.4::rep1    faulty     N/A                  N/A
ovirt002.miovision.corp    rep1          /mnt/storage/lv-storage-domain/rep1    10.0.11.5::rep1    Passive    N/A                  N/A


geo-replication log from master:

[2014-04-29 12:00:07.178314] I [monitor(monitor):129:monitor] Monitor: ------------------------------------------------------------
[2014-04-29 12:00:07.178550] I [monitor(monitor):130:monitor] Monitor: starting gsyncd worker
[2014-04-29 12:00:07.344643] I [gsyncd(/mnt/storage/lv-storage-domain/rep1):532:main_i] <top>: syncing: gluster://localhost:rep1 -> ssh://root@10.0.11.4:gluster://localhost:rep1
[2014-04-29 12:00:07.357718] D [repce(/mnt/storage/lv-storage-domain/rep1):175:push] RepceClient: call 21880:139789410989824:1398787207.36 __repce_version__() ...
[2014-04-29 12:00:07.631556] E [syncdutils(/mnt/storage/lv-storage-domain/rep1):223:log_raise_exception] <top>: connection to peer is broken
[2014-04-29 12:00:07.631808] W [syncdutils(/mnt/storage/lv-storage-domain/rep1):227:log_raise_exception] <top>: !!!!!!!!!!!!!
[2014-04-29 12:00:07.631947] W [syncdutils(/mnt/storage/lv-storage-domain/rep1):228:log_raise_exception] <top>: !!! getting "No such file or directory" errors is most likely due to MISCONFIGURATION, please consult https://access.redhat.com/site/documentation/en-US/Red_Hat_Storage/2.1/html/Administration_Guide/chap-User_Guide-Geo_Rep-Preparation-Settingup_Environment.html
[2014-04-29 12:00:07.632061] W [syncdutils(/mnt/storage/lv-storage-domain/rep1):231:log_raise_exception] <top>: !!!!!!!!!!!!!
[2014-04-29 12:00:07.632251] E [resource(/mnt/storage/lv-storage-domain/rep1):204:errlog] Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-0_wqaI/cb6bb9e3af32ccbb7c8c0ae955f728db.sock root at 10.0.11.4 /nonexistent/gsyncd --session-owner c0a6c74c-deb5-4ed0-9ef9-23756d593197 -N --listen --timeout 120 gluster://localhost:rep1" returned with 127, saying:
[2014-04-29 12:00:07.632396] E [resource(/mnt/storage/lv-storage-domain/rep1):207:logerr] Popen: ssh> bash: /nonexistent/gsyncd: No such file or directory
[2014-04-29 12:00:07.632689] I [syncdutils(/mnt/storage/lv-storage-domain/rep1):192:finalize] <top>: exiting.
[2014-04-29 12:00:07.634656] I [monitor(monitor):150:monitor] Monitor: worker(/mnt/storage/lv-storage-domain/rep1) died before establishing connection
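If it helps narrow things down, the failing ssh call from the log can be reproduced by hand to check where gsyncd actually lives on the slave (a sketch; /usr/libexec/glusterfs/gsyncd is where I'd expect the RPM install to put it):

# ssh -i /var/lib/glusterd/geo-replication/secret.pem root@10.0.11.4 ls -l /usr/libexec/glusterfs/gsyncd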

Thanks,
Steve

