[Gluster-users] Re: Re: Re: Re: Re: geo-replication status partial faulty
vyyy杨雨阳
yuyangyang at Ctrip.com
Tue May 24 09:49:55 UTC 2016
We can establish passwordless ssh directly with the 'ssh' command, but 'create push-pem' reports 'Passwordless ssh login has not been setup' unless we copy secret.pem to *id_rsa.pub:
[root@SVR8048HW2285 ~]# ssh -i /var/lib/glusterd/geo-replication/secret.pem root@glusterfs01.sh3.ctripcorp.com
Last login: Tue May 24 17:23:53 2016 from 10.8.230.213
This is a private network server, in monitoring state.
It is strictly prohibited to unauthorized access and used.
[root@SVR6519HW2285 ~]#
[root@SVR8048HW2285 filews]# gluster volume geo-replication filews glusterfs01.sh3.ctripcorp.com::filews_slave create push-pem force
Passwordless ssh login has not been setup with glusterfs01.sh3.ctripcorp.com for user root.
geo-replication command failed
[root@SVR8048HW2285 filews]#
Best Regards
杨雨阳 Yuyang Yang
-----Original Message-----
From: Kotresh Hiremath Ravishankar [mailto:khiremat at redhat.com]
Sent: Tuesday, May 24, 2016 3:22 PM
To: vyyy杨雨阳 <yuyangyang at Ctrip.com>
Cc: Saravanakumar Arumugam <sarumuga at redhat.com>; Gluster-users at gluster.org; Aravinda Vishwanathapura Krishna Murthy <avishwan at redhat.com>
Subject: Re: Re: Re: Re: [Gluster-users] Re: geo-replication status partial faulty
Hi,
Could you try the following command from the corresponding masters to the faulty slave nodes and share the output?
The command below should not ask for a password and should run gsyncd.
ssh -i /var/lib/glusterd/geo-replication/secret.pem root@<faulty hosts>
To establish passwordless ssh, it is not necessary to copy secret.pem to *id_rsa.pub.
If the geo-rep session is already established, passwordless ssh would already be there.
My suspicion is that when I asked you to do 'create force', you did it against another slave where passwordless ssh was not set up. That would create another session directory in '/var/lib/glusterd/geo-replication', i.e. <master_vol>_<slave_host>_<slave_vol>.
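For example, one quick check (using the default path above; the exact directory names on your setup may differ) is to list the session directories on a faulty master node:

ls -d /var/lib/glusterd/geo-replication/*_*_*
# Expected: only the intended session directory, e.g.
#   /var/lib/glusterd/geo-replication/filews_glusterfs01.sh3.ctripcorp.com_filews_slave
# Any extra <master_vol>_<slave_host>_<slave_vol> directory points to a stray session.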
Please check and let us know.
Thanks and Regards,
Kotresh H R
----- Original Message -----
> From: "vyyy杨雨阳" <yuyangyang at ctrip.com>
> To: "Kotresh Hiremath Ravishankar" <khiremat at redhat.com>
> Cc: "Saravanakumar Arumugam" <sarumuga at redhat.com>,
> Gluster-users at gluster.org, "Aravinda Vishwanathapura Krishna Murthy"
> <avishwan at redhat.com>
> Sent: Friday, May 20, 2016 12:35:58 PM
> Subject: Re: Re: Re: [Gluster-users] Re: geo-replication status partial faulty
>
> Hello, Kotresh
>
> I did 'create force', but still some nodes work and some nodes are faulty.
>
> On the faulty nodes, etc-glusterfs-glusterd.vol.log shows:
> [2016-05-20 06:27:03.260870] I
> [glusterd-geo-rep.c:3516:glusterd_read_status_file] 0-: Using passed
> config template(/var/lib/glusterd/geo-replication/filews_glusterfs01.sh3.ctripcorp.com_filews_slave/gsyncd.conf).
> [2016-05-20 06:27:03.404544] E
> [glusterd-geo-rep.c:3200:glusterd_gsync_read_frm_status] 0-: Unable to
> read gsyncd status file
> [2016-05-20 06:27:03.404583] E
> [glusterd-geo-rep.c:3603:glusterd_read_status_file] 0-: Unable to read
> the statusfile for /export/sdb/brick1 brick for filews(master),
> glusterfs01.sh3.ctripcorp.com::filews_slave(slave) session
>
>
> /var/log/glusterfs/geo-replication/filews/ssh%3A%2F%2Froot%4010.15.65.66%3Agluster%3A%2F%2F127.0.0.1%3Afilews_slave.log shows:
> [2016-05-20 15:04:01.858340] I [monitor(monitor):215:monitor] Monitor:
> ------------------------------------------------------------
> [2016-05-20 15:04:01.858688] I [monitor(monitor):216:monitor] Monitor:
> starting gsyncd worker
> [2016-05-20 15:04:01.986754] D [gsyncd(agent):627:main_i] <top>: rpc_fd:
> '7,11,10,9'
> [2016-05-20 15:04:01.987505] I [changelogagent(agent):72:__init__]
> ChangelogAgent: Agent listining...
> [2016-05-20 15:04:01.988079] I [repce(agent):92:service_loop] RepceServer:
> terminating on reaching EOF.
> [2016-05-20 15:04:01.988238] I [syncdutils(agent):214:finalize] <top>:
> exiting.
> [2016-05-20 15:04:01.988250] I [monitor(monitor):267:monitor] Monitor:
> worker(/export/sdb/brick1) died before establishing connection
>
> Can you help me!
>
>
> Best Regards
> 杨雨阳 Yuyang Yang
>
>
>
> -----Original Message-----
> From: vyyy杨雨阳
> Sent: Thursday, May 19, 2016 7:45 PM
> To: 'Kotresh Hiremath Ravishankar' <khiremat at redhat.com>
> Cc: Saravanakumar Arumugam <sarumuga at redhat.com>; Gluster-users at gluster.org; Aravinda Vishwanathapura Krishna Murthy <avishwan at redhat.com>
> Subject: Re: Re: Re: [Gluster-users] Re: geo-replication status partial faulty
>
> It still does not work.
>
> I need to copy /var/lib/glusterd/geo-replication/secret.* to /root/.ssh/id_rsa to make passwordless ssh work.
>
> I generated the /var/lib/glusterd/geo-replication/secret.pem file on every master node.
>
> I am not sure whether this is right.
>
>
> [root@sh02svr5956 ~]# gluster volume geo-replication filews
> glusterfs01.sh3.ctripcorp.com::filews_slave create push-pem force
> Passwordless ssh login has not been setup with
> glusterfs01.sh3.ctripcorp.com for user root.
> geo-replication command failed
>
> [root@sh02svr5956 .ssh]# cp
> /var/lib/glusterd/geo-replication/secret.pem
> ./id_rsa
> cp: overwrite `./id_rsa'? y
> [root@sh02svr5956 .ssh]# cp
> /var/lib/glusterd/geo-replication/secret.pem.pub
> ./id_rsa.pub
> cp: overwrite `./id_rsa.pub'?
>
> [root@sh02svr5956 ~]# gluster volume geo-replication filews
> glusterfs01.sh3.ctripcorp.com::filews_slave create push-pem force
> Creating geo-replication session between filews &
> glusterfs01.sh3.ctripcorp.com::filews_slave has been successful
> [root@sh02svr5956 ~]#
>
>
>
>
> Best Regards
> 杨雨阳 Yuyang Yang
> OPS
> Ctrip Infrastructure Service (CIS)
> Ctrip Computer Technology (Shanghai) Co., Ltd
> Phone: + 86 21 34064880-15554 | Fax: + 86 21 52514588-13389
> Web: www.Ctrip.com
>
>
> -----Original Message-----
> From: Kotresh Hiremath Ravishankar [mailto:khiremat at redhat.com]
> Sent: Thursday, May 19, 2016 5:07 PM
> To: vyyy杨雨阳 <yuyangyang at Ctrip.com>
> Cc: Saravanakumar Arumugam <sarumuga at redhat.com>; Gluster-users at gluster.org; Aravinda Vishwanathapura Krishna Murthy <avishwan at redhat.com>
> Subject: Re: Re: Re: [Gluster-users] Re: geo-replication status partial faulty
>
> Hi,
>
> Could you just try 'create force' once to fix those status file errors?
>
> e.g., 'gluster volume geo-rep <master vol> <slave host>::<slave vol> create push-pem force'
>
> Thanks and Regards,
> Kotresh H R
>
> ----- Original Message -----
> > From: "vyyy杨雨阳" <yuyangyang at ctrip.com>
> > To: "Saravanakumar Arumugam" <sarumuga at redhat.com>,
> > Gluster-users at gluster.org, "Aravinda Vishwanathapura Krishna Murthy"
> > <avishwan at redhat.com>, "Kotresh Hiremath Ravishankar"
> > <khiremat at redhat.com>
> > Sent: Thursday, May 19, 2016 2:15:34 PM
> > Subject: Re: Re: [Gluster-users] Re: geo-replication status partial faulty
> >
> > I have checked all the nodes, both masters and slaves; the software is the same.
> >
> > I am puzzled why half of the masters work and half are faulty.
> >
> >
> > [admin@SVR6996HW2285 ~]$ rpm -qa |grep gluster
> > glusterfs-api-3.6.3-1.el6.x86_64
> > glusterfs-fuse-3.6.3-1.el6.x86_64
> > glusterfs-geo-replication-3.6.3-1.el6.x86_64
> > glusterfs-3.6.3-1.el6.x86_64
> > glusterfs-cli-3.6.3-1.el6.x86_64
> > glusterfs-server-3.6.3-1.el6.x86_64
> > glusterfs-libs-3.6.3-1.el6.x86_64
> >
> >
> >
> >
> > Best Regards
> > 杨雨阳 Yuyang Yang
> >
> > OPS
> > Ctrip Infrastructure Service (CIS)
> > Ctrip Computer Technology (Shanghai) Co., Ltd
> > Phone: + 86 21 34064880-15554 | Fax: + 86 21 52514588-13389
> > Web: www.Ctrip.com
> >
> >
> >
> > From: Saravanakumar Arumugam [mailto:sarumuga at redhat.com]
> > Sent: Thursday, May 19, 2016 4:33 PM
> > To: vyyy杨雨阳 <yuyangyang at Ctrip.com>; Gluster-users at gluster.org; Aravinda Vishwanathapura Krishna Murthy <avishwan at redhat.com>; Kotresh Hiremath Ravishankar <khiremat at redhat.com>
> > Subject: Re: Re: [Gluster-users] Re: geo-replication status partial faulty
> >
> > Hi,
> > +geo-rep team.
> >
> > Can you get the gluster version you are using?
> >
> > # For example:
> > rpm -qa | grep gluster
> >
> > I hope you have the same gluster version installed everywhere.
> > Please double-check and share the output.
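> > For example, one way to collect this from all nodes in one go (the three hostnames below are just examples from this thread; substitute your full list of master and slave nodes, and this assumes you can ssh to each of them as the current user):
> >
> > for h in SVR8048HW2285 SH02SVR5953 glusterfs01.sh3.ctripcorp.com; do
> >     echo "== $h =="
> >     ssh "$h" 'rpm -qa | grep gluster'
> > done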
> >
> > Thanks,
> > Saravana
> > On 05/19/2016 01:37 PM, vyyy杨雨阳 wrote:
> > Hi, Saravana
> >
> > I have changed the log level to DEBUG, then started geo-replication with the log-file option; the log file is attached.
> >
> > gluster volume geo-replication filews
> > glusterfs01.sh3.ctripcorp.com::filews_slave start --log-file=geo.log
> >
> > I have checked /root/.ssh/authorized_keys on glusterfs01.sh3.ctripcorp.com; it has the entries from /var/lib/glusterd/geo-replication/common_secret.pem.pub, and I have removed the lines not starting with “command=”.
> >
> > With ssh -i /var/lib/glusterd/geo-replication/secret.pem root@glusterfs01.sh3.ctripcorp.com I can see gsyncd messages and no ssh error.
> >
> >
> > Attached is etc-glusterfs-glusterd.vol.log from a faulty node; it shows:
> >
> > [2016-05-19 06:39:23.405974] I
> > [glusterd-geo-rep.c:3516:glusterd_read_status_file] 0-: Using passed
> > config
> > template(/var/lib/glusterd/geo-replication/filews_glusterfs01.sh3.ctripcorp.com_filews_slave/gsyncd.conf).
> > [2016-05-19 06:39:23.541169] E
> > [glusterd-geo-rep.c:3200:glusterd_gsync_read_frm_status] 0-: Unable
> > to read gsyncd status file
> > [2016-05-19 06:39:23.541210] E
> > [glusterd-geo-rep.c:3603:glusterd_read_status_file] 0-: Unable to
> > read the statusfile for /export/sdb/filews brick for
> > filews(master),
> > glusterfs01.sh3.ctripcorp.com::filews_slave(slave) session
> > [2016-05-19 06:39:29.472047] I
> > [glusterd-geo-rep.c:1835:glusterd_get_statefile_name] 0-: Using
> > passed config
> > template(/var/lib/glusterd/geo-replication/filews_glusterfs01.sh3.ctripcorp.com_filews_slave/gsyncd.conf).
> > [2016-05-19 06:39:34.939709] I
> > [glusterd-geo-rep.c:3516:glusterd_read_status_file] 0-: Using passed
> > config
> > template(/var/lib/glusterd/geo-replication/filews_glusterfs01.sh3.ctripcorp.com_filews_slave/gsyncd.conf).
> > [2016-05-19 06:39:35.058520] E
> > [glusterd-geo-rep.c:3200:glusterd_gsync_read_frm_status] 0-: Unable
> > to read gsyncd status file
> >
> >
> > /var/log/glusterfs/geo-replication/filews/ssh%3A%2F%2Froot%4010.15.65.66%3Agluster%3A%2F%2F127.0.0.1%3Afilews_slave.log shows the following:
> >
> > [2016-05-19 15:11:37.307755] I [monitor(monitor):215:monitor] Monitor:
> > ------------------------------------------------------------
> > [2016-05-19 15:11:37.308059] I [monitor(monitor):216:monitor] Monitor:
> > starting gsyncd worker
> > [2016-05-19 15:11:37.423320] D [gsyncd(agent):627:main_i] <top>: rpc_fd:
> > '7,11,10,9'
> > [2016-05-19 15:11:37.423882] I [changelogagent(agent):72:__init__]
> > ChangelogAgent: Agent listining...
> > [2016-05-19 15:11:37.423906] I [monitor(monitor):267:monitor] Monitor:
> > worker(/export/sdb/filews) died before establishing connection
> > [2016-05-19 15:11:37.424151] I [repce(agent):92:service_loop] RepceServer:
> > terminating on reaching EOF.
> > [2016-05-19 15:11:37.424335] I [syncdutils(agent):214:finalize] <top>:
> > exiting.
> >
> >
> >
> >
> >
> >
> > Best Regards
> > Yuyang Yang
> >
> >
> >
> >
> >
> > From: Saravanakumar Arumugam [mailto:sarumuga at redhat.com]
> > Sent: Thursday, May 19, 2016 1:59 PM
> > To: vyyy杨雨阳 <yuyangyang at Ctrip.com>; Gluster-users at gluster.org
> > Subject: Re: [Gluster-users] Re: geo-replication status partial faulty
> >
> > Hi,
> >
> > There seems to be some issue on the glusterfs01.sh3.ctripcorp.com slave node.
> > Can you share the complete logs?
> >
> > You can increase verbosity of debug messages like this:
> > gluster volume geo-replication <master volume> <slave host>::<slave
> > volume> config log-level DEBUG
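> > Once DEBUG is enabled, the per-brick worker logs on each master node land under /var/log/glusterfs/geo-replication/<master volume>/, so something like the following (using the filews volume name from this setup) should show the failure in more detail:
> >
> > tail -f /var/log/glusterfs/geo-replication/filews/*.log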
> >
> >
> > Also, check /root/.ssh/authorized_keys on glusterfs01.sh3.ctripcorp.com. It should have the entries from /var/lib/glusterd/geo-replication/common_secret.pem.pub (present on the master node).
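> > A rough way to cross-check this (paths as above; note that grepping for 'command=' is only an approximation, since any unrelated forced-command keys on the slave would be counted too):
> >
> > # On a master node: the number of keys push-pem should distribute
> > wc -l /var/lib/glusterd/geo-replication/common_secret.pem.pub
> > # On the slave glusterfs01.sh3.ctripcorp.com: the geo-rep entries that actually arrived
> > grep -c 'command=' /root/.ssh/authorized_keys
> >
> > If the slave count is lower, keys from some master nodes never made it into authorized_keys.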
> >
> > Have a look at this one for example:
> > https://www.gluster.org/pipermail/gluster-users/2015-August/023174.html
> >
> > Thanks,
> > Saravana
> > On 05/19/2016 07:53 AM, vyyy杨雨阳 wrote:
> > Hello,
> >
> > I have tried to configure a geo-replication volume; all the master nodes' configuration is the same. When I start this volume, the status shows some nodes as faulty, as follows:
> >
> > gluster volume geo-replication filews
> > glusterfs01.sh3.ctripcorp.com::filews_slave status
> >
> > MASTER NODE      MASTER VOL    MASTER BRICK          SLAVE                                          STATUS     CHECKPOINT STATUS    CRAWL STATUS
> > ------------------------------------------------------------------------------------------------------------------------------------------------
> > SVR8048HW2285    filews        /export/sdb/filews    glusterfs01.sh3.ctripcorp.com::filews_slave    faulty     N/A                  N/A
> > SVR8050HW2285    filews        /export/sdb/filews    glusterfs03.sh3.ctripcorp.com::filews_slave    Passive    N/A                  N/A
> > SVR8047HW2285    filews        /export/sdb/filews    glusterfs01.sh3.ctripcorp.com::filews_slave    Active     N/A                  Hybrid Crawl
> > SVR8049HW2285    filews        /export/sdb/filews    glusterfs05.sh3.ctripcorp.com::filews_slave    Active     N/A                  Hybrid Crawl
> > SH02SVR5951      filews        /export/sdb/brick1    glusterfs06.sh3.ctripcorp.com::filews_slave    Passive    N/A                  N/A
> > SH02SVR5953      filews        /export/sdb/brick1    glusterfs01.sh3.ctripcorp.com::filews_slave    faulty     N/A                  N/A
> > SVR6995HW2285    filews        /export/sdb/filews    glusterfs01.sh3.ctripcorp.com::filews_slave    faulty     N/A                  N/A
> > SH02SVR5954      filews        /export/sdb/brick1    glusterfs01.sh3.ctripcorp.com::filews_slave    faulty     N/A                  N/A
> > SVR6994HW2285    filews        /export/sdb/filews    glusterfs02.sh3.ctripcorp.com::filews_slave    Passive    N/A                  N/A
> > SVR6993HW2285    filews        /export/sdb/filews    glusterfs01.sh3.ctripcorp.com::filews_slave    faulty     N/A                  N/A
> > SH02SVR5952      filews        /export/sdb/brick1    glusterfs01.sh3.ctripcorp.com::filews_slave    faulty     N/A                  N/A
> > SVR6996HW2285    filews        /export/sdb/filews    glusterfs04.sh3.ctripcorp.com::filews_slave    Passive    N/A                  N/A
> >
> > On the faulty node, the log file under /var/log/glusterfs/geo-replication/filews shows "worker(/export/sdb/filews) died before establishing connection":
> >
> > [2016-05-18 16:55:46.402622] I [monitor(monitor):215:monitor] Monitor:
> > ------------------------------------------------------------
> > [2016-05-18 16:55:46.402930] I [monitor(monitor):216:monitor] Monitor:
> > starting gsyncd worker
> > [2016-05-18 16:55:46.517460] I [changelogagent(agent):72:__init__]
> > ChangelogAgent: Agent listining...
> > [2016-05-18 16:55:46.518066] I [repce(agent):92:service_loop] RepceServer:
> > terminating on reaching EOF.
> > [2016-05-18 16:55:46.518279] I [syncdutils(agent):214:finalize] <top>:
> > exiting.
> > [2016-05-18 16:55:46.518194] I [monitor(monitor):267:monitor] Monitor:
> > worker(/export/sdb/filews) died before establishing connection
> > [2016-05-18 16:55:56.697036] I [monitor(monitor):215:monitor] Monitor:
> > ------------------------------------------------------------
> >
> > Any advice and suggestions will be greatly appreciated.
> >
> >
> >
> >
> >
> > Best Regards
> > Yuyang Yang
> >
> >
> >
> >
> >
> >
> >
> > _______________________________________________
> >
> > Gluster-users mailing list
> >
> > Gluster-users at gluster.org
> >
> > http://www.gluster.org/mailman/listinfo/gluster-users
> >
> >
> >
>