[Bugs] [Bug 1284735] New: [BACKUP]: If more than 1 node in cluster are not added in known_host, glusterfind create command hungs

bugzilla at redhat.com
Tue Nov 24 03:55:22 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1284735

            Bug ID: 1284735
           Summary: [BACKUP]: If more than 1 node in cluster are not added
                    in known_host, glusterfind create command hungs
           Product: GlusterFS
           Version: 3.7.6
         Component: glusterfind
          Severity: urgent
          Assignee: bugs at gluster.org
          Reporter: avishwan at redhat.com
        QA Contact: bugs at gluster.org
                CC: avishwan at redhat.com, bugs at gluster.org,
                    khiremat at redhat.com, rhinduja at redhat.com,
                    rhs-bugs at redhat.com, sanandpa at redhat.com
        Depends On: 1260119, 1260918



+++ This bug was initially created as a clone of Bug #1260918 +++

+++ This bug was initially created as a clone of Bug #1260119 +++

Description of problem:
======================

If more than one node in the cluster has no entry in the known_hosts file of
the node that is creating the glusterfind session, the create command hangs
forever.

[root at georep1 scripts]# glusterfind create s1 master
The authenticity of host '10.70.46.97 (10.70.46.97)' can't be established.
ECDSA key fingerprint is 76:e4:6d:07:1e:82:26:1c:0a:95:b2:4c:a3:3f:f1:e2.
Are you sure you want to continue connecting (yes/no)? The authenticity of host
'10.70.46.154 (10.70.46.154)' can't be established.
ECDSA key fingerprint is b4:a8:00:41:ec:f8:12:a9:89:88:cb:7a:20:a8:83:3c.
Are you sure you want to continue connecting (yes/no)? The authenticity of host
'10.70.46.97 (10.70.46.97)' can't be established.
ECDSA key fingerprint is 76:e4:6d:07:1e:82:26:1c:0a:95:b2:4c:a3:3f:f1:e2.
Are you sure you want to continue connecting (yes/no)? The authenticity of host
'10.70.46.154 (10.70.46.154)' can't be established.
ECDSA key fingerprint is b4:a8:00:41:ec:f8:12:a9:89:88:cb:7a:20:a8:83:3c.
Are you sure you want to continue connecting (yes/no)? The authenticity of host
'10.70.46.97 (10.70.46.97)' can't be established.
ECDSA key fingerprint is 76:e4:6d:07:1e:82:26:1c:0a:95:b2:4c:a3:3f:f1:e2.
Are you sure you want to continue connecting (yes/no)? The authenticity of host
'10.70.46.93 (10.70.46.93)' can't be established.
ECDSA key fingerprint is 0d:bc:e3:70:e0:86:65:5e:3e:d2:ea:9c:fb:a9:53:66.
Are you sure you want to continue connecting (yes/no)? The authenticity of host
'10.70.46.93 (10.70.46.93)' can't be established.
ECDSA key fingerprint is 0d:bc:e3:70:e0:86:65:5e:3e:d2:ea:9c:fb:a9:53:66.
Are you sure you want to continue connecting (yes/no)? The authenticity of host
'10.70.46.93 (10.70.46.93)' can't be established.
ECDSA key fingerprint is 0d:bc:e3:70:e0:86:65:5e:3e:d2:ea:9c:fb:a9:53:66.
Are you sure you want to continue connecting (yes/no)? The authenticity of host
'10.70.46.154 (10.70.46.154)' can't be established.
ECDSA key fingerprint is b4:a8:00:41:ec:f8:12:a9:89:88:cb:7a:20:a8:83:3c.
Are you sure you want to continue connecting (yes/no)? yes


[root at georep1 scripts]# cat /root/.ssh/known_hosts


Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.7.1-14.el7rhgs.x86_64

How reproducible:
=================

Always

Steps to Reproduce:
===================
1. Empty known_hosts on the node, or remove the entries for the cluster hosts
2. Create a glusterfind session


Actual results:
===============
glusterfind session creation hangs

Expected results:
=================

The session should be created


Workaround:
===========

SSH to every node in the cluster once so that known_hosts is updated.

[root at georep1 scripts]# cat /root/.ssh/known_hosts
[root at georep1 scripts]# for i in {97,93,154}; do ssh root at 10.70.46.$i; done
The authenticity of host '10.70.46.97 (10.70.46.97)' can't be established.
ECDSA key fingerprint is 76:e4:6d:07:1e:82:26:1c:0a:95:b2:4c:a3:3f:f1:e2.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.70.46.97' (ECDSA) to the list of known hosts.
root at 10.70.46.97's password: 
Last login: Fri Sep  4 12:38:50 2015 from 10.70.6.115
[root at georep2 ~]# exit
logout
Connection to 10.70.46.97 closed.
The authenticity of host '10.70.46.93 (10.70.46.93)' can't be established.
ECDSA key fingerprint is 0d:bc:e3:70:e0:86:65:5e:3e:d2:ea:9c:fb:a9:53:66.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.70.46.93' (ECDSA) to the list of known hosts.
root at 10.70.46.93's password: 
Last login: Fri Sep  4 12:38:50 2015 from 10.70.6.115
[root at georep3 ~]# exit
logout
Connection to 10.70.46.93 closed.
The authenticity of host '10.70.46.154 (10.70.46.154)' can't be established.
ECDSA key fingerprint is b4:a8:00:41:ec:f8:12:a9:89:88:cb:7a:20:a8:83:3c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.70.46.154' (ECDSA) to the list of known hosts.
root at 10.70.46.154's password: 
Last login: Fri Sep  4 12:38:50 2015 from 10.70.6.115
[root at georep4 ~]# exit
logout
Connection to 10.70.46.154 closed.
[root at georep1 scripts]# cat /root/.ssh/known_hosts
10.70.46.97 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBNDWD/hxpscM20kGEWOTsiIzgmnBd78d2uyQRI7AGIX2JRRr0hIoZPOGCrW/ytRpluPEnJVr7s+vAYglVYLZlOo=
10.70.46.93 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBGCE5KvbJtmgXmQXfVVjUVjG0bjkP7fb0v7owFJnzAxy5FKjtTDQSF+qVAHA17MBh9Br7KP+SZQOxSmHyY9Tq8s=
10.70.46.154 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBNMULJn47vZ1Azq/SCi4i5VBSrLQAqs6sMZTSamzpwkhedtHrNhKe5QW7W5l+mirLJTIrLuqy8HQYSp5jDYyfrk=
[root at georep1 scripts]# glusterfind create s1 master
Session s1 created with volume master
[root at georep1 scripts]#

--- Additional comment from Rahul Hinduja on 2015-09-07 02:28:54 EDT ---

glusterfind pre also hangs, this time on the host key check for the local host:


[root at georep1 scripts]# glusterfind pre --output-prefix '/mnt/glusterfs/' s1 master /root/log2
10.70.46.97 - pre failed: /rhs/brick2/b6 Historical Changelogs not available:
[Errno 2] No such file or directory

10.70.46.97 - pre failed: /rhs/brick3/b10 Historical Changelogs not available:
[Errno 2] No such file or directory

10.70.46.97 - pre failed: /rhs/brick1/b2 Historical Changelogs not available:
[Errno 2] No such file or directory

10.70.46.93 - pre failed: /rhs/brick1/b3 Historical Changelogs not available:
[Errno 2] No such file or directory

10.70.46.93 - pre failed: /rhs/brick2/b7 Historical Changelogs not available:
[Errno 2] No such file or directory

10.70.46.93 - pre failed: /rhs/brick3/b11 Historical Changelogs not available:
[Errno 2] No such file or directory

10.70.46.154 - pre failed: /rhs/brick2/b8 Historical Changelogs not available:
[Errno 2] No such file or directory

10.70.46.154 - pre failed: /rhs/brick3/b12 Historical Changelogs not available:
[Errno 2] No such file or directory

10.70.46.154 - pre failed: /rhs/brick1/b4 Historical Changelogs not available:
[Errno 2] No such file or directory

The authenticity of host '10.70.46.96 (10.70.46.96)' can't be established.
ECDSA key fingerprint is 44:23:1a:4b:3c:78:63:a6:66:3d:18:01:8d:dd:17:74.
Are you sure you want to continue connecting (yes/no)? The authenticity of host
'10.70.46.96 (10.70.46.96)' can't be established.
ECDSA key fingerprint is 44:23:1a:4b:3c:78:63:a6:66:3d:18:01:8d:dd:17:74.
Are you sure you want to continue connecting (yes/no)? The authenticity of host
'10.70.46.96 (10.70.46.96)' can't be established.
ECDSA key fingerprint is 44:23:1a:4b:3c:78:63:a6:66:3d:18:01:8d:dd:17:74.
Are you sure you want to continue connecting (yes/no)? yes

--- Additional comment from Aravinda VK on 2015-09-07 06:52:20 EDT ---

RCA:

When connecting to other nodes programmatically, Geo-rep passes an additional
option to ssh (-oStrictHostKeyChecking=no). Glusterfind needs to use the same
option.

The other issue is the yes/no prompt for localhost, which comes from the scp
command. scp needs the same option that is used with ssh. In addition, the scp
command should not be run at all when the file is on the local node.
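
The two fixes described in this RCA could look roughly like the sketch below,
written in Python because glusterfind is a Python tool. This is not the code
from the actual patch. The function names (ssh_command, copy_command), the
local_host parameter, and the use of cp for the local case are assumptions made
for illustration; the root user is taken from the transcripts above. The sketch
only builds argument lists and does not run anything.

```python
# Hypothetical helpers, not glusterfind's real API.
# The only option taken from the RCA is -oStrictHostKeyChecking=no.
SSH_OPTS = ["-oStrictHostKeyChecking=no"]


def ssh_command(host, remote_cmd, user="root"):
    """Build an ssh argv that does not stop at the yes/no host key prompt."""
    return ["ssh"] + SSH_OPTS + ["%s@%s" % (user, host)] + list(remote_cmd)


def copy_command(host, remote_path, local_path, local_host, user="root"):
    """Build the argv that fetches a file from a node.

    If the file already lives on the local node, scp is skipped and a
    plain cp is used instead (assumption: the caller knows its own
    address as local_host).
    """
    if host == local_host:
        return ["cp", remote_path, local_path]
    source = "%s@%s:%s" % (user, host, remote_path)
    return ["scp"] + SSH_OPTS + [source, local_path]
```

With StrictHostKeyChecking=no, ssh and scp add unknown host keys to known_hosts
without asking, so parallel connections no longer block on interleaved prompts.
The trade-off is that the host key of a node seen for the first time is not
verified by a person.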

Workaround:
Add all the peer hosts, including the local node, to known_hosts.
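
Before applying this workaround it can help to see which peers are still
missing from known_hosts. The following is a minimal Python sketch under
stated assumptions: the function names are hypothetical, and it only
understands plain, unhashed entries of the form "host keytype key" (as in the
known_hosts output earlier in this report), not hashed host names, wildcard
patterns, or [host]:port entries.

```python
def known_hosts_names(known_hosts_text):
    """Return the set of host names listed in unhashed known_hosts text."""
    names = set()
    for line in known_hosts_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if fields[0].startswith("@"):
            # Skip a leading marker such as @cert-authority or @revoked.
            fields = fields[1:]
        if not fields:
            continue
        # The first field may hold several comma-separated names.
        names.update(fields[0].split(","))
    return names


def missing_hosts(peers, known_hosts_text):
    """Return the peers (in order) that have no known_hosts entry."""
    known = known_hosts_names(known_hosts_text)
    return [peer for peer in peers if peer not in known]
```

For example, called with the four addresses from this report and the contents
of /root/.ssh/known_hosts, it returns the nodes that would still trigger the
yes/no prompt, including the local node if it has no entry.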

--- Additional comment from Vijay Bellur on 2015-09-08 04:46:19 EDT ---

REVIEW: http://review.gluster.org/12124 (tools/glusterfind:
StrictHostKeyChecking=no for ssh/scp verification) posted (#1) for review on
master by Aravinda VK (avishwan at redhat.com)

--- Additional comment from Vijay Bellur on 2015-11-19 00:08:11 EST ---

REVIEW: http://review.gluster.org/12124 (tools/glusterfind:
StrictHostKeyChecking=no for ssh/scp verification) posted (#2) for review on
master by Aravinda VK (avishwan at redhat.com)

--- Additional comment from Vijay Bellur on 2015-11-21 09:19:28 EST ---

REVIEW: http://review.gluster.org/12124 (tools/glusterfind:
StrictHostKeyChecking=no for ssh/scp verification) posted (#3) for review on
master by Aravinda VK (avishwan at redhat.com)

--- Additional comment from Vijay Bellur on 2015-11-23 00:02:46 EST ---

REVIEW: http://review.gluster.org/12124 (tools/glusterfind:
StrictHostKeyChecking=no for ssh/scp verification) posted (#4) for review on
master by Aravinda VK (avishwan at redhat.com)

--- Additional comment from Vijay Bellur on 2015-11-23 12:29:27 EST ---

COMMIT: http://review.gluster.org/12124 committed in master by Vijay Bellur
(vbellur at redhat.com) 
------
commit d47323d0e6f543a8ece04c32b8d77d2785390c3c
Author: Aravinda VK <avishwan at redhat.com>
Date:   Mon Sep 7 14:18:45 2015 +0530

    tools/glusterfind: StrictHostKeyChecking=no for ssh/scp verification

    Also do not use scp command in case copy file from local
    node.

    Change-Id: Ie78c77eb0252945867173937391b82001f29c3b0
    Signed-off-by: Aravinda VK <avishwan at redhat.com>
    BUG: 1260918
    Reviewed-on: http://review.gluster.org/12124
    Tested-by: NetBSD Build System <jenkins at build.gluster.org>
    Tested-by: Gluster Build System <jenkins at build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur at redhat.com>


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1260119
[Bug 1260119] [BACKUP]: If more than 1 node in cluster are not added in
known_host, glusterfind create command hungs
https://bugzilla.redhat.com/show_bug.cgi?id=1260918
[Bug 1260918] [BACKUP]: If more than 1 node in cluster are not added in
known_host, glusterfind create command hungs

