[Bugs] [Bug 1320458] New: Peer information is not propagated to all the nodes in the cluster, when the peer is probed with its second interface FQDN/IP

bugzilla at redhat.com bugzilla at redhat.com
Wed Mar 23 09:33:30 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1320458

            Bug ID: 1320458
           Summary: Peer information is not propagated to all the nodes in
                    the cluster, when the peer is probed with its second
                    interface FQDN/IP
           Product: GlusterFS
           Version: mainline
         Component: glusterd
          Keywords: Triaged
          Severity: high
          Assignee: kaushal at redhat.com
          Reporter: kaushal at redhat.com
                CC: bugs at gluster.org, kaushal at redhat.com,
                    mselvaga at redhat.com, sasundar at redhat.com
            Blocks: 1314366



+++ This bug was initially created as a clone of Bug #1314366 +++

Description of problem:
-----------------------
When a gluster node has multiple network interfaces and both are to be used
for gluster traffic, the peer has to be probed with each of its network
identifiers (i.e. IP or FQDN).

Each additional probe records the new identifier under "Other names" for that
peer on the probing node. The problem is that this other name is not
propagated to the remaining nodes in the cluster. Any volume operation that
refers to the peer by its new hostname or IP then fails with the error
"staging failed on the host" on those nodes, because they are unaware of the
new name.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
3.7.8

How reproducible:
-----------------
Always

Steps to Reproduce:
--------------------
1. Create 3 gluster nodes, each with 2 network interfaces, and connect each
interface to a different (isolated) network
2. Form a gluster cluster of 2 nodes by peer probing with the IP from network1
3. Probe node2 ( from node1 ) with its IP from network2
4. Check peer status on both the nodes
5. From node1, peer probe node3 with its IP from network1
6. From node1, peer probe node3 with its IP from network2
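
The probe sequence in steps 2 to 6 can be sketched as a small Python helper.
This is only an illustration: it assumes the hostnames that appear in the
transcripts later in this report (mgmt-* on network1, data-* on network2) in
place of raw IPs, and the names probe_sequence and run are hypothetical. It
only covers the commands issued on node1; peer status still has to be checked
on every node by hand.

```python
import subprocess

# Assumption: FQDN suffix taken from the transcripts in this report.
DOMAIN = "lab.eng.blr.redhat.com"


def probe_sequence(domain=DOMAIN):
    """Return the gluster commands for steps 2-6, all issued on node1."""
    cmds = []
    for node in ("node2", "node3"):
        # Probe over network1 (mgmt-*) first, then over network2 (data-*).
        for prefix in ("mgmt", "data"):
            cmds.append(["gluster", "peer", "probe", f"{prefix}-{node}.{domain}"])
        # Check the peer list after both probes of this node.
        cmds.append(["gluster", "peer", "status"])
    return cmds


def run(cmds, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(probe_sequence())
```

With dry_run left at its default the helper only prints the commands, so it
can be inspected on a machine without glusterd installed.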

Actual results:
---------------
Peer status on node2 is not updated with the other name of node3

Expected results:
-----------------
Peer information should be consistent/updated across all the nodes in the
cluster

--- Additional comment from SATHEESARAN on 2016-03-03 18:45:17 IST ---

Peer status on 2 nodes
-----------------------
[root@data-node1 ~]# gluster peer status
Number of Peers: 1

Hostname: mgmt-node2.lab.eng.blr.redhat.com
Uuid: 204a51d3-3c2c-4bec-a005-4e974a49aa7e
State: Peer in Cluster (Connected)
Other names:
data-node2.lab.eng.blr.redhat.com
mgmt-node2

[root@data-node2 ~]# gluster peer status
Number of Peers: 1

Hostname: mgmt-node1.lab.eng.blr.redhat.com
Uuid: 5ba71f4c-fe2e-410d-939a-d5fc903a1ec4
State: Peer in Cluster (Connected)
Other names:
data-node1.lab.eng.blr.redhat.com

Peer status on 3 nodes after probing node3 with network1
---------------------------------------------------------
[root@data-node1 ~]# gluster peer status
Number of Peers: 2

Hostname: mgmt-node2.lab.eng.blr.redhat.com
Uuid: 204a51d3-3c2c-4bec-a005-4e974a49aa7e
State: Peer in Cluster (Connected)
Other names:
data-node2.lab.eng.blr.redhat.com
mgmt-node2

Hostname: mgmt-node3.lab.eng.blr.redhat.com
Uuid: 5b4abfd3-9397-4527-a39e-ee3bc00f5710
State: Peer in Cluster (Connected)

[root@data-node2 ~]# gluster peer status
Number of Peers: 2

Hostname: mgmt-node1.lab.eng.blr.redhat.com
Uuid: 5ba71f4c-fe2e-410d-939a-d5fc903a1ec4
State: Peer in Cluster (Connected)
Other names:
data-node1.lab.eng.blr.redhat.com

Hostname: mgmt-node3.lab.eng.blr.redhat.com
Uuid: 5b4abfd3-9397-4527-a39e-ee3bc00f5710
State: Peer in Cluster (Connected)

[root@localhost ~]# gluster peer status
Number of Peers: 2

Hostname: mgmt-node1.lab.eng.blr.redhat.com
Uuid: 5ba71f4c-fe2e-410d-939a-d5fc903a1ec4
State: Peer in Cluster (Connected)
Other names:
data-node1.lab.eng.blr.redhat.com

Hostname: mgmt-node2.lab.eng.blr.redhat.com
Uuid: 204a51d3-3c2c-4bec-a005-4e974a49aa7e
State: Peer in Cluster (Connected)
Other names:
data-node2.lab.eng.blr.redhat.com
mgmt-node2


Peer status on 3 nodes after probing node3 with network2
---------------------------------------------------------
[root@data-node1 ~]# gluster peer probe data-node3.lab.eng.blr.redhat.com
peer probe: success. Host data-node3.lab.eng.blr.redhat.com port 24007 already in peer list

[root@data-node1 ~]# gluster peer status
Number of Peers: 2

Hostname: mgmt-node2.lab.eng.blr.redhat.com
Uuid: 204a51d3-3c2c-4bec-a005-4e974a49aa7e
State: Peer in Cluster (Connected)
Other names:
data-node2.lab.eng.blr.redhat.com
mgmt-node2

Hostname: mgmt-node3.lab.eng.blr.redhat.com
Uuid: 5b4abfd3-9397-4527-a39e-ee3bc00f5710
State: Peer in Cluster (Connected)
Other names:
data-node3.lab.eng.blr.redhat.com  <--- other name updated in node1

[root@data-node2 ~]# gluster pe s
Number of Peers: 2

Hostname: mgmt-node1.lab.eng.blr.redhat.com
Uuid: 5ba71f4c-fe2e-410d-939a-d5fc903a1ec4
State: Peer in Cluster (Connected)
Other names:
data-node1.lab.eng.blr.redhat.com

Hostname: mgmt-node3.lab.eng.blr.redhat.com <---not updated with other name
Uuid: 5b4abfd3-9397-4527-a39e-ee3bc00f5710
State: Peer in Cluster (Connected)

[root@localhost ~]# gluster peer status
Number of Peers: 2

Hostname: mgmt-node1.lab.eng.blr.redhat.com
Uuid: 5ba71f4c-fe2e-410d-939a-d5fc903a1ec4
State: Peer in Cluster (Connected)
Other names:
data-node1.lab.eng.blr.redhat.com

Hostname: mgmt-node2.lab.eng.blr.redhat.com
Uuid: 204a51d3-3c2c-4bec-a005-4e974a49aa7e
State: Peer in Cluster (Connected)
Other names:
data-node2.lab.eng.blr.redhat.com
mgmt-node
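
The mismatch between the node1 and node2 views above can also be detected
mechanically. The following is a minimal Python sketch, assuming peer status
text in the layout shown in this report; parse_peer_status and missing_names
are hypothetical helper names, and the sample input is limited to the node3
entries copied from the transcripts.

```python
def parse_peer_status(text):
    """Map each peer UUID to the set of names listed for it
    (the Hostname plus any entries under 'Other names:')."""
    peers = {}
    names = None
    in_other = False
    for raw in text.splitlines():
        # Drop hand-written "<---" annotations such as those in this report.
        line = raw.split("<---")[0].strip()
        if line.startswith("Hostname:"):
            names = {line.split(":", 1)[1].strip()}
            in_other = False
        elif line.startswith("Uuid:") and names is not None:
            peers[line.split(":", 1)[1].strip()] = names
        elif line.startswith("Other names:"):
            in_other = True
        elif not line:
            in_other = False  # a blank line ends the list of other names
        elif in_other and names is not None:
            names.add(line)
    return peers


def missing_names(views):
    """views maps node -> parsed peer status. Return the names each node
    lacks for a peer, keyed by (node, uuid)."""
    known = {}
    for peers in views.values():
        for uuid, names in peers.items():
            known.setdefault(uuid, set()).update(names)
    gaps = {}
    for node, peers in views.items():
        for uuid, names in peers.items():
            lacking = known[uuid] - names
            if lacking:
                gaps[(node, uuid)] = lacking
    return gaps


# node3 as seen from node1 and from node2 after the second probe.
NODE1 = """\
Hostname: mgmt-node3.lab.eng.blr.redhat.com
Uuid: 5b4abfd3-9397-4527-a39e-ee3bc00f5710
State: Peer in Cluster (Connected)
Other names:
data-node3.lab.eng.blr.redhat.com
"""
NODE2 = """\
Hostname: mgmt-node3.lab.eng.blr.redhat.com
Uuid: 5b4abfd3-9397-4527-a39e-ee3bc00f5710
State: Peer in Cluster (Connected)
"""

views = {"node1": parse_peer_status(NODE1), "node2": parse_peer_status(NODE2)}
print(missing_names(views))
```

For this input the only gap reported is on node2, which lacks
data-node3.lab.eng.blr.redhat.com for the node3 UUID. A node never lists
itself, so a name can only be compared across the nodes that see that peer.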

[root@data-node1 ~]# gluster volume create testvol data-node3.lab.eng.blr.redhat.com:/rhs/brick1/brc1
volume create: testvol: failed: Staging failed on mgmt-node2.lab.eng.blr.redhat.com. Error: Host data-node3.lab.eng.blr.redhat.com is not in 'Peer in Cluster' state

Error messages in glusterd log in node1 - 
<snip>
[2016-03-03 18:40:38.034436] I [MSGID: 106487] [glusterd-handler.c:1411:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2016-03-03 18:45:20.723287] E [MSGID: 106452] [glusterd-utils.c:5735:glusterd_new_brick_validate] 0-management: Host data-node3.lab.eng.blr.redhat.com is not in 'Peer in Cluster' state
[2016-03-03 18:45:20.723323] E [MSGID: 106536] [glusterd-volume-ops.c:1336:glusterd_op_stage_create_volume] 0-management: Host data-node3.lab.eng.blr.redhat.com is not in 'Peer in Cluster' state
[2016-03-03 18:45:20.723338] E [MSGID: 106301] [glusterd-op-sm.c:5241:glusterd_op_ac_stage_op] 0-management: Stage failed on operation 'Volume Create', Status : -1
</snip>

--- Additional comment from Vijay Bellur on 2016-03-23 14:10:18 IST ---

REVIEW: http://review.gluster.org/13817 (glusterd: Add a new event to handle
multi-net probes) posted (#1) for review on master by Kaushal M
(kaushal at redhat.com)


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1314366
[Bug 1314366] Peer information is not propagated to all the nodes in the
cluster, when the peer is probed with its second interface FQDN/IP