[Gluster-users] GlusterFS peer probe hangs on local area network
Rahul51 S
rahul51.s at tcs.com
Fri Jun 21 15:13:54 UTC 2013
Hi All,
I am trying to "peer probe" a node on the LAN, but the command hangs for a
while, and when it eventually completes, the peer's UUID is shown as all
zeros.
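For reference, the probe was issued from node 0 against node 1's bond0
address, roughly like this (reconstructed from the outputs below):
root@typhoon-base-unit0:/root> gluster peer probe 172.24.132.1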
Below is the output of "gluster peer status" on both nodes:
root@typhoon-base-unit0:/var/lib/glusterd> gluster peer status
Number of Peers: 1
Hostname: 172.24.132.1
Port: 24007
Uuid: 00000000-0000-0000-0000-000000000000
State: Establishing Connection (Connected)
root@typhoon-base-unit1:/var/lib/glusterd/peers> gluster peer status
Number of Peers: 1
Hostname: 172.24.132.0
Uuid: 00000000-0000-0000-0000-000000000000
State: Connected to Peer (Connected)
After this, when I try to create a replicated volume, it fails with the
following error:
root@typhoon-base-unit0:/root> gluster volume create testvol replica 2 172.24.132.0:/.krfs/_home 172.24.132.1:/.krfs/_home
volume create: testvol: failed: Failed to find host 172.24.132.1
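In case it helps, the on-disk peer state can be inspected as well; something
like the following (assuming the default /var/lib/glusterd layout visible in
my prompts above) shows what glusterd has recorded for the probed peer, i.e.
the uuid=, state= and hostname1= lines:
root@typhoon-base-unit0:/var/lib/glusterd> cat peers/*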
Please note that these nodes are ATCA blades with multiple Ethernet
interfaces. The failure above occurs when I peer probe over the Ethernet
interfaces that are connected on a local area network (bond0 on both nodes).
There is another Ethernet interface (front0) on each node which is connected
to the router. If I peer probe the other node over this interface, the probe
succeeds; for example:
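root@typhoon-base-unit0:/root> gluster peer probe 172.17.23.119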
Below is the output of the ifconfig command on node 0
root@typhoon-base-unit0:/root> ifconfig
bond0 Link encap:Ethernet HWaddr EC:9E:CD:07:DC:0A
inet addr:172.24.132.0 Bcast:172.24.255.255 Mask:255.255.0.0
inet6 addr: fe80::ee9e:cdff:fe07:dc0a/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:9000 Metric:1
RX packets:290442 errors:0 dropped:0 overruns:0 frame:0
TX packets:309604 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:37604741 (35.8 MiB) TX bytes:33336286 (31.7 MiB)
front0 Link encap:Ethernet HWaddr EC:9E:CD:07:DC:0E
inet addr:172.17.23.117 Bcast:172.17.23.255 Mask:255.255.255.0
inet6 addr: fe80::ee9e:cdff:fe07:dc0e/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:800920 errors:0 dropped:0 overruns:0 frame:0
TX packets:333044 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:963382278 (918.7 MiB) TX bytes:45300199 (43.2 MiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:2724222 errors:0 dropped:0 overruns:0 frame:0
TX packets:2724222 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2918838588 (2.7 GiB) TX bytes:2918838588 (2.7 GiB)
Below is the output of the ifconfig command on node 1
root@typhoon-base-unit1:/root> ifconfig
bond0 Link encap:Ethernet HWaddr EC:9E:CD:08:43:82
inet addr:172.24.132.1 Bcast:0.0.0.0 Mask:255.255.0.0
inet6 addr: fe80::ee9e:cdff:fe08:4382/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:9000 Metric:1
RX packets:3236373 errors:0 dropped:0 overruns:0 frame:0
TX packets:2955309 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:336930560 (321.3 MiB) TX bytes:379249481 (361.6 MiB)
front0 Link encap:Ethernet HWaddr EC:9E:CD:08:43:86
inet addr:172.17.23.119 Bcast:172.17.23.255 Mask:255.255.255.0
inet6 addr: fe80::ee9e:cdff:fe08:4386/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2092076 errors:0 dropped:0 overruns:0 frame:0
TX packets:426900 errors:780 dropped:0 overruns:0 carrier:780
collisions:134320 txqueuelen:1000
RX bytes:1263074540 (1.1 GiB) TX bytes:49395506 (47.1 MiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:17223778 errors:0 dropped:0 overruns:0 frame:0
TX packets:17223778 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:4539999675 (4.2 GiB) TX bytes:4539999675 (4.2 GiB)
I compared the success logs with the failure logs and found that node 1
never transitions from "Connected to Peer" to "Peer is connected and
accepted" when I use the bond0 interfaces.
Could you please shed some light on this?
I am attaching both the success logs and the failure logs for both nodes.
Regards
Rahul Shrivastava
Attachments:
node0_failure_logs.txt: <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20130621/3d44dc34/attachment.txt>
node0_success_logs.txt: <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20130621/3d44dc34/attachment-0001.txt>
node1_failure_logs.txt: <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20130621/3d44dc34/attachment-0002.txt>
node1_success_logs.txt: <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20130621/3d44dc34/attachment-0003.txt>