[Gluster-users] "tcp connect to failed" messages
Iain Buchanan
iainbuc at gmail.com
Thu May 30 08:24:54 UTC 2013
I'm now able to connect to volumes using rdma when they are created
using "tcp,rdma", but it appears that the data is still being
transferred over ethernet. If I run "ifstat" while running iozone I can
see a lot of data being moved through the ethernet adapter (nothing else
is on the box) and the performance is basically identical to plain
ethernet. When I create the volume with just "rdma" I can't even mount
the volume again (see error below).
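
One way to put a number on the ifstat observation is to read the kernel's byte counters for the ethernet interface before and after a benchmark run. The following is a minimal sketch, assuming a Linux host that exposes /sys/class/net/<iface>/statistics; the function names and the STATS_ROOT override are illustrative and not part of GlusterFS or iozone.

```shell
#!/bin/sh
# Sketch: report how many bytes crossed a network interface while a command ran.
# Assumption: Linux sysfs counters; STATS_ROOT is a hypothetical override for testing.

# Print the rx+tx byte total for one interface.
iface_bytes() {
    stats_dir="${STATS_ROOT:-/sys/class/net}/$1/statistics"
    rx=$(cat "$stats_dir/rx_bytes")
    tx=$(cat "$stats_dir/tx_bytes")
    echo $((rx + tx))
}

# Run a command and print the bytes moved over the interface while it ran.
# The command's own output goes to stderr so stdout carries only the number.
bytes_during() {
    iface="$1"; shift
    before=$(iface_bytes "$iface")
    "$@" 1>&2
    after=$(iface_bytes "$iface")
    echo $((after - before))
}
```

For example, "bytes_during eth0 <benchmark command>" (with eth0 standing in for whatever the ethernet adapter is called on the box) should print a figure close to zero if the payload really travels over InfiniBand, and roughly the size of the test data if it is still going over ethernet.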
I've modified all the volume files, adding the
transport.rdma.listen-port lines after completing set-up (anywhere there
is an "option transport-type rdma" line I've added this line - quite a
few places).
volume create storage transport tcp,rdma my_server1:/data/area
volume add-brick storage replica 2 my_server2:/data/area
volume start storage
mount -t glusterfs my_server:storage.rdma /mnt/storage
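
The hand edit described above (adding the listen-port option wherever a volume file has "option transport-type rdma") can be sketched as a small script. This is only an illustration under stated assumptions: the option name and port 24008 come from this thread, while the function name is made up here, and the volume files are assumed to keep each option on its own line.

```shell
#!/bin/sh
# Sketch: append the listen-port workaround after every rdma transport-type line.
# Assumption: option name and port (24008) are taken from this thread;
# add_rdma_listen_port is a hypothetical helper, not a GlusterFS tool.

add_rdma_listen_port() {
    volfile="$1"
    # Skip files that already carry the option so repeated runs change nothing.
    if grep -qF 'transport.rdma.listen-port' "$volfile"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    if awk '
        { print }
        /option[ \t]+transport-type[ \t]+rdma[ \t]*$/ {
            # Copy the leading whitespace of the matched line for the new option.
            indent = $0
            sub(/[^ \t].*$/, "", indent)
            print indent "option transport.rdma.listen-port 24008"
        }
    ' "$volfile" > "$tmp"; then
        # Write back through cat so the original file keeps its owner and mode.
        cat "$tmp" > "$volfile"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return $status
}
```

It would be run once per volume file under /var/lib/glusterd/vols/<volumename>, followed by a restart of the service. Since changes made through the CLI regenerate those files, running it again after any "volume set" or similar operation would be needed to put the option back.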
I'm now back to the error I had before the workaround on tcp,rdma - the
lines are in the config files, and I've restarted the service. I can
see my alterations in the config dumped into the log, but then it
returns to the original errors:
[2013-05-30 09:13:41.871408] E [rdma.c:4604:tcp_connect_finish]
0-storage-client-0: tcp connect to failed (Connection refused)
[2013-05-30 09:13:41.871467] W [rdma.c:4187:gf_rdma_disconnect]
(-->/usr/sbin/glusterfs(main+0x34d) [0x7f563a24c3ed]
(-->/usr/lib/libglusterfs.so.0(+0x3bd17) [0x7f5639de5d17]
(-->/usr/lib/glusterfs/3.3.1/rpc-transport/rdma.so(+0x5231)
[0x7f5634012231]))) 0-storage-client-0: disconnect called (peer:)
I saw a message on the mailing list that seems to suggest plain "rdma"
doesn't work in the 3.3 series - is this correct?
http://www.gluster.org/pipermail/gluster-users/2013-January/035115.html
(I can run ib_read_bw etc. between the two servers without problems.)
Iain
On 05/30/13 08:11, Iain Buchanan wrote:
> Just to confirm - putting the line "option transport.rdma.listen-port
> 24008" into the first "volume" block in the two files with "rdma" in
> their names under /var/lib/glusterd/vols/<volumename> seems to have
> fixed the issue. I'm now able to mount and I can run iozone on a
> single node. I'll give it a go with two nodes and see if rdma makes
> any difference.
>
> Thanks for your help Joe!
>
> Iain
>
>
> On 05/30/13 07:59, Joe Julian wrote:
>> On 05/29/2013 11:42 PM, Iain Buchanan wrote:
>>> Thanks Joe,
>>>
>>> I tried mounting with the ".rdma" suffix after creating using
>>> "transport tcp,rdma" and I get the same "tcp connect to failed"
>>> messages in the log - I'm using the version from the semiosis PPA at
>>> https://launchpad.net/~semiosis/+archive/ubuntu-glusterfs-3.3
>>>
>>> Looking at the dates on there I don't think it includes this fix.
>>> I've sent the maintainer a message asking if they could update it.
>>> In the meantime would Niels de Vos' workaround work? (Setting
>>> transport.rdma.listen-port to 24008 in the glusterd.vol?)
>>
>> Yes, I had someone else try that and it worked. Remember, any changes
>> made to the volume through the CLI will reset that configuration,
>> causing it to stop working until you edit again.
>>
>>>
>>> Iain
>>>
>>> On 05/30/13 07:06, Joe Julian wrote:
>>>> On 05/29/2013 10:33 PM, Iain Buchanan wrote:
>>>>> Hi,
>>>>>
>>>>> I'm running GlusterFS 3.3.1-ubuntu1~precise9 and I'm having some
>>>>> problems with the "rdma" and "tcp,rdma" options I hope someone can
>>>>> help me with.
>>>>>
>>>>> 1. What does "tcp,rdma" actually do - does it let you mix both
>>>>> types of client? (I did a few tests with iozone and found it gave
>>>>> identical performance to the "tcp".)
>>>>>
>>>>> 2. I can't get "rdma" to work, even in the simplest case with a
>>>>> single node.
>>>>> volume create storage transport rdma my_server:/data/area
>>>>> volume start storage
>>>>> mount -t glusterfs my_server:storage /mnt/storage
>>>>>
>>>>> The last line hangs. Looking in /var/log/glusterfs I can see the
>>>>> log for the volume:
>>>>>
>>>>> [2013-05-30 06:24:19.605315] E [rdma.c:4604:tcp_connect_finish]
>>>>> 0-storage-client-0: tcp connect to failed (Connection refused)
>>>>> [2013-05-30 06:24:19.605713] W [rdma.c:4187:gf_rdma_disconnect]
>>>>> (-->/usr/sbin/glusterfs(main+0x34d) [0x7f374d38a3ed]
>>>>> (-->/usr/lib/libglusterfs.so.0(+0x3bd17) [0x7f374cf23d17]
>>>>> (-->/usr/lib/glusterfs/3.3.1/rpc-transport/rdma.so(+0x5231)
>>>>> [0x7f3743398231]))) 0-storage-client-0: disconnect called (peer:)
>>>>> [2013-05-30 06:24:19.605763] W
>>>>> [rdma.c:4521:gf_rdma_handshake_pollerr]
>>>>> (-->/usr/sbin/glusterfs(main+0x34d) [0x7f374d38a3ed]
>>>>> (-->/usr/lib/libglusterfs.so.0(+0x3bd17) [0x7f374cf23d17]
>>>>> (-->/usr/lib/glusterfs/3.3.1/rpc-transport/rdma.so(+0x5150)
>>>>> [0x7f3743398150]))) 0-rpc-transport/rdma: storage-client-0: peer
>>>>> () disconnected, cleaning up
>>>>>
>>>>> This block repeats every few seconds - the line "tcp connect to
>>>>> failed" looks like it has lost the server name somehow?
>>>>>
>>>>> Iain
>>>>>
>>>> If you've installed from the yum repo (http://goo.gl/s077x) that
>>>> shouldn't be happening. kkeithley applied the patch. If not, rdma's
>>>> broken in 3.3.[01]. https://bugzilla.redhat.com/show_bug.cgi?id=849122
>>>>
>>>> To mount via rdma when the volume uses tcp,rdma, append ".rdma" to the
>>>> volume name: mount -t glusterfs server1:myvol.rdma /mnt/foo
>>>>
>>>>