[Gluster-users] RDMA transport problems in GLUSTER on host with MIC

Mohammed Rafi K C rkavunga at redhat.com
Fri Jan 20 13:08:53 UTC 2017


One thing to note here is that the GlusterFS rdma transport uses SRQ
(shared receive queue), which I see reported as disabled
("Using SRQ : OFF") for both devices.
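
To see what each device itself advertises for SRQ, one option is to read
the max_srq field from "ibv_devinfo -v". Below is a minimal Python sketch,
assuming ibv_devinfo (libibverbs-utils) is installed and prints hca_id and
max_srq lines in its usual verbose format. The helper names parse_max_srq
and srq_report are hypothetical, not part of any Gluster or OFED tool.

```python
import re
import subprocess


def parse_max_srq(devinfo_text):
    """Map each hca_id found in `ibv_devinfo -v` output to its max_srq value."""
    result = {}
    current = None
    for line in devinfo_text.splitlines():
        m = re.match(r"\s*hca_id:\s*(\S+)", line)
        if m:
            current = m.group(1)
            continue
        # The trailing colon keeps max_srq_wr / max_srq_sge from matching.
        m = re.match(r"\s*max_srq:\s*(\d+)", line)
        if m and current is not None:
            result[current] = int(m.group(1))
    return result


def srq_report():
    """Run `ibv_devinfo -v` (assumed to be on PATH) and return {device: max_srq}."""
    out = subprocess.run(
        ["ibv_devinfo", "-v"], capture_output=True, text=True, check=True
    ).stdout
    return parse_max_srq(out)
```

Calling srq_report() on the affected host should list both devices. A
device that reports max_srq as 0 would not be able to provide the SRQ that
the rdma transport asks for.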


Regards
Rafi KC

On 01/20/2017 05:05 PM, Anoop C S wrote:
> On Fri, 2017-01-20 at 11:53 +0100, Fedele Stabile wrote:
>> Thank you for your help.
>> I will answer your questions:
>>
>> On Fri, 20/01/2017 at 12:58 +0530, Anoop C S wrote:
>>> On Wed, 2017-01-18 at 12:56 +0100, Fedele Stabile wrote:
>>>> Hi,
>>>> it happens that the RDMA gluster transport does not work anymore
>>>> after I configured the ibscif virtual connector for InfiniBand on a
>>>> server with a Xeon Phi coprocessor.
>>>>
>>>> I have CentOS 6.6, GLUSTER 3.8.5, OFED 3.12-1 and MPSS 3.5.2, and I
>>>> followed the installation instructions in MPSS_Users_Guide (Revision
>>>> 3.5), which suggested removing the compat-rdma-devel and compat-rdma
>>>> packages.
>>>>
>>> It would help to clearly understand the reason for removing those
>>> packages. Maybe they are critical and not intended to be removed.
>>> Please ask for help from OFED.
>>>
>> The files of the compat-rdma-devel and compat-rdma packages are replaced
>> by others from the MPSS package, which contains the whole software stack
>> for the server and the MIC card, including the OFED drivers.
>>
>>>> I have noticed that running the command:
>>>> ib_send_bw
>>>> gives the following error:
>>>>
>>>> # ib_send_bw
>>>>
>>>> ************************************
>>>> * Waiting for client to connect... *
>>>> ************************************
>>>> ---------------------------------------------------------------------------------------
>>>>                     Send BW Test
>>>>  Dual-port       : OFF        Device         : scif0
>>>>  Number of qps   : 1        Transport type : IW
>>>>  Connection type : RC        Using SRQ      : OFF
>>>>  RX depth        : 512
>>>>  CQ Moderation   : 100
>>>>  Mtu             : 2048[B]
>>>>  Link type       : Ethernet
>>>>  Gid index       : 0
>>>>  Max inline data : 0[B]
>>>>  rdma_cm QPs     : OFF
>>>>  Data ex. method : Ethernet
>>>> ---------------------------------------------------------------------------------------
>>>>  local address: LID 0x3e8 QPN 0x0003 PSN 0x123123
>>>>  GID: 76:121:186:102:03:119:00:00:00:00:00:00:00:00:00:00
>>>> ethernet_read_keys: Couldn't read remote address
>>>>  Unable to read to socket/rdam_cm
>>>> Failed to exchange data between server and clients
>>>>
>>> The above error has nothing to do with GlusterFS. Can you please
>>> give more context on what failed for you while trying out GlusterFS
>>> with RDMA transport?
>> In glusterd.vol.log when I start glusterd I see:
>> [rdma.c:4837:gf_rdma_listen] 0-rdma.management: rdma option set failed
>> [Function not implemented]
> Sorry, my mistake. It is clear that glusterfs-rdma is installed. I incorrectly interpreted the log
> entry.
>
> rdma_set_option() failed here with ENOSYS, which means that some functionality is not available in
> the current setup (I suspect RDMA_OPTION_ID_REUSEADDR). I need to look further into what the reason
> could be. I will get back to you after some more debugging of the code.
>
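
The bracketed message in the log entry above is the C library's text for
ENOSYS (in English, "Function not implemented"), printed in the host's
locale. As a small illustration of how an errno value maps to its symbolic
name and message, here is a Python sketch; describe_errno is a hypothetical
helper name, and the message text it prints depends on the platform and
locale.

```python
import errno
import os


def describe_errno(code):
    """Return 'NAME (number): message' for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({code}): {os.strerror(code)}"


if __name__ == "__main__":
    # ENOSYS is the error rdma_set_option() is reported to have returned.
    print(describe_errno(errno.ENOSYS))
```

This only decodes the error. It does not tell which option the ibscif
provider rejects, which still has to be found by debugging.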
>> But RDMA works correctly on the qib0 device, as you can see below:
>>>> Instead, the command
>>>>
>>>> ib_send_bw -d qib0
>>>>
>>>> gives correct results:
>>>>
>>>> # ib_send_bw -d qib0
>>>>
>>>> ************************************
>>>> * Waiting for client to connect... *
>>>> ************************************
>>>> ---------------------------------------------------------------------------------------
>>>>                     Send BW Test
>>>>  Dual-port       : OFF        Device         : qib0
>>>>  Number of qps   : 1        Transport type : IB
>>>>  Connection type : RC        Using SRQ      : OFF
>>>>  RX depth        : 512
>>>>  CQ Moderation   : 100
>>>>  Mtu             : 2048[B]
>>>>  Link type       : IB
>>>>  Max inline data : 0[B]
>>>>  rdma_cm QPs     : OFF
>>>>  Data ex. method : Ethernet
>>>> ---------------------------------------------------------------------------------------
>>>>  local address: LID 0x0a QPN 0x0169 PSN 0xe0b768
>>>>  remote address: LID 0x20 QPN 0x28b280 PSN 0xc3008c
>>>> ---------------------------------------------------------------------------------------
>>>>  #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]
>>>>  65536      1000             0.00               2160.87           0.034574
>>>> ---------------------------------------------------------------------------------------
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>
