[Gluster-users] RDMA Problems with GlusterFS 3.1.1

Jeremy Stout stout.jeremy at gmail.com
Thu Dec 2 18:05:19 UTC 2010


As another follow-up, I tested several builds today with different
values for the send/receive counts. The maximum value I could use for
both variables was 127. With a value of 127, GlusterFS did not produce
any errors. However, when I changed the value back to 128, the RDMA
errors appeared again.

I also tried setting soft/hard "memlock" to unlimited in the
limits.conf file, but still ran into RDMA errors on the client side
when the count variables were set to 128.
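For reference, the limits.conf change I tried looked like the
following (standard pam_limits syntax; applying it to all users with
"*" is my choice here, you may want to scope it to the user running
glusterd):

```
# /etc/security/limits.conf
# Raise the max locked-in-memory address space (reported by ulimit -l),
# which RDMA memory registration counts against.
*    soft    memlock    unlimited
*    hard    memlock    unlimited
```

Note that the new limit only takes effect for sessions started after
the change (and the glusterd process must be restarted under the new
limit), at which point `ulimit -l` should report "unlimited" instead
of 64. In my case this still did not avoid the RDMA errors with the
count variables at 128.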

On Thu, Dec 2, 2010 at 9:04 AM, Jeremy Stout <stout.jeremy at gmail.com> wrote:
> Thank you for the response. I've been testing GlusterFS 3.1.1 on two
> different OpenSUSE 11.3 systems. Since both systems generated the same
> error messages, I'll include the output for both.
>
> System #1:
> fs-1:~ # cat /proc/meminfo
> MemTotal:       16468756 kB
> MemFree:        16126680 kB
> Buffers:           15680 kB
> Cached:           155860 kB
> SwapCached:            0 kB
> Active:            65228 kB
> Inactive:         123100 kB
> Active(anon):      18632 kB
> Inactive(anon):       48 kB
> Active(file):      46596 kB
> Inactive(file):   123052 kB
> Unevictable:        1988 kB
> Mlocked:            1988 kB
> SwapTotal:             0 kB
> SwapFree:              0 kB
> Dirty:             30072 kB
> Writeback:             4 kB
> AnonPages:         18780 kB
> Mapped:            12136 kB
> Shmem:               220 kB
> Slab:              39592 kB
> SReclaimable:      13108 kB
> SUnreclaim:        26484 kB
> KernelStack:        2360 kB
> PageTables:         2036 kB
> NFS_Unstable:          0 kB
> Bounce:                0 kB
> WritebackTmp:          0 kB
> CommitLimit:     8234376 kB
> Committed_AS:     107304 kB
> VmallocTotal:   34359738367 kB
> VmallocUsed:      314316 kB
> VmallocChunk:   34349860776 kB
> HardwareCorrupted:     0 kB
> HugePages_Total:       0
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
> DirectMap4k:        9856 kB
> DirectMap2M:     3135488 kB
> DirectMap1G:    13631488 kB
>
> fs-1:~ # uname -a
> Linux fs-1 2.6.32.25-November2010 #2 SMP PREEMPT Mon Nov 1 15:19:55
> EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
>
> fs-1:~ # ulimit -l
> 64
>
> System #2:
> submit-1:~ # cat /proc/meminfo
> MemTotal:       16470424 kB
> MemFree:        16197292 kB
> Buffers:           11788 kB
> Cached:            85492 kB
> SwapCached:            0 kB
> Active:            39120 kB
> Inactive:          76548 kB
> Active(anon):      18532 kB
> Inactive(anon):       48 kB
> Active(file):      20588 kB
> Inactive(file):    76500 kB
> Unevictable:           0 kB
> Mlocked:               0 kB
> SwapTotal:      67100656 kB
> SwapFree:       67100656 kB
> Dirty:                24 kB
> Writeback:             0 kB
> AnonPages:         18408 kB
> Mapped:            11644 kB
> Shmem:               184 kB
> Slab:              34000 kB
> SReclaimable:       8512 kB
> SUnreclaim:        25488 kB
> KernelStack:        2160 kB
> PageTables:         1952 kB
> NFS_Unstable:          0 kB
> Bounce:                0 kB
> WritebackTmp:          0 kB
> CommitLimit:    75335868 kB
> Committed_AS:     105620 kB
> VmallocTotal:   34359738367 kB
> VmallocUsed:       76416 kB
> VmallocChunk:   34359652640 kB
> HardwareCorrupted:     0 kB
> HugePages_Total:       0
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
> DirectMap4k:        7488 kB
> DirectMap2M:    16769024 kB
>
> submit-1:~ # uname -a
> Linux submit-1 2.6.33.7-November2010 #1 SMP PREEMPT Mon Nov 8 13:49:00
> EST 2010 x86_64 x86_64 x86_64 GNU/Linux
>
> submit-1:~ # ulimit -l
> 64
>
> I retrieved the memory information on each machine after starting the
> glusterd process.
>
> On Thu, Dec 2, 2010 at 3:51 AM, Raghavendra G <raghavendra at gluster.com> wrote:
>> Hi Jeremy,
>>
>> can you also get the output of,
>>
>> #uname -a
>>
>> #ulimit -l
>>
>> regards,
>> ----- Original Message -----
>> From: "Raghavendra G" <raghavendra at gluster.com>
>> To: "Jeremy Stout" <stout.jeremy at gmail.com>
>> Cc: gluster-users at gluster.org
>> Sent: Thursday, December 2, 2010 10:20:04 AM
>> Subject: Re: [Gluster-users] RDMA Problems with GlusterFS 3.1.1
>>
>> Hi Jeremy,
>>
>> In order to diagnose why completion queue creation is failing (as indicated by the logs), we want to know how much free memory was available on your system when glusterfs was started.
>>
>> regards,
>> ----- Original Message -----
>> From: "Raghavendra G" <raghavendra at gluster.com>
>> To: "Jeremy Stout" <stout.jeremy at gmail.com>
>> Cc: gluster-users at gluster.org
>> Sent: Thursday, December 2, 2010 10:11:18 AM
>> Subject: Re: [Gluster-users] RDMA Problems with GlusterFS 3.1.1
>>
>> Hi Jeremy,
>>
>> Yes, there might be some performance decrease. But, it should not affect the working of RDMA.
>>
>> regards,
>> ----- Original Message -----
>> From: "Jeremy Stout" <stout.jeremy at gmail.com>
>> To: gluster-users at gluster.org
>> Sent: Thursday, December 2, 2010 8:30:20 AM
>> Subject: Re: [Gluster-users] RDMA Problems with GlusterFS 3.1.1
>>
>> As an update to my situation, I think I have GlusterFS 3.1.1 working
>> now. I was able to create and mount RDMA volumes without any errors.
>>
>> To fix the problem, I had to make the following changes on lines 3562
>> and 3563 in rdma.c:
>> options->send_count = 32;
>> options->recv_count = 32;
>>
>> The values were originally set to 128.
>>
>> I'll run some tests tomorrow to verify that it is working correctly.
>> Assuming it does, what would be the expected side-effect of changing
>> the values from 128 to 32? Will there be a decrease in performance?
>>
>>
>> On Wed, Dec 1, 2010 at 10:07 AM, Jeremy Stout <stout.jeremy at gmail.com> wrote:
>>> Here are the results of the test:
>>> submit-1:/usr/local/glusterfs/3.1.1/var/log/glusterfs # ibv_srq_pingpong
>>>  local address:  LID 0x0002, QPN 0x000406, PSN 0x703b96, GID ::
>>>  local address:  LID 0x0002, QPN 0x000407, PSN 0x618cc8, GID ::
>>>  local address:  LID 0x0002, QPN 0x000408, PSN 0xd62272, GID ::
>>>  local address:  LID 0x0002, QPN 0x000409, PSN 0x5db5d9, GID ::
>>>  local address:  LID 0x0002, QPN 0x00040a, PSN 0xc51978, GID ::
>>>  local address:  LID 0x0002, QPN 0x00040b, PSN 0x05fd7a, GID ::
>>>  local address:  LID 0x0002, QPN 0x00040c, PSN 0xaa4a51, GID ::
>>>  local address:  LID 0x0002, QPN 0x00040d, PSN 0xb7a676, GID ::
>>>  local address:  LID 0x0002, QPN 0x00040e, PSN 0x56bde2, GID ::
>>>  local address:  LID 0x0002, QPN 0x00040f, PSN 0xa662bc, GID ::
>>>  local address:  LID 0x0002, QPN 0x000410, PSN 0xee27b0, GID ::
>>>  local address:  LID 0x0002, QPN 0x000411, PSN 0x89c683, GID ::
>>>  local address:  LID 0x0002, QPN 0x000412, PSN 0xd025b3, GID ::
>>>  local address:  LID 0x0002, QPN 0x000413, PSN 0xcec8e4, GID ::
>>>  local address:  LID 0x0002, QPN 0x000414, PSN 0x37e5d2, GID ::
>>>  local address:  LID 0x0002, QPN 0x000415, PSN 0x29562e, GID ::
>>>  remote address: LID 0x000b, QPN 0x000406, PSN 0x3b644e, GID ::
>>>  remote address: LID 0x000b, QPN 0x000407, PSN 0x173320, GID ::
>>>  remote address: LID 0x000b, QPN 0x000408, PSN 0xc105ea, GID ::
>>>  remote address: LID 0x000b, QPN 0x000409, PSN 0x5e5ff1, GID ::
>>>  remote address: LID 0x000b, QPN 0x00040a, PSN 0xff15b0, GID ::
>>>  remote address: LID 0x000b, QPN 0x00040b, PSN 0xf0b152, GID ::
>>>  remote address: LID 0x000b, QPN 0x00040c, PSN 0x4ced49, GID ::
>>>  remote address: LID 0x000b, QPN 0x00040d, PSN 0x01da0e, GID ::
>>>  remote address: LID 0x000b, QPN 0x00040e, PSN 0x69459a, GID ::
>>>  remote address: LID 0x000b, QPN 0x00040f, PSN 0x197c14, GID ::
>>>  remote address: LID 0x000b, QPN 0x000410, PSN 0xd50228, GID ::
>>>  remote address: LID 0x000b, QPN 0x000411, PSN 0xbc9b9b, GID ::
>>>  remote address: LID 0x000b, QPN 0x000412, PSN 0x0870eb, GID ::
>>>  remote address: LID 0x000b, QPN 0x000413, PSN 0xfb1fbc, GID ::
>>>  remote address: LID 0x000b, QPN 0x000414, PSN 0x3eefca, GID ::
>>>  remote address: LID 0x000b, QPN 0x000415, PSN 0xbd64c6, GID ::
>>> 8192000 bytes in 0.01 seconds = 5917.47 Mbit/sec
>>> 1000 iters in 0.01 seconds = 11.07 usec/iter
>>>
>>> fs-1:/usr/local/glusterfs/3.1.1/var/log/glusterfs # ibv_srq_pingpong submit-1
>>>  local address:  LID 0x000b, QPN 0x000406, PSN 0x3b644e, GID ::
>>>  local address:  LID 0x000b, QPN 0x000407, PSN 0x173320, GID ::
>>>  local address:  LID 0x000b, QPN 0x000408, PSN 0xc105ea, GID ::
>>>  local address:  LID 0x000b, QPN 0x000409, PSN 0x5e5ff1, GID ::
>>>  local address:  LID 0x000b, QPN 0x00040a, PSN 0xff15b0, GID ::
>>>  local address:  LID 0x000b, QPN 0x00040b, PSN 0xf0b152, GID ::
>>>  local address:  LID 0x000b, QPN 0x00040c, PSN 0x4ced49, GID ::
>>>  local address:  LID 0x000b, QPN 0x00040d, PSN 0x01da0e, GID ::
>>>  local address:  LID 0x000b, QPN 0x00040e, PSN 0x69459a, GID ::
>>>  local address:  LID 0x000b, QPN 0x00040f, PSN 0x197c14, GID ::
>>>  local address:  LID 0x000b, QPN 0x000410, PSN 0xd50228, GID ::
>>>  local address:  LID 0x000b, QPN 0x000411, PSN 0xbc9b9b, GID ::
>>>  local address:  LID 0x000b, QPN 0x000412, PSN 0x0870eb, GID ::
>>>  local address:  LID 0x000b, QPN 0x000413, PSN 0xfb1fbc, GID ::
>>>  local address:  LID 0x000b, QPN 0x000414, PSN 0x3eefca, GID ::
>>>  local address:  LID 0x000b, QPN 0x000415, PSN 0xbd64c6, GID ::
>>>  remote address: LID 0x0002, QPN 0x000406, PSN 0x703b96, GID ::
>>>  remote address: LID 0x0002, QPN 0x000407, PSN 0x618cc8, GID ::
>>>  remote address: LID 0x0002, QPN 0x000408, PSN 0xd62272, GID ::
>>>  remote address: LID 0x0002, QPN 0x000409, PSN 0x5db5d9, GID ::
>>>  remote address: LID 0x0002, QPN 0x00040a, PSN 0xc51978, GID ::
>>>  remote address: LID 0x0002, QPN 0x00040b, PSN 0x05fd7a, GID ::
>>>  remote address: LID 0x0002, QPN 0x00040c, PSN 0xaa4a51, GID ::
>>>  remote address: LID 0x0002, QPN 0x00040d, PSN 0xb7a676, GID ::
>>>  remote address: LID 0x0002, QPN 0x00040e, PSN 0x56bde2, GID ::
>>>  remote address: LID 0x0002, QPN 0x00040f, PSN 0xa662bc, GID ::
>>>  remote address: LID 0x0002, QPN 0x000410, PSN 0xee27b0, GID ::
>>>  remote address: LID 0x0002, QPN 0x000411, PSN 0x89c683, GID ::
>>>  remote address: LID 0x0002, QPN 0x000412, PSN 0xd025b3, GID ::
>>>  remote address: LID 0x0002, QPN 0x000413, PSN 0xcec8e4, GID ::
>>>  remote address: LID 0x0002, QPN 0x000414, PSN 0x37e5d2, GID ::
>>>  remote address: LID 0x0002, QPN 0x000415, PSN 0x29562e, GID ::
>>> 8192000 bytes in 0.01 seconds = 7423.65 Mbit/sec
>>> 1000 iters in 0.01 seconds = 8.83 usec/iter
>>>
>>> Based on the output, I believe it ran correctly.
>>>
>>> On Wed, Dec 1, 2010 at 9:51 AM, Anand Avati <anand.avati at gmail.com> wrote:
>>>> Can you verify that ibv_srq_pingpong works from the server where this log
>>>> file is from?
>>>>
>>>> Thanks,
>>>> Avati
>>>>
>>>> On Wed, Dec 1, 2010 at 7:44 PM, Jeremy Stout <stout.jeremy at gmail.com> wrote:
>>>>>
>>>>> Whenever I try to start or mount a GlusterFS 3.1.1 volume that uses
>>>>> RDMA, I'm seeing the following error messages in the log file on the
>>>>> server:
>>>>> [2010-11-30 18:37:53.51270] I [nfs.c:652:init] nfs: NFS service started
>>>>> [2010-11-30 18:37:53.51362] W [dict.c:1204:data_to_str] dict: @data=(nil)
>>>>> [2010-11-30 18:37:53.51375] W [dict.c:1204:data_to_str] dict: @data=(nil)
>>>>> [2010-11-30 18:37:53.59628] E [rdma.c:2066:rdma_create_cq]
>>>>> rpc-transport/rdma: testdir-client-0: creation of send_cq failed
>>>>> [2010-11-30 18:37:53.59851] E [rdma.c:3771:rdma_get_device]
>>>>> rpc-transport/rdma: testdir-client-0: could not create CQ
>>>>> [2010-11-30 18:37:53.59925] E [rdma.c:3957:rdma_init]
>>>>> rpc-transport/rdma: could not create rdma device for mthca0
>>>>> [2010-11-30 18:37:53.60009] E [rdma.c:4789:init] testdir-client-0:
>>>>> Failed to initialize IB Device
>>>>> [2010-11-30 18:37:53.60030] E [rpc-transport.c:971:rpc_transport_load]
>>>>> rpc-transport: 'rdma' initialization failed
>>>>>
>>>>> On the client, I see:
>>>>> [2010-11-30 18:43:49.653469] W [io-stats.c:1644:init] testdir:
>>>>> dangling volume. check volfile
>>>>> [2010-11-30 18:43:49.653573] W [dict.c:1204:data_to_str] dict: @data=(nil)
>>>>> [2010-11-30 18:43:49.653607] W [dict.c:1204:data_to_str] dict: @data=(nil)
>>>>> [2010-11-30 18:43:49.736275] E [rdma.c:2066:rdma_create_cq]
>>>>> rpc-transport/rdma: testdir-client-0: creation of send_cq failed
>>>>> [2010-11-30 18:43:49.736651] E [rdma.c:3771:rdma_get_device]
>>>>> rpc-transport/rdma: testdir-client-0: could not create CQ
>>>>> [2010-11-30 18:43:49.736689] E [rdma.c:3957:rdma_init]
>>>>> rpc-transport/rdma: could not create rdma device for mthca0
>>>>> [2010-11-30 18:43:49.736805] E [rdma.c:4789:init] testdir-client-0:
>>>>> Failed to initialize IB Device
>>>>> [2010-11-30 18:43:49.736841] E
>>>>> [rpc-transport.c:971:rpc_transport_load] rpc-transport: 'rdma'
>>>>> initialization failed
>>>>>
>>>>> This results in an unsuccessful mount.
>>>>>
>>>>> I created the mount using the following commands:
>>>>> /usr/local/glusterfs/3.1.1/sbin/gluster volume create testdir
>>>>> transport rdma submit-1:/exports
>>>>> /usr/local/glusterfs/3.1.1/sbin/gluster volume start testdir
>>>>>
>>>>> To mount the directory, I use:
>>>>> mount -t glusterfs submit-1:/testdir /mnt/glusterfs
>>>>>
>>>>> I don't think it is an Infiniband problem since GlusterFS 3.0.6 and
>>>>> GlusterFS 3.1.0 worked on the same systems. For GlusterFS 3.1.0, the
>>>>> commands listed above produced no error messages.
>>>>>
>>>>> If anyone can provide help with debugging these error messages, it
>>>>> would be appreciated.
>>>>> _______________________________________________
>>>>> Gluster-users mailing list
>>>>> Gluster-users at gluster.org
>>>>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>>>
>>>>
>>>
>>
>
