[Gluster-users] RDMA Problems with GlusterFS 3.1.1
Craig Carl
craig at gluster.com
Fri Dec 3 01:02:59 UTC 2010
Jeremy -
What version of OFED are you running? Would you mind installing version
1.5.2 from source? We have seen this resolve several issues of this type.
http://www.openfabrics.org/downloads/OFED/ofed-1.5.2/
Thanks,
Craig
--
Craig Carl
Senior Systems Engineer
Gluster
On 12/02/2010 10:05 AM, Jeremy Stout wrote:
> As another follow-up, I tested several builds today with
> different values for the send/receive counts. The maximum value I
> could use for both variables was 127. With a value of 127, GlusterFS
> did not produce any errors. However, when I changed the value back to
> 128, the RDMA errors appeared again.
>
> I also tried setting soft/hard "memlock" to unlimited in the
> limits.conf file, but still ran into RDMA errors on the client side
> when the count variables were set to 128.
>
> On Thu, Dec 2, 2010 at 9:04 AM, Jeremy Stout<stout.jeremy at gmail.com> wrote:
>> Thank you for the response. I've been testing GlusterFS 3.1.1 on two
>> different OpenSUSE 11.3 systems. Since both systems generated the same
>> error messages, I'll include the output for both.
>>
>> System #1:
>> fs-1:~ # cat /proc/meminfo
>> MemTotal: 16468756 kB
>> MemFree: 16126680 kB
>> Buffers: 15680 kB
>> Cached: 155860 kB
>> SwapCached: 0 kB
>> Active: 65228 kB
>> Inactive: 123100 kB
>> Active(anon): 18632 kB
>> Inactive(anon): 48 kB
>> Active(file): 46596 kB
>> Inactive(file): 123052 kB
>> Unevictable: 1988 kB
>> Mlocked: 1988 kB
>> SwapTotal: 0 kB
>> SwapFree: 0 kB
>> Dirty: 30072 kB
>> Writeback: 4 kB
>> AnonPages: 18780 kB
>> Mapped: 12136 kB
>> Shmem: 220 kB
>> Slab: 39592 kB
>> SReclaimable: 13108 kB
>> SUnreclaim: 26484 kB
>> KernelStack: 2360 kB
>> PageTables: 2036 kB
>> NFS_Unstable: 0 kB
>> Bounce: 0 kB
>> WritebackTmp: 0 kB
>> CommitLimit: 8234376 kB
>> Committed_AS: 107304 kB
>> VmallocTotal: 34359738367 kB
>> VmallocUsed: 314316 kB
>> VmallocChunk: 34349860776 kB
>> HardwareCorrupted: 0 kB
>> HugePages_Total: 0
>> HugePages_Free: 0
>> HugePages_Rsvd: 0
>> HugePages_Surp: 0
>> Hugepagesize: 2048 kB
>> DirectMap4k: 9856 kB
>> DirectMap2M: 3135488 kB
>> DirectMap1G: 13631488 kB
>>
>> fs-1:~ # uname -a
>> Linux fs-1 2.6.32.25-November2010 #2 SMP PREEMPT Mon Nov 1 15:19:55
>> EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
>>
>> fs-1:~ # ulimit -l
>> 64
>>
>> System #2:
>> submit-1:~ # cat /proc/meminfo
>> MemTotal: 16470424 kB
>> MemFree: 16197292 kB
>> Buffers: 11788 kB
>> Cached: 85492 kB
>> SwapCached: 0 kB
>> Active: 39120 kB
>> Inactive: 76548 kB
>> Active(anon): 18532 kB
>> Inactive(anon): 48 kB
>> Active(file): 20588 kB
>> Inactive(file): 76500 kB
>> Unevictable: 0 kB
>> Mlocked: 0 kB
>> SwapTotal: 67100656 kB
>> SwapFree: 67100656 kB
>> Dirty: 24 kB
>> Writeback: 0 kB
>> AnonPages: 18408 kB
>> Mapped: 11644 kB
>> Shmem: 184 kB
>> Slab: 34000 kB
>> SReclaimable: 8512 kB
>> SUnreclaim: 25488 kB
>> KernelStack: 2160 kB
>> PageTables: 1952 kB
>> NFS_Unstable: 0 kB
>> Bounce: 0 kB
>> WritebackTmp: 0 kB
>> CommitLimit: 75335868 kB
>> Committed_AS: 105620 kB
>> VmallocTotal: 34359738367 kB
>> VmallocUsed: 76416 kB
>> VmallocChunk: 34359652640 kB
>> HardwareCorrupted: 0 kB
>> HugePages_Total: 0
>> HugePages_Free: 0
>> HugePages_Rsvd: 0
>> HugePages_Surp: 0
>> Hugepagesize: 2048 kB
>> DirectMap4k: 7488 kB
>> DirectMap2M: 16769024 kB
>>
>> submit-1:~ # uname -a
>> Linux submit-1 2.6.33.7-November2010 #1 SMP PREEMPT Mon Nov 8 13:49:00
>> EST 2010 x86_64 x86_64 x86_64 GNU/Linux
>>
>> submit-1:~ # ulimit -l
>> 64
>>
>> I retrieved the memory information on each machine after starting the
>> glusterd process.
>>
>> On Thu, Dec 2, 2010 at 3:51 AM, Raghavendra G<raghavendra at gluster.com> wrote:
>>> Hi Jeremy,
>>>
>>> Can you also get the output of the following commands?
>>>
>>> #uname -a
>>>
>>> #ulimit -l
>>>
>>> regards,
>>> ----- Original Message -----
>>> From: "Raghavendra G"<raghavendra at gluster.com>
>>> To: "Jeremy Stout"<stout.jeremy at gmail.com>
>>> Cc: gluster-users at gluster.org
>>> Sent: Thursday, December 2, 2010 10:20:04 AM
>>> Subject: Re: [Gluster-users] RDMA Problems with GlusterFS 3.1.1
>>>
>>> Hi Jeremy,
>>>
>>> In order to diagnose why completion queue creation is failing (as indicated by the logs), we would like to know how much free memory was available on your system when glusterfs was started.
>>>
>>> regards,
>>> ----- Original Message -----
>>> From: "Raghavendra G"<raghavendra at gluster.com>
>>> To: "Jeremy Stout"<stout.jeremy at gmail.com>
>>> Cc: gluster-users at gluster.org
>>> Sent: Thursday, December 2, 2010 10:11:18 AM
>>> Subject: Re: [Gluster-users] RDMA Problems with GlusterFS 3.1.1
>>>
>>> Hi Jeremy,
>>>
>>> Yes, there might be some performance decrease, but it should not affect the working of RDMA.
>>>
>>> regards,
>>> ----- Original Message -----
>>> From: "Jeremy Stout"<stout.jeremy at gmail.com>
>>> To: gluster-users at gluster.org
>>> Sent: Thursday, December 2, 2010 8:30:20 AM
>>> Subject: Re: [Gluster-users] RDMA Problems with GlusterFS 3.1.1
>>>
>>> As an update to my situation, I think I have GlusterFS 3.1.1 working
>>> now. I was able to create and mount RDMA volumes without any errors.
>>>
>>> To fix the problem, I had to make the following changes on lines 3562
>>> and 3563 in rdma.c:
>>> options->send_count = 32;
>>> options->recv_count = 32;
>>>
>>> Previously, both values were set to 128.
>>>
>>> I'll run some tests tomorrow to verify that it is working correctly.
>>> Assuming it does, what would be the expected side-effect of changing
>>> the values from 128 to 32? Will there be a decrease in performance?
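The send/receive counts feed into the size of the completion queues glusterfs requests from the HCA, so one plausible reading of the "creation of send_cq failed" errors is that a count of 128 pushes the requested CQ size past what the mthca device supports. A hedged way to check, assuming the libibverbs utilities are installed, is to compare the device's advertised limits against the CQ size implied by the counts:

```shell
# Hedged sketch: print the HCA's advertised completion-queue limits.
# "creation of send_cq failed" is consistent with requesting a CQ larger
# than max_cqe (or exhausting max_cq). Requires the libibverbs utilities;
# falls back to a message on hosts without them.
ibv_devinfo -v 2>/dev/null | grep -iE 'max_cqe|max_cq[^e]' \
  || echo "ibv_devinfo not available on this host"
```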
>>>
>>>
>>> On Wed, Dec 1, 2010 at 10:07 AM, Jeremy Stout<stout.jeremy at gmail.com> wrote:
>>>> Here are the results of the test:
>>>> submit-1:/usr/local/glusterfs/3.1.1/var/log/glusterfs # ibv_srq_pingpong
>>>> local address: LID 0x0002, QPN 0x000406, PSN 0x703b96, GID ::
>>>> local address: LID 0x0002, QPN 0x000407, PSN 0x618cc8, GID ::
>>>> local address: LID 0x0002, QPN 0x000408, PSN 0xd62272, GID ::
>>>> local address: LID 0x0002, QPN 0x000409, PSN 0x5db5d9, GID ::
>>>> local address: LID 0x0002, QPN 0x00040a, PSN 0xc51978, GID ::
>>>> local address: LID 0x0002, QPN 0x00040b, PSN 0x05fd7a, GID ::
>>>> local address: LID 0x0002, QPN 0x00040c, PSN 0xaa4a51, GID ::
>>>> local address: LID 0x0002, QPN 0x00040d, PSN 0xb7a676, GID ::
>>>> local address: LID 0x0002, QPN 0x00040e, PSN 0x56bde2, GID ::
>>>> local address: LID 0x0002, QPN 0x00040f, PSN 0xa662bc, GID ::
>>>> local address: LID 0x0002, QPN 0x000410, PSN 0xee27b0, GID ::
>>>> local address: LID 0x0002, QPN 0x000411, PSN 0x89c683, GID ::
>>>> local address: LID 0x0002, QPN 0x000412, PSN 0xd025b3, GID ::
>>>> local address: LID 0x0002, QPN 0x000413, PSN 0xcec8e4, GID ::
>>>> local address: LID 0x0002, QPN 0x000414, PSN 0x37e5d2, GID ::
>>>> local address: LID 0x0002, QPN 0x000415, PSN 0x29562e, GID ::
>>>> remote address: LID 0x000b, QPN 0x000406, PSN 0x3b644e, GID ::
>>>> remote address: LID 0x000b, QPN 0x000407, PSN 0x173320, GID ::
>>>> remote address: LID 0x000b, QPN 0x000408, PSN 0xc105ea, GID ::
>>>> remote address: LID 0x000b, QPN 0x000409, PSN 0x5e5ff1, GID ::
>>>> remote address: LID 0x000b, QPN 0x00040a, PSN 0xff15b0, GID ::
>>>> remote address: LID 0x000b, QPN 0x00040b, PSN 0xf0b152, GID ::
>>>> remote address: LID 0x000b, QPN 0x00040c, PSN 0x4ced49, GID ::
>>>> remote address: LID 0x000b, QPN 0x00040d, PSN 0x01da0e, GID ::
>>>> remote address: LID 0x000b, QPN 0x00040e, PSN 0x69459a, GID ::
>>>> remote address: LID 0x000b, QPN 0x00040f, PSN 0x197c14, GID ::
>>>> remote address: LID 0x000b, QPN 0x000410, PSN 0xd50228, GID ::
>>>> remote address: LID 0x000b, QPN 0x000411, PSN 0xbc9b9b, GID ::
>>>> remote address: LID 0x000b, QPN 0x000412, PSN 0x0870eb, GID ::
>>>> remote address: LID 0x000b, QPN 0x000413, PSN 0xfb1fbc, GID ::
>>>> remote address: LID 0x000b, QPN 0x000414, PSN 0x3eefca, GID ::
>>>> remote address: LID 0x000b, QPN 0x000415, PSN 0xbd64c6, GID ::
>>>> 8192000 bytes in 0.01 seconds = 5917.47 Mbit/sec
>>>> 1000 iters in 0.01 seconds = 11.07 usec/iter
>>>>
>>>> fs-1:/usr/local/glusterfs/3.1.1/var/log/glusterfs # ibv_srq_pingpong submit-1
>>>> local address: LID 0x000b, QPN 0x000406, PSN 0x3b644e, GID ::
>>>> local address: LID 0x000b, QPN 0x000407, PSN 0x173320, GID ::
>>>> local address: LID 0x000b, QPN 0x000408, PSN 0xc105ea, GID ::
>>>> local address: LID 0x000b, QPN 0x000409, PSN 0x5e5ff1, GID ::
>>>> local address: LID 0x000b, QPN 0x00040a, PSN 0xff15b0, GID ::
>>>> local address: LID 0x000b, QPN 0x00040b, PSN 0xf0b152, GID ::
>>>> local address: LID 0x000b, QPN 0x00040c, PSN 0x4ced49, GID ::
>>>> local address: LID 0x000b, QPN 0x00040d, PSN 0x01da0e, GID ::
>>>> local address: LID 0x000b, QPN 0x00040e, PSN 0x69459a, GID ::
>>>> local address: LID 0x000b, QPN 0x00040f, PSN 0x197c14, GID ::
>>>> local address: LID 0x000b, QPN 0x000410, PSN 0xd50228, GID ::
>>>> local address: LID 0x000b, QPN 0x000411, PSN 0xbc9b9b, GID ::
>>>> local address: LID 0x000b, QPN 0x000412, PSN 0x0870eb, GID ::
>>>> local address: LID 0x000b, QPN 0x000413, PSN 0xfb1fbc, GID ::
>>>> local address: LID 0x000b, QPN 0x000414, PSN 0x3eefca, GID ::
>>>> local address: LID 0x000b, QPN 0x000415, PSN 0xbd64c6, GID ::
>>>> remote address: LID 0x0002, QPN 0x000406, PSN 0x703b96, GID ::
>>>> remote address: LID 0x0002, QPN 0x000407, PSN 0x618cc8, GID ::
>>>> remote address: LID 0x0002, QPN 0x000408, PSN 0xd62272, GID ::
>>>> remote address: LID 0x0002, QPN 0x000409, PSN 0x5db5d9, GID ::
>>>> remote address: LID 0x0002, QPN 0x00040a, PSN 0xc51978, GID ::
>>>> remote address: LID 0x0002, QPN 0x00040b, PSN 0x05fd7a, GID ::
>>>> remote address: LID 0x0002, QPN 0x00040c, PSN 0xaa4a51, GID ::
>>>> remote address: LID 0x0002, QPN 0x00040d, PSN 0xb7a676, GID ::
>>>> remote address: LID 0x0002, QPN 0x00040e, PSN 0x56bde2, GID ::
>>>> remote address: LID 0x0002, QPN 0x00040f, PSN 0xa662bc, GID ::
>>>> remote address: LID 0x0002, QPN 0x000410, PSN 0xee27b0, GID ::
>>>> remote address: LID 0x0002, QPN 0x000411, PSN 0x89c683, GID ::
>>>> remote address: LID 0x0002, QPN 0x000412, PSN 0xd025b3, GID ::
>>>> remote address: LID 0x0002, QPN 0x000413, PSN 0xcec8e4, GID ::
>>>> remote address: LID 0x0002, QPN 0x000414, PSN 0x37e5d2, GID ::
>>>> remote address: LID 0x0002, QPN 0x000415, PSN 0x29562e, GID ::
>>>> 8192000 bytes in 0.01 seconds = 7423.65 Mbit/sec
>>>> 1000 iters in 0.01 seconds = 8.83 usec/iter
>>>>
>>>> Based on the output, I believe it ran correctly.
>>>>
>>>> On Wed, Dec 1, 2010 at 9:51 AM, Anand Avati<anand.avati at gmail.com> wrote:
>>>>> Can you verify that ibv_srq_pingpong works from the server where this log
>>>>> file is from?
>>>>>
>>>>> Thanks,
>>>>> Avati
>>>>>
>>>>> On Wed, Dec 1, 2010 at 7:44 PM, Jeremy Stout<stout.jeremy at gmail.com> wrote:
>>>>>> Whenever I try to start or mount a GlusterFS 3.1.1 volume that uses
>>>>>> RDMA, I'm seeing the following error messages in the log file on the
>>>>>> server:
>>>>>> [2010-11-30 18:37:53.51270] I [nfs.c:652:init] nfs: NFS service started
>>>>>> [2010-11-30 18:37:53.51362] W [dict.c:1204:data_to_str] dict: @data=(nil)
>>>>>> [2010-11-30 18:37:53.51375] W [dict.c:1204:data_to_str] dict: @data=(nil)
>>>>>> [2010-11-30 18:37:53.59628] E [rdma.c:2066:rdma_create_cq]
>>>>>> rpc-transport/rdma: testdir-client-0: creation of send_cq failed
>>>>>> [2010-11-30 18:37:53.59851] E [rdma.c:3771:rdma_get_device]
>>>>>> rpc-transport/rdma: testdir-client-0: could not create CQ
>>>>>> [2010-11-30 18:37:53.59925] E [rdma.c:3957:rdma_init]
>>>>>> rpc-transport/rdma: could not create rdma device for mthca0
>>>>>> [2010-11-30 18:37:53.60009] E [rdma.c:4789:init] testdir-client-0:
>>>>>> Failed to initialize IB Device
>>>>>> [2010-11-30 18:37:53.60030] E [rpc-transport.c:971:rpc_transport_load]
>>>>>> rpc-transport: 'rdma' initialization failed
>>>>>>
>>>>>> On the client, I see:
>>>>>> [2010-11-30 18:43:49.653469] W [io-stats.c:1644:init] testdir:
>>>>>> dangling volume. check volfile
>>>>>> [2010-11-30 18:43:49.653573] W [dict.c:1204:data_to_str] dict: @data=(nil)
>>>>>> [2010-11-30 18:43:49.653607] W [dict.c:1204:data_to_str] dict: @data=(nil)
>>>>>> [2010-11-30 18:43:49.736275] E [rdma.c:2066:rdma_create_cq]
>>>>>> rpc-transport/rdma: testdir-client-0: creation of send_cq failed
>>>>>> [2010-11-30 18:43:49.736651] E [rdma.c:3771:rdma_get_device]
>>>>>> rpc-transport/rdma: testdir-client-0: could not create CQ
>>>>>> [2010-11-30 18:43:49.736689] E [rdma.c:3957:rdma_init]
>>>>>> rpc-transport/rdma: could not create rdma device for mthca0
>>>>>> [2010-11-30 18:43:49.736805] E [rdma.c:4789:init] testdir-client-0:
>>>>>> Failed to initialize IB Device
>>>>>> [2010-11-30 18:43:49.736841] E
>>>>>> [rpc-transport.c:971:rpc_transport_load] rpc-transport: 'rdma'
>>>>>> initialization failed
>>>>>>
>>>>>> This results in an unsuccessful mount.
>>>>>>
>>>>>> I created the mount using the following commands:
>>>>>> /usr/local/glusterfs/3.1.1/sbin/gluster volume create testdir
>>>>>> transport rdma submit-1:/exports
>>>>>> /usr/local/glusterfs/3.1.1/sbin/gluster volume start testdir
>>>>>>
>>>>>> To mount the directory, I use:
>>>>>> mount -t glusterfs submit-1:/testdir /mnt/glusterfs
>>>>>>
>>>>>> I don't think it is an Infiniband problem since GlusterFS 3.0.6 and
>>>>>> GlusterFS 3.1.0 worked on the same systems. For GlusterFS 3.1.0, the
>>>>>> commands listed above produced no error messages.
>>>>>>
>>>>>> If anyone can provide help with debugging these error messages, it
>>>>>> would be appreciated.
>>>>>> _______________________________________________
>>>>>> Gluster-users mailing list
>>>>>> Gluster-users at gluster.org
>>>>>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>>>>
>>>