[Gluster-devel] Gluster 3.6.2 On Xeon Phi
Rudra Siva
rudrasiva11 at gmail.com
Sun Feb 8 14:22:31 UTC 2015
Thanks for trying and sending the changes - I finally got it all working
... it turned out to be a problem with my own changes (in
gf_rdma_post_unref - it goes back to the lack of SRQ on the interface).
You may be able to simulate the crash if you set volume parameters to
something like the following (it would be purely academic):
gluster volume set data_volume diagnostics.brick-log-level TRACE
gluster volume set data_volume diagnostics.client-log-level TRACE
I had those set because this effort began with communication problems
(queue size, lack of SRQ), so things have come a long way from there -
I will test for some more time and then make my small changes available.
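For anyone curious, the check below is roughly what confirmed the
missing SRQ support before I switched receives to the QP path. It is
only a standalone libibverbs sketch (the function and variable names
are mine, not anything from the Gluster tree); on scif0, max_srq
reports 0.

#include <stdio.h>
#include <infiniband/verbs.h>

/* Report whether a device can back receive buffers with a shared
 * receive queue (SRQ); if max_srq is 0, receives have to be posted
 * on each QP individually, which is the situation on scif0. */
static int
device_supports_srq (struct ibv_context *ctx)
{
        struct ibv_device_attr attr;

        if (ibv_query_device (ctx, &attr) != 0) {
                perror ("ibv_query_device");
                return 0;
        }

        printf ("max_srq = %d, max_srq_wr = %d\n",
                attr.max_srq, attr.max_srq_wr);

        return (attr.max_srq > 0);
}

int
main (void)
{
        int                  num  = 0;
        int                  i    = 0;
        struct ibv_device  **list = ibv_get_device_list (&num);
        struct ibv_context  *ctx  = NULL;

        if (!list)
                return 1;

        for (i = 0; i < num; i++) {
                ctx = ibv_open_device (list[i]);
                if (!ctx)
                        continue;
                printf ("%s: SRQ %ssupported\n",
                        ibv_get_device_name (list[i]),
                        device_supports_srq (ctx) ? "" : "not ");
                ibv_close_device (ctx);
        }

        ibv_free_device_list (list);
        return 0;
}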
The transfer speed of the default VE (Virtual Ethernet) that Intel
ships with the card is ~6 MB/sec - presently with Gluster I see around
80 MB/sec on the virtual IB (there is no real InfiniBand card) with a
stable Gluster mount. The interface benchmarks show it can deliver 5000
MB/sec, so there looks to be more room for improvement - a stable
Gluster mount is required first, though, before doing anything else.
Questions:
1. ctx is shared between posts - some parts of the code access it with
locks and some without - is this intentional or an oversight? (a small
sketch of what I mean follows below, after question 2)
2. iobuf_pool->default_page_size = 128 * GF_UNIT_KB - why is 128 KB
chosen and not something higher?
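To make question 1 concrete, the pattern below is what I would expect
around every update of post->ctx. It is only a simplified,
self-contained sketch - the sketch_* types and helpers are stand-ins I
made up, not the real gf_rdma_post from rdma.h - but it shows the guard
I have in mind for mr[]/mr_count, which gets updated from the
completion threads.

#include <pthread.h>
#include <string.h>

#define SKETCH_MAX_MR 16

/* Simplified stand-in for the post/ctx pair: only the fields relevant
 * to the question. */
struct sketch_ctx {
        void *mr[SKETCH_MAX_MR];
        int   mr_count;
};

struct sketch_post {
        pthread_mutex_t   lock;
        struct sketch_ctx ctx;
};

/* Reset the context exactly once per reuse of the post, under the
 * lock; a stale or racing mr_count is the kind of thing that shows up
 * as the -939493456 seen in the gdb dump. */
static void
sketch_post_ctx_reset (struct sketch_post *post)
{
        pthread_mutex_lock (&post->lock);
        {
                memset (&post->ctx, 0, sizeof (post->ctx));
        }
        pthread_mutex_unlock (&post->lock);
}

/* Record one more MR under the same lock so mr_count never races. */
static int
sketch_post_ctx_add_mr (struct sketch_post *post, void *mr)
{
        int ret = -1;

        pthread_mutex_lock (&post->lock);
        {
                if (post->ctx.mr_count < SKETCH_MAX_MR) {
                        post->ctx.mr[post->ctx.mr_count++] = mr;
                        ret = 0;
                }
        }
        pthread_mutex_unlock (&post->lock);

        return ret;
}

int
main (void)
{
        struct sketch_post post;
        int                dummy = 0;

        pthread_mutex_init (&post.lock, NULL);
        sketch_post_ctx_reset (&post);
        sketch_post_ctx_add_mr (&post, &dummy);
        pthread_mutex_destroy (&post.lock);
        return 0;
}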
-Siva
On Fri, Feb 6, 2015 at 6:12 AM, Mohammed Rafi K C <rkavunga at redhat.com> wrote:
>
> On 02/06/2015 05:31 AM, Rudra Siva wrote:
>> Rafi,
>>
>> Sorry it took me some time - I had to merge these with some of my
>> changes - the scif0 (iWARP) does not support SRQ (max_srq : 0) so have
>> changed some of the code to use QP instead - can provide those if
>> there is interest after this is stable.
>>
>> Here's the good -
>>
>> The performance with the patches is better than without (esp.
>> http://review.gluster.org/#/c/9327/).
> Good to hear. My thought was, http://review.gluster.org/#/c/9506/ will
> give a much better performance than the others :-) . A rebase is needed
> if it is applying on top the other patches.
>
>>
>> The bad - glusterfsd crashes for large files so it's difficult to get
>> some decent benchmark numbers
>
> Thanks for rising the bug. I tried to reproduce the problem on 3.6.2
> version+the four patches with a simple distributed volume. But I
> couldn't reproduce the same, and still trying. (we are using mellanox ib
> cards).
>
> If possible can you please share the volume info and workload used for
> large files.
>
>
>> - small ones look good - trying to
>> understand the patch at this time. Looks like this code comes from
>> 9327 as well.
>>
>> Can you please review the reset of mr_count?
>
> Yes, The problem could be the wrong value in mr_count. And I guess we
> failed to reset the value to zero, so that for some I/O mr_count will be
> incremented couple of times. So the variable might be got overflown. Can
> you apply the patch attached with mail, and try with this.
>
>>
>> Info from gdb is as follows - if you need more or something jumps out
>> please feel free to let me know.
>>
>> (gdb) p *post
>> $16 = {next = 0x7fffe003b280, prev = 0x7fffe0037cc0, mr =
>> 0x7fffe0037fb0, buf = 0x7fffe0096000 "\005\004", buf_size = 4096, aux
>> = 0 '\000',
>> reused = 1, device = 0x7fffe00019c0, type = GF_RDMA_RECV_POST, ctx =
>> {mr = {0x7fffe0003020, 0x7fffc8005f20, 0x7fffc8000aa0, 0x7fffc80030c0,
>> 0x7fffc8002d70, 0x7fffc8008bb0, 0x7fffc8008bf0, 0x7fffc8002cd0},
>> mr_count = -939493456, vector = {{iov_base = 0x7ffff7fd6000,
>> iov_len = 112}, {iov_base = 0x7fffbf140000, iov_len = 131072},
>> {iov_base = 0x0, iov_len = 0} <repeats 14 times>}, count = 2,
>> iobref = 0x7fffc8001670, hdr_iobuf = 0x61d710, is_request = 0
>> '\000', gf_rdma_reads = 1, reply_info = 0x0}, refcount = 1, lock = {
>> __data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0,
>> __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
>> __size = '\000' <repeats 39 times>, __align = 0}}
>>
>> (gdb) bt
>> #0 0x00007fffe7142681 in __gf_rdma_register_local_mr_for_rdma
>> (peer=0x7fffe0001800, vector=0x7fffe003b108, count=1,
>> ctx=0x7fffe003b0b0)
>> at rdma.c:2255
>> #1 0x00007fffe7145acd in gf_rdma_do_reads (peer=0x7fffe0001800,
>> post=0x7fffe003b070, readch=0x7fffe0096010) at rdma.c:3609
>> #2 0x00007fffe714656e in gf_rdma_recv_request (peer=0x7fffe0001800,
>> post=0x7fffe003b070, readch=0x7fffe0096010) at rdma.c:3859
>> #3 0x00007fffe714691d in gf_rdma_process_recv (peer=0x7fffe0001800,
>> wc=0x7fffceffcd20) at rdma.c:3967
>> #4 0x00007fffe7146e7d in gf_rdma_recv_completion_proc
>> (data=0x7fffe0002b30) at rdma.c:4114
>> #5 0x00007ffff72cfdf3 in start_thread () from /lib64/libpthread.so.0
>> #6 0x00007ffff6c403dd in clone () from /lib64/libc.so.6
>>
>> On Fri, Jan 30, 2015 at 7:11 AM, Mohammed Rafi K C <rkavunga at redhat.com> wrote:
>>> On 01/29/2015 06:13 PM, Rudra Siva wrote:
>>>> Hi,
>>>>
>>>> Have been able to get Gluster running on Intel's MIC platform. The
>>>> only code change to Gluster source was an unresolved yylex (I am not
>>>> really sure why that was coming up - may be someone more familiar with
>>>> it's use in Gluster can answer).
>>>>
>>>> At the step for compiling the binaries (glusterd, glusterfsd,
>>>> glusterfs, glfsheal) build breaks with an unresolved yylex error.
>>>>
>>>> For now have a routine yylex that simply calls graphyylex - I don't
>>>> know if this is even correct however mount functions.
>>>>
>>>> GCC - 4.7 (it's an oddity, latest GCC is missing the Phi patches)
>>>>
>>>> flex --version
>>>> flex 2.5.39
>>>>
>>>> bison --version
>>>> bison (GNU Bison) 3.0
>>>>
>>>> I'm still working on testing the RDMA and Infiniband support and can
>>>> make notes, numbers available when that is complete.
>>> There are couple of rdma performance related patches under review. If
>>> you could make use of those patches, I hope that will give a performance
>>> enhancement.
>>>
>>> [1] : http://review.gluster.org/#/c/9329/
>>> [2] : http://review.gluster.org/#/c/9321/
>>> [3] : http://review.gluster.org/#/c/9327/
>>> [4] : http://review.gluster.org/#/c/9506/
>>>
>>> Let me know if you need any clarification.
>>>
>>> Regards!
>>> Rafi KC
>>
>>
>
--
-Siva