[Gluster-devel] Suggestion needed to make use of iobuf_pool as rdma buffer.

Anand Avati avati at gluster.org
Wed Jan 14 11:18:06 UTC 2015


On Tue Jan 13 2015 at 11:57:53 PM Mohammed Rafi K C <rkavunga at redhat.com>
wrote:

>
> On 01/14/2015 12:11 AM, Anand Avati wrote:
>
> 3) Why not have a separate iobuf pool for RDMA?
>
>
> Since all fops use the default iobuf_pool, if we go with another
> iobuf_pool dedicated to rdma, we would need to copy each buffer from the
> default pool to the rdma pool, unless we allocate the buffers
> intelligently based on the transport we are going to use. That is an
> extra copy in the I/O path.
>

Not sure what you mean by that. Not every fop uses the default iobuf_pool;
only readv() and writev() do. If you really want to save on memory
registration cost, your first target should be the header buffers (which are
used in every fop, and are currently valloc()ed and ibv_reg_mr()ed per call).
Making headers use an iobuf pool in which every arena is registered at
arena creation and deregistered at arena destruction will get you the
highest overhead savings.
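
A rough sketch of that header-pool idea follows. It is an illustration
under stated assumptions, not GlusterFS code: the names hdr_pool,
hdr_arena, hdr_get and the reg/dereg hooks are all hypothetical, and the
hooks only count calls where a real transport would wrap ibv_reg_mr() and
ibv_dereg_mr(). It uses malloc() and a bump allocator for brevity, so
individual buffers are never returned to the arena.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical hooks installed by the transport, so the pool code never
 * calls into rdma directly. Real hooks would wrap ibv_reg_mr() and
 * ibv_dereg_mr(); these stand-ins only count invocations. */
typedef void *(*arena_reg_fn)(void *base, size_t len);
typedef void (*arena_dereg_fn)(void *handle);

struct hdr_arena {
    char *base;              /* one contiguous allocation for all buffers */
    size_t next_free;        /* index of the next unused buffer */
    void *reg_handle;        /* whatever the register hook returned */
    struct hdr_arena *next;
};

struct hdr_pool {
    size_t buf_size;
    size_t bufs_per_arena;
    arena_reg_fn reg;
    arena_dereg_fn dereg;
    struct hdr_arena *arenas; /* newest arena first */
};

static int reg_calls, dereg_calls;

static void *fake_reg(void *base, size_t len)
{
    (void)len;
    reg_calls++;
    return base;
}

static void fake_dereg(void *handle)
{
    (void)handle;
    dereg_calls++;
}

/* Allocate one arena and register the whole region exactly once. */
static struct hdr_arena *arena_new(struct hdr_pool *p)
{
    size_t len = p->buf_size * p->bufs_per_arena;
    struct hdr_arena *a = calloc(1, sizeof(*a));
    if (!a)
        return NULL;
    a->base = malloc(len);
    if (!a->base) {
        free(a);
        return NULL;
    }
    a->reg_handle = p->reg(a->base, len);
    a->next = p->arenas;
    p->arenas = a;
    return a;
}

/* Hand out a header buffer; registration cost is paid per arena only. */
static void *hdr_get(struct hdr_pool *p)
{
    struct hdr_arena *a = p->arenas;
    if (!a || a->next_free == p->bufs_per_arena) {
        a = arena_new(p);
        if (!a)
            return NULL;
    }
    return a->base + (a->next_free++) * p->buf_size;
}

/* Deregister and free every arena. */
static void pool_destroy(struct hdr_pool *p)
{
    while (p->arenas) {
        struct hdr_arena *a = p->arenas;
        p->arenas = a->next;
        p->dereg(a->reg_handle);
        free(a->base);
        free(a);
    }
}

/* Take 10 headers from arenas of 4; return the registrations performed. */
static int demo_header_pool(void)
{
    struct hdr_pool pool = { 128, 4, fake_reg, fake_dereg, NULL };
    int regs;

    reg_calls = dereg_calls = 0;
    for (int i = 0; i < 10; i++)
        if (!hdr_get(&pool))
            return -1;
    regs = reg_calls;
    pool_destroy(&pool);
    assert(dereg_calls == regs); /* every registration was undone */
    return regs;
}
```

In this sketch ten header requests cost three registrations (one per
arena) instead of ten, and because the pool only sees function pointers,
the code that grows the pool does not need to know about the transport.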

Coming to file data iobufs, today iobuf pools are used in a "mixed" way,
i.e., they hold both data being actively transferred/under IO and data
that is held long term (cached by io-cache). io-cache just does an
iobuf_ref() and holds on to the data. This avoids memory copies in the
io-cache layer. However, that may be something we want to reconsider:
io-cache could use its own iobuf pool, into which data is copied from the
transfer iobuf (which is pre-registered with RDMA in bulk, etc.)
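
To make the trade-off concrete, here is a small simulation of the two
caching strategies. It is a sketch under assumptions, not the io-cache
implementation: xfer_pool, cache_by_ref, cache_by_copy and simulate_reads
are hypothetical names, the pool is a fixed array of two buffers, and
running out of buffers stands in for falling back to arenas that were
not pre-registered.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define XFER_BUFS 2   /* size of the pre-registered transfer pool */
#define XFER_SIZE 64
#define MAX_READS 8

/* Stand-in for an iobuf taken from the pre-registered pool. */
struct xfer_buf {
    char data[XFER_SIZE];
    size_t len;
    int refs;
};

struct xfer_pool {
    struct xfer_buf bufs[XFER_BUFS];
};

/* Return a buffer nobody holds a reference to, or NULL if none is free. */
static struct xfer_buf *xfer_get(struct xfer_pool *p)
{
    for (int i = 0; i < XFER_BUFS; i++) {
        if (p->bufs[i].refs == 0) {
            p->bufs[i].refs = 1;
            return &p->bufs[i];
        }
    }
    return NULL;
}

static void xfer_ref(struct xfer_buf *b)
{
    b->refs++;
}

static void xfer_unref(struct xfer_buf *b)
{
    assert(b->refs > 0);
    b->refs--;
}

struct cache_entry {
    char *data;
    size_t len;
    struct xfer_buf *pinned; /* non-NULL when the cache holds a reference */
};

/* Zero-copy caching: the cache pins the transfer buffer. */
static int cache_by_ref(struct cache_entry *e, struct xfer_buf *b)
{
    xfer_ref(b);
    e->pinned = b;
    e->data = b->data;
    e->len = b->len;
    return 0;
}

/* Copying cache: one memcpy, after which the transfer buffer is free. */
static int cache_by_copy(struct cache_entry *e, struct xfer_buf *b)
{
    e->data = malloc(b->len);
    if (!e->data)
        return -1;
    memcpy(e->data, b->data, b->len);
    e->len = b->len;
    e->pinned = NULL;
    return 0;
}

/* Simulate read replies that are all cached. Returns how many reads
 * were served from the pre-registered pool before it ran dry. */
static int simulate_reads(int reads, int copy_into_cache)
{
    struct xfer_pool pool;
    struct cache_entry cache[MAX_READS];
    int served = 0;

    memset(&pool, 0, sizeof(pool));
    for (int i = 0; i < reads && i < MAX_READS; i++) {
        struct xfer_buf *b = xfer_get(&pool);
        if (!b)
            break;                    /* pool exhausted */
        memset(b->data, 'a' + i, XFER_SIZE);
        b->len = XFER_SIZE;
        if (copy_into_cache) {
            if (cache_by_copy(&cache[served], b) != 0)
                break;
        } else {
            cache_by_ref(&cache[served], b);
        }
        xfer_unref(b);                /* the I/O path is done with it */
        served++;
    }
    for (int j = 0; j < served; j++)
        if (!cache[j].pinned)
            free(cache[j].data);
    return served;
}
```

With a two-buffer pool and four cached reads, the by-reference cache pins
both buffers after two reads, while the copying cache keeps recycling
the same registered buffer at the price of one memcpy per read.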

Thanks




>
>
> On Tue Jan 13 2015 at 6:30:09 AM Mohammed Rafi K C <rkavunga at redhat.com>
> wrote:
>
>> Hi All,
>>
>> When using the RDMA protocol, we need to register each buffer that is
>> going to be sent through rdma with the rdma device. This is a costly
>> operation, and a performance killer if it happens in the I/O path. So our
>> current plan is to register the pre-allocated iobuf_arenas from iobuf_pool
>> with rdma when rdma is initialized. The problem comes when all the
>> iobufs are exhausted; we then need to dynamically allocate new arenas
>> from the libglusterfs module. Since they are created in libglusterfs, we
>> can't make a call to rdma from libglusterfs. So we are forced to
>> register each of the iobufs from the newly created arenas with rdma in
>> the I/O path. If io-cache is turned on in the client stack, then all the
>> pre-registered arenas will be used by io-cache as cache buffers, so we
>> have to do the registration in rdma on each I/O call for every iobuf,
>> and eventually we cannot make use of the pre-registered arenas.
>>
>> To address the issue, we have two approaches in mind:
>>
>>  1) Register each dynamically created buffer in iobuf by bringing the
>> transport layer together with libglusterfs.
>>
>>  2) Create a separate buffer for caching and offload the data from the
>> read response to the cache buffer in the background.
>>
>> If we could make use of pre-registered memory for every rdma call, we
>> would see an improvement of approximately 20% for writes and 25% for
>> reads.
>>
>> Please share your thoughts on how to address the issue.
>>
>> Thanks & Regards
>> Rafi KC
>>
>
>