[Gluster-devel] Suggestion needed to make use of iobuf_pool as rdma buffer.

Mohammed Rafi K C rkavunga at redhat.com
Wed Jan 14 07:48:17 UTC 2015


On 01/13/2015 08:18 PM, Ben England wrote:
> Rafi,
>
> it totally makes sense to me that you need to pre-allocate i/o buffers that will be used by RDMA, and you don't want to constantly change (i.e. allocate and deallocate) these buffers.  Since a remote RDMA controller can be reading and writing to them, we have to be very careful about deallocating in particular.  So an "arena" of pre-registered RDMA buffers makes perfect sense.
>
> Am I understanding you correctly that io-cache translator is soaking up all the RDMA-related buffers? 

Yes. If a page fault is generated for an inode, then in the callback of the read request we take a ref on that iobref and cache the iobufs in an rbtree.
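To make that concrete, the pattern is roughly the following. This is a simplified sketch, not the actual io-cache code; ioc_readv_cbk is reused as a familiar name and cache_insert is a hypothetical helper:

    /* read callback in a caching translator (simplified sketch) */
    int
    ioc_readv_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
                   int32_t op_ret, int32_t op_errno, struct iovec *vector,
                   int32_t count, struct iatt *stbuf, struct iobref *iobref,
                   dict_t *xdata)
    {
            if (op_ret >= 0 && iobref) {
                    /* take a ref so the iobufs (and the rdma-registered
                     * arena behind them) stay alive after the unwind */
                    iobref_ref (iobref);

                    /* stash the data in the per-inode cache; the ref is
                     * dropped only when the cached page is pruned */
                    cache_insert (this, frame->local, vector, count, iobref);
            }

            STACK_UNWIND_STRICT (readv, frame, op_ret, op_errno, vector,
                                 count, stbuf, iobref, xdata);
            return 0;
    }

As long as those cached refs are held, the underlying arena cannot go back to iobuf_pool, which is why io-cache ends up pinning the pre-registered rdma buffers.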


>   How important is io-cache translator to Gluster performance at this point?  Given that FUSE caching is now enabled, it seems to me that io-cache translator would accomplish very little.  Should we have it disabled by default? 

I'm not sure how much of a performance enhancement io-cache gives at this point.



> If so, would that solve your problem?

Yes. Theoretically there is still a chance of running out of the default iobufs, but that is tolerable as long as no one holds the iobufs indefinitely.
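(For context, "pre-registering" an arena means a single ibv_reg_mr() call over the whole arena when rdma initializes, so the per-I/O path never has to register again. A rough sketch only; rdma_register_arena is a hypothetical name, not the actual rdma transport code:)

    #include <infiniband/verbs.h>

    /* hypothetical helper: register a whole iobuf arena once, up front */
    static struct ibv_mr *
    rdma_register_arena (struct ibv_pd *pd, void *arena_mem, size_t arena_size)
    {
            /* one registration covers every iobuf carved out of this
             * arena, so individual reads/writes just reuse the mr */
            return ibv_reg_mr (pd, arena_mem, arena_size,
                               IBV_ACCESS_LOCAL_WRITE |
                               IBV_ACCESS_REMOTE_READ |
                               IBV_ACCESS_REMOTE_WRITE);
    }

Doing this only for the pre-allocated arenas at rdma init is what the 20-25% numbers in the original mail below refer to; dynamically grown arenas are the ones that would fall back to per-call registration.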


>
> So how do read-ahead translator and write-behind translator interact with RDMA buffering?

All of these translators take buffers from the same pool, but they do not hold the buffers for long.

For read-ahead, if the next request is not received or the read is not sequential, the read-ahead translator will drop the data, so those buffers are always available for rdma.
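In other words, when a read-ahead page is dropped the translator simply releases its ref and the iobufs go straight back to the pool, along these lines (an illustrative sketch, not the actual read-ahead code; struct ra_page here is hypothetical):

    /* hypothetical page-drop path in a read-ahead style translator */
    static void
    ra_page_drop (struct ra_page *page)
    {
            if (page->iobref) {
                    /* dropping the last ref returns the iobufs to
                     * iobuf_pool, so the rdma-registered memory is
                     * immediately reusable for the next request */
                    iobref_unref (page->iobref);
                    page->iobref = NULL;
            }
            GF_FREE (page);
    }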



Regards
Rafi KC
>
> -ben
>
> ----- Original Message -----
>> From: "Mohammed Rafi K C" <rkavunga at redhat.com>
>> To: gluster-devel at gluster.org
>> Sent: Tuesday, January 13, 2015 9:29:56 AM
>> Subject: [Gluster-devel] Suggestion needed to make use of iobuf_pool as rdma buffer.
>>
>> Hi All,
>>
>> When using the RDMA protocol, we need to register the buffer that is
>> going to be sent through rdma with the rdma device. This is a costly
>> operation, and a performance killer if it happens in the I/O path. So our
>> current plan is to register the pre-allocated iobuf_arenas from iobuf_pool
>> with rdma when rdma is initialized. The problem comes when all the
>> iobufs are exhausted and we need to dynamically allocate new arenas
>> from the libglusterfs module. Since they are created in libglusterfs, we
>> can't make a call to rdma from libglusterfs, so we are forced to
>> register each of the iobufs from the newly created arenas with rdma in
>> the I/O path. If io-cache is turned on in the client stack, then all the
>> pre-registered arenas will be used by io-cache as cache buffers, so we
>> have to do the registration in rdma for every iobuf on each I/O call;
>> eventually we cannot make use of the pre-registered arenas.
>>
>> To address the issue, we have two approaches in mind:
>>
>>  1) Register each dynamically created buffer in iobuf by bringing the
>> transport layer together with libglusterfs.
>>
>>  2) Create a separate buffer for caching and offload the data from the
>> read response to the cache buffer in the background.
>>
>> If we could make use of pre-registered memory for every rdma call, we
>> would get approximately a 20% improvement for writes and 25% for
>> reads.
>>
>> Please share your thoughts on how to address the issue.
>>
>> Thanks & Regards
>> Rafi KC
>> _______________________________________________
>> Gluster-devel mailing list
>> Gluster-devel at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>


