[Gluster-devel] zero-copy readv

Anand Avati aavati at redhat.com
Thu Jan 10 06:50:09 UTC 2013

On 01/09/2013 10:37 PM, Amar Tumballi wrote:
>> - On the read side things are a little more complicated. In
>> rpc-transport/socket, there is a call to iobuf_get() to create a new
>> iobuf for reading in the readv reply data from the server. We will need
>> a framework changes where, if the readv request (of the xid for which
>> readv reply is being handled) happened to be a "direct" variant (i.e,
>> zero-copy), then the "special iobuf around user's memory" gets picked up
>> and read() from socket is performed directly into user's memory.
>> Similar, equivalent changes will have to be done in RDMA
>> (Raghavendra on CC can help). Since the goal is to avoid memory copy,
>> this data will be bypassing io-cache (and purging pre-cached data of
>> those regions along the way).
> On the read side too, our client protocol is designed to handle 0-copy
> already, i.e., if the fop comes with an iobuf/iobref, then that same buffer
> is used to receive the data from the network
> (client_submit_request() is designed to handle this). [1]
> We made all these changes to make RDMA 0-copy a possibility, so even
> RDMA transport should be already 0-copy friendly.
> That's my understanding.
> Regards,
> Amar
> [1] - recent patches to handle RPC read-ahead may involve a small data
> copy from the header to the data buffer, but the cost is surely not very high.

Amar - note that the current infrastructure present for 0-copy RDMA 
might not be sufficient for GFAPI's 0-copy. A glfs_readv() request from 
the app can come as a vector of memory pointers (and not a contiguous 
iobuf) and therefore require storing an iovec/count as well. This might 
also mean we need to exercise the scatter-gather aspects of the verbs API.
