[Gluster-devel] zero-copy readv
Anand Avati
aavati at redhat.com
Thu Jan 10 06:50:09 UTC 2013
On 01/09/2013 10:37 PM, Amar Tumballi wrote:
>
>>
>> - On the read side things are a little more complicated. In
>> rpc-transport/socket, there is a call to iobuf_get() to create a new
>> iobuf for reading in the readv reply data from the server. We will need
>> framework changes where, if the readv request (of the xid for which the
>> readv reply is being handled) happened to be a "direct" variant (i.e.,
>> zero-copy), then the "special iobuf around the user's memory" gets picked
>> up and the read() from the socket is performed directly into the user's
>> memory.
>> Similar, but equivalent, changes will have to be done in RDMA
>> (Raghavendra on CC can help). Since the goal is to avoid memory copy,
>> this data will be bypassing io-cache (and purging pre-cached data of
>> those regions along the way).
>>
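To make the copy in question concrete, here is a rough POSIX-level sketch
(not the actual rpc-transport/socket code; the function and buffer names
are made up) contrasting the current read-into-iobuf-and-copy path with a
direct scatter read into the caller's memory:

#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Current path (simplified): the reply payload is read into a
 * transport-owned buffer (the iobuf obtained via iobuf_get()) and is
 * later copied out into the caller's memory. */
ssize_t
read_with_copy (int sockfd, char *iobuf_mem, size_t len, char *user_buf)
{
        ssize_t ret = read (sockfd, iobuf_mem, len);
        if (ret > 0)
                memcpy (user_buf, iobuf_mem, ret);  /* the copy to avoid */
        return ret;
}

/* Zero-copy path (sketch): if the xid belongs to a "direct" readv,
 * scatter the payload straight into the caller's vector instead of
 * allocating a new iobuf. */
ssize_t
read_zero_copy (int sockfd, struct iovec *user_iov, int iovcnt)
{
        return readv (sockfd, user_iov, iovcnt);
}

The framework change described above is essentially about getting the
transport onto the second path whenever the pending xid belongs to a
"direct" readv.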
>
> On the read side too, our client protocol is already designed to handle
> 0-copy, i.e., if the fop comes with an iobuf/iobref, then that same
> buffer is used to receive the data coming off the network
> (client_submit_request() is designed to handle this). [1]
>
> We made all these changes to make RDMA 0-copy a possibility, so even the
> RDMA transport should already be 0-copy friendly.
>
> That's my understanding.
>
> Regards,
> Amar
>
> [1] - recent patches to handle RPC read-ahead may involve a small data
> copy from the header to the data buffer, but the overhead is certainly
> not very high.
>
Amar - note that the infrastructure currently present for 0-copy RDMA
might not be sufficient for GFAPI's 0-copy. A glfs_readv() request from
the app can arrive as a vector of memory pointers (and not a contiguous
iobuf), and will therefore require storing an iovec/count as well. This
might also mean we need to exercise the scatter-gather aspects of the
verbs API.
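For illustration, the kind of call I mean looks roughly like this (a
sketch only; the volume, host and file names are placeholders and error
handling is minimal):

#include <fcntl.h>
#include <stdio.h>
#include <sys/uio.h>
#include <glusterfs/api/glfs.h>

int
main (void)
{
        char          buf1[4096], buf2[4096];
        struct iovec  iov[2];
        glfs_t       *fs = NULL;
        glfs_fd_t    *fd = NULL;
        ssize_t       ret = -1;

        fs = glfs_new ("testvol");                   /* placeholder volume */
        glfs_set_volfile_server (fs, "tcp", "server1", 24007);
        if (glfs_init (fs) != 0)
                return 1;

        fd = glfs_open (fs, "/somefile", O_RDONLY);  /* placeholder path */
        if (!fd)
                return 1;

        /* Two separate destination buffers: for zero-copy the transport
         * has to remember this iovec/count for the xid and scatter the
         * reply payload into it directly, instead of landing everything
         * in one contiguous iobuf and copying it out afterwards. */
        iov[0].iov_base = buf1;
        iov[0].iov_len  = sizeof (buf1);
        iov[1].iov_base = buf2;
        iov[1].iov_len  = sizeof (buf2);

        ret = glfs_readv (fd, iov, 2, 0);
        printf ("glfs_readv returned %zd\n", ret);

        glfs_close (fd);
        glfs_fini (fs);
        return (ret < 0) ? 1 : 0;
}

On the RDMA side, each element of that iovec would roughly map to one
ibv_sge entry in the work request's sg_list, which is the scatter-gather
aspect of the verbs API I am referring to.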
Avati