[Gluster-devel] gfapi zero copy write enhancement

Shyam srangana at redhat.com
Wed Aug 24 19:28:19 UTC 2016


Hi,

I was attempting to review this change [1], and for a long time I have 
wanted to understand why we need it and how we should go about 
achieving it. As there is a lack of understanding on my part, I am 
starting with some questions.

1) In the writev FOP, what is the role/use of the iobref parameter?
- I do not see the posix xlator using this
- The payload is carried in vector, rather than in the iobref
- Across the code, other than protocol/client that (sort of) serializes 
this, I do not see it having any use

So what am I missing?

2) Coming to the current change, what prevents us from doing this as [2]?
- in short just pass in the buffer received as a part of iovec

[2] is not necessarily clean, just a hack that assumes there is always 
an iovcount of 1, and I have only tested it with a write call, not a 
writev call. (Just stating this before we start a code review of the 
same ;) )

3) From the discussions in [3] and [4] I understand that this is to 
eliminate a copy when working with RDMA. Further, Avati's response to 
the thought discusses why we need to leave the memory management of 
read/write buffers to the applications rather than use/reuse gluster 
buffers.

So, in the long term, if this is for RDMA, what has changed in the 
justification, such that we are now asking applications to use gluster 
buffers rather than doing it the other way?

4) Why should applications not reuse buffers, and instead ask for 
fresh/new buffers for each write call?

5) What is the performance gain noticed with this approach? The thread 
that (re)started this is [5]. In Commvault Simpana,
- What were the perf gains due to this approach?
- How does the application use the write buffers?
   - Meaning, is the buffer that is received from Gluster used to 
populate data from the network? I am curious as to how this application 
uses these buffers, and where the data gets copied into these buffers 
from.

Eliminating a copy in glfs_write seems trivial (if my hack and my 
answers to the first question match my assumptions), so I am wondering 
what we are attempting here, or what I am missing.

Thanks,
Shyam

[1] http://review.gluster.org/#/c/14784/
[2] https://paste.fedoraproject.org/413514/14720651/
[3] https://www.mail-archive.com/gluster-devel@gluster.org/msg03347.html
[4] http://www.gluster.org/pipermail/gluster-devel/2015-February/043966.html
[5] http://www.gluster.org/pipermail/gluster-devel/2016-June/049858.html

