[Gluster-devel] gfapi zero copy write enhancement
Shyam
srangana at redhat.com
Wed Aug 24 19:28:19 UTC 2016
Hi,
I was attempting to review this change [1], and for a long time I have
wanted to understand why we need it and how we should go about achieving
it. As my understanding is incomplete, I am starting with some
questions.
1) In the writev FOP, what is the role/use of the iobref parameter?
- I do not see the posix xlator using this
- The payload is carried in vector, rather than in the iobref
- Across the code, other than protocol/client, which (sort of) serializes
this, I do not see it having any use
So what am I missing?
2) Coming to the current change, what prevents us from doing this as [2]?
- in short just pass in the buffer received as a part of iovec
[2] is not necessarily clean, just a hack that assumes an iovcount of 1
always. I tested it only with a write call, not a writev call (just
stating this before we start a code review of it ;) ).
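The idea in 2) can be illustrated outside gluster. The following is only a minimal sketch, assuming plain POSIX writev(2) on a pipe rather than the gfapi code path; it is not the content of [2], and write_via_iovec and demo_zero_copy_write are hypothetical names. It shows a caller-owned buffer being handed down as a single-element iovec (iovcount of 1) without copying the payload into an intermediate user-space buffer.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (not gfapi): point one iovec at the caller's
 * buffer and pass it on. The payload is borrowed, not copied. */
static ssize_t write_via_iovec(int fd, const void *buf, size_t count)
{
    struct iovec iov;

    iov.iov_base = (void *)buf; /* borrow the caller's memory */
    iov.iov_len = count;

    return writev(fd, &iov, 1); /* iovcount is always 1 here */
}

/* Round-trips a small message through a pipe.
 * Returns 0 if the bytes read back match what was written, -1 otherwise. */
static int demo_zero_copy_write(void)
{
    static const char msg[] = "zero-copy payload";
    char readback[sizeof(msg)];
    int fds[2];
    ssize_t written, got;

    if (pipe(fds) != 0)
        return -1;

    written = write_via_iovec(fds[1], msg, sizeof(msg));
    got = read(fds[0], readback, sizeof(readback));

    close(fds[0]);
    close(fds[1]);

    if (written != (ssize_t)sizeof(msg) || got != written)
        return -1;

    return memcmp(msg, readback, sizeof(msg)) == 0 ? 0 : -1;
}
```

In this synchronous sketch the buffer only has to stay valid until writev returns; whether the same holds for a FOP wound through the gluster stack is exactly what question 1) is asking.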
3) From the discussions in [3] and [4] I understand that this is meant to
eliminate a copy when working with RDMA. Further, Avati's response to
that idea discusses why we should leave the memory management of
read/write buffers to the applications rather than use/reuse gluster
buffers.
So, in the long term, if this is for RDMA, what has changed in the
justification for asking applications to use gluster buffers, rather
than doing it the other way around?
4) Why should applications not reuse buffers, and instead ask for a
fresh/new buffer for each write call?
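For comparison, the buffer-reuse pattern that 4) refers to looks roughly like this. Again this is a minimal sketch using POSIX write(2) on a pipe, not gfapi; the helper names and the chunk sizes are assumptions made for illustration. The application owns a single buffer, refills it, and writes from it repeatedly.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

enum { CHUNK_SIZE = 8, CHUNK_COUNT = 4 };

/* Hypothetical helper: refill one application-owned buffer and write it
 * CHUNK_COUNT times. Returns the total bytes written, or -1 on error. */
static long write_chunks_reusing_buffer(int fd)
{
    char buf[CHUNK_SIZE]; /* allocated once, reused for every write */
    long total = 0;
    int i;

    for (i = 0; i < CHUNK_COUNT; i++) {
        ssize_t n;

        memset(buf, 'a' + i, sizeof(buf)); /* new payload, same memory */
        n = write(fd, buf, sizeof(buf));
        if (n != (ssize_t)sizeof(buf))
            return -1;
        total += n;
    }

    return total;
}

/* Writes the chunks into a pipe and checks what comes out.
 * Returns 0 on success, -1 otherwise. */
static int demo_buffer_reuse(void)
{
    char out[CHUNK_SIZE * CHUNK_COUNT];
    int fds[2];
    long total;
    ssize_t got;

    if (pipe(fds) != 0)
        return -1;

    total = write_chunks_reusing_buffer(fds[1]);
    got = read(fds[0], out, sizeof(out));

    close(fds[0]);
    close(fds[1]);

    if (total != (long)sizeof(out) || got != (ssize_t)total)
        return -1;

    /* First chunk was filled with 'a', last chunk with 'd'. */
    return (out[0] == 'a' && out[sizeof(out) - 1] == 'd') ? 0 : -1;
}
```

Reuse is safe here because each write has finished with the buffer by the time it returns.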
5) What is the performance gain noticed with this approach? The
thread that (re)started this is [5]. In Commvault Simpana,
- What were the perf gains due to this approach?
- How does the application use the write buffers?
- Meaning, is the buffer received from Gluster used to populate data
from the network? I am curious how this application uses these buffers,
and where the data copied into them comes from.
Eliminating a copy in glfs_write seems trivial (if my hack and my
assumptions behind the first question hold), so I am wondering what we
are attempting here, or what I am missing.
Thanks,
Shyam
[1] http://review.gluster.org/#/c/14784/
[2] https://paste.fedoraproject.org/413514/14720651/
[3] https://www.mail-archive.com/gluster-devel@gluster.org/msg03347.html
[4] http://www.gluster.org/pipermail/gluster-devel/2015-February/043966.html
[5] http://www.gluster.org/pipermail/gluster-devel/2016-June/049858.html