[Gluster-devel] libgfapi zero copy write - application in samba, nfs-ganesha

Raghavendra G raghavendra at gluster.com
Thu Sep 29 05:41:59 UTC 2016

On Wed, Sep 28, 2016 at 7:37 PM, Shyam <srangana at redhat.com> wrote:

> On 09/27/2016 04:02 AM, Poornima Gurusiddaiah wrote:
>> W.r.t. Samba consuming this, it requires a great deal of code change in
>> Samba. Currently Samba has no concept of getting a buffer from the
>> underlying file system; the file system comes into the picture only at
>> the last layer (the gluster plugin), where system calls are replaced by
>> libgfapi calls. Hence, this is not readily consumable by Samba, and I
>> think the same will be the case with NFS-Ganesha; I will let the Ganesha
>> folks comment on that.
> This is exactly my reservation about the nature of the change [2] done in
> this patch. We expect all consumers to use *our* buffer management
> system, which may not be possible all the time.
> Of the consumers that I know of, other than what Sachin stated as an
> advantage for CommVault, none of the others (Ganesha, Samba, qemu) can
> use the gluster buffers at the moment. (I would like to understand how
> CommVault can use gluster buffers in this situation without copying data
> out into them, just for clarity.)

+Jeff cody, for comments on QEMU

> This is the reason I posted the comments at [1], stating that we should
> copy out the buffer when Gluster needs it preserved, but use
> application-provided buffers as long as we can.

My concerns here are:

* We are just moving the copy from the gfapi layer to write-behind. Though
I am not sure what percentage of writes that hit write-behind are actually
"written back", I would assume it to be significant (otherwise there would
be no benefit in having write-behind at all). Still, we can try this
approach and gather some performance data before we make a decision.

* Buffer management. All gluster code uses iobuf/iobref to manage buffers
of relatively large size. With the approach suggested above, I see two
concerns:
    a. When "writing back" a write, write-behind has to differentiate
between iobufs that need copying (write calls through the gfapi layer) and
iobufs that can simply be refed (writes from FUSE etc.). This adds
complexity.
    b. For the case where write-behind chooses not to "write back" the
write, we need a way of encapsulating the application buffer in an
iobuf/iobref. This might need changes in the iobuf infrastructure.

> I do see the advantages of zero-copy, but not when the gluster API is
> managing the buffers; it just makes the scheme more tedious for
> applications to use, IMHO.
> Could we think through (and rule out, if necessary) using the
> application-passed buffers as-is? One caveat here seems to be RDMA (we
> need the memory registered, if I am not wrong), as that would involve a
> copy into RDMA buffers when using application-passed buffers.

Actually, RDMA is not a problem in the current implementation (ruling out
suggestions by others to use pre-registered iobufs for managing io-cache
etc.). This is because, in the current implementation, the responsibility
of registering the memory region lies with transport/rdma. In other words,
transport/rdma does not expect pre-registered buffers.

What are the other pitfalls?
> [1] http://www.gluster.org/pipermail/gluster-devel/2016-August/050622.html
> [2] http://review.gluster.org/#/c/14784/
>> Regards,
>> Poornima
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel

Raghavendra G
