[Gluster-devel] libgfapi zero copy write - application in samba, nfs-ganesha

Ric Wheeler ricwheeler at gmail.com
Tue Sep 27 06:25:40 UTC 2016

On 09/27/2016 08:53 AM, Raghavendra Gowdappa wrote:
> ----- Original Message -----
>> From: "Ric Wheeler" <rwheeler at redhat.com>
>> To: "Raghavendra Gowdappa" <rgowdapp at redhat.com>, "Saravanakumar Arumugam" <sarumuga at redhat.com>
>> Cc: "Gluster Devel" <gluster-devel at gluster.org>, "Ben Turner" <bturner at redhat.com>, "Ben England"
>> <bengland at redhat.com>
>> Sent: Tuesday, September 27, 2016 10:51:48 AM
>> Subject: Re: [Gluster-devel] libgfapi zero copy write - application in samba, nfs-ganesha
>> On 09/27/2016 07:56 AM, Raghavendra Gowdappa wrote:
>>> +Manoj, +Ben turner, +Ben England.
>>> @Perf-team,
>>> Do you think the gains are significant enough that the smb and nfs-ganesha
>>> teams can start thinking about consuming this change?
>>> regards,
>>> Raghavendra
>> This is a large gain but I think that we might see even larger gains (a lot
>> depends on how we implement copy offload :)).
> Can you elaborate on what you mean by "copy offload"? If it is the way we avoid a copy in gfapi (from the application buffer), the following is the workflow:
> <commit>
> Work flow of zero copy write operation:
> --------------------------------------
> 1) Application requests a buffer of specific size. A new buffer is
> allocated from iobuf pool, and this buffer is passed on to application.
>     Achieved using "glfs_get_buffer"
> 2) Application writes into the received buffer, and passes that to
> libgfapi, and libgfapi in turn passes the same buffer to underlying
> translators. This avoids a memcpy in glfs write
>     Achieved using "glfs_zero_write"
> 3) Once the write operation is complete, the application must take
> responsibility for freeing the buffer.
>     Achieved using "glfs_free_buffer"
> </commit>
> Do you have any suggestions/improvements on this? I think Shyam mentioned an alternative approach (for zero-copy readv, I think); let me look that up too.
> regards,
> Raghavendra
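For reference, the three-step workflow quoted above might look like this from an application's point of view. This is a sketch only: the glfs_get_buffer, glfs_zero_write, and glfs_free_buffer signatures are assumptions based on the names quoted in this thread and may not match the patch under review; "testvol", "server1", and the data/data_len variables are placeholders.

```c
/* Sketch of the proposed zero-copy write sequence; the three zero-copy
 * calls are assumed from the names quoted above. Volume setup uses the
 * standard gfapi calls. */
glfs_t *fs = glfs_new("testvol");
glfs_set_volfile_server(fs, "tcp", "server1", 24007);
glfs_init(fs);
glfs_fd_t *fd = glfs_creat(fs, "/zero_copy_file", O_RDWR, 0644);

/* 1) Get a buffer straight from the iobuf pool. */
char *buf = glfs_get_buffer(fs, 128 * 1024);

/* 2) Fill it and hand the same buffer down the translator stack,
 *    avoiding the memcpy that a regular glfs_write would do. */
memcpy(buf, data, data_len);
glfs_zero_write(fd, buf, data_len);

/* 3) The application owns the buffer and must free it after the write. */
glfs_free_buffer(fs, buf);
```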

Both NFS and SMB support a copy offload that allows a client to produce a new 
copy of a file without bringing data over the wire. Both, if I remember 
correctly, do a ranged copy within a file.

The key here is that since the data does not move over the wire from server to 
client, we can shift the performance bottleneck to the storage server.

If we have a slow (1 Gbit/s) link between client and server, we should be able 
to do that copy as if it happened entirely on the server itself. For a single 
NFS server (not a clustered, scale-out server), that usually means we are as 
fast as a local file system copy.

Note that there are also servers that simply "reflink" the file, so the server 
needs only a very small amount of time to produce that copy. This can be a huge 
win for, say, a copy of a virtual machine guest image.

I suspect Gluster and other distributed servers won't benefit as much as a local 
server would, since the copy still has to move data internally over the network 
between storage server nodes.

Hope that makes my thoughts clearer?

Here is a link to a brief overview of the new Linux system call:


Note that block devices or pseudo devices can also implement a copy offload.



>> Worth looking at how we can make use of it.
>> thanks!
>> Ric
>>> ----- Original Message -----
>>>> From: "Saravanakumar Arumugam" <sarumuga at redhat.com>
>>>> To: "Gluster Devel" <gluster-devel at gluster.org>
>>>> Sent: Monday, September 26, 2016 7:18:26 PM
>>>> Subject: [Gluster-devel] libgfapi zero copy write - application in samba,
>>>> 	nfs-ganesha
>>>> Hi,
>>>> I have carried out "basic" performance measurement with zero copy write
>>>> APIs.
>>>> Throughput of zero copy write is 57 MB/sec vs default write 43 MB/sec.
>>>> ( I have modified Ben England's gfapi_perf_test.c for this. Attached the
>>>> same
>>>> for reference )
>>>> We would like to hear how Samba and NFS-Ganesha, as libgfapi users, can
>>>> make use of this.
>>>> Please provide your comments. Refer attached results.
>>>> Zero copy in write patch: http://review.gluster.org/#/c/14784/
>>>> Thanks,
>>>> Saravana
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
