[Gluster-devel] libgfapi zero copy write - application in samba, nfs-ganesha

Niels de Vos ndevos at redhat.com
Tue Sep 27 09:30:36 UTC 2016


On Tue, Sep 27, 2016 at 09:25:40AM +0300, Ric Wheeler wrote:
> On 09/27/2016 08:53 AM, Raghavendra Gowdappa wrote:
> > 
> > ----- Original Message -----
> > > From: "Ric Wheeler" <rwheeler at redhat.com>
> > > To: "Raghavendra Gowdappa" <rgowdapp at redhat.com>, "Saravanakumar Arumugam" <sarumuga at redhat.com>
> > > Cc: "Gluster Devel" <gluster-devel at gluster.org>, "Ben Turner" <bturner at redhat.com>, "Ben England"
> > > <bengland at redhat.com>
> > > Sent: Tuesday, September 27, 2016 10:51:48 AM
> > > Subject: Re: [Gluster-devel] libgfapi zero copy write - application in samba, nfs-ganesha
> > > 
> > > On 09/27/2016 07:56 AM, Raghavendra Gowdappa wrote:
> > > > +Manoj, +Ben turner, +Ben England.
> > > > 
> > > > @Perf-team,
> > > > 
> > > > Do you think the gains are significant enough that the smb and nfs-ganesha
> > > > teams can start thinking about consuming this change?
> > > > 
> > > > regards,
> > > > Raghavendra
> > > This is a large gain, but I think we might see even larger gains (a lot
> > > depends on how we implement copy offload :)).
> > Can you elaborate on what you mean by "copy offload"? If it is the way we avoid a copy in gfapi (from the application buffer), the following is the workflow:
> > 
> > <commit>
> > 
> > Work flow of zero copy write operation:
> > --------------------------------------
> > 
> > 1) The application requests a buffer of a specific size. A new buffer is
> > allocated from the iobuf pool, and this buffer is passed back to the
> > application.
> >     Achieved using "glfs_get_buffer"
> > 
> > 2) The application writes into the received buffer and passes it to
> > libgfapi, which in turn passes the same buffer to the underlying
> > translators. This avoids a memcpy in glfs_write.
> >     Achieved using "glfs_zero_write"
> > 
> > 3) Once the write operation is complete, the application takes the
> > responsibility of freeing the buffer.
> >     Achieved using "glfs_free_buffer"
> > 
> > </commit>
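> > 
> > To make the workflow concrete, here is a rough usage sketch (the
> > signatures of glfs_get_buffer, glfs_zero_write and glfs_free_buffer are
> > assumed from the description above; see the patch for the actual API):
> > 
> > #include <string.h>
> > #include <glusterfs/api/glfs.h>
> > 
> > /* zero-copy write of 'msg' through an open glfs_fd_t */
> > int write_zero_copy(glfs_fd_t *fd, const char *msg)
> > {
> >     size_t len = strlen(msg);
> > 
> >     /* 1) get a buffer from the iobuf pool (assumed signature) */
> >     char *buf = glfs_get_buffer(fd, len);
> >     if (!buf)
> >         return -1;
> > 
> >     /* 2) fill the buffer; gfapi passes it down without a memcpy */
> >     memcpy(buf, msg, len);
> >     ssize_t ret = glfs_zero_write(fd, buf, len); /* assumed signature */
> > 
> >     /* 3) the application frees the buffer once the write completes */
> >     glfs_free_buffer(fd, buf); /* assumed signature */
> > 
> >     return (ret == (ssize_t) len) ? 0 : -1;
> > }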
> > 
> > Do you have any suggestions/improvements on this? I think Shyam mentioned an alternative approach (for zero-copy readv, I think); let me look that up too.
> > 
> > regards,
> > Raghavendra
> 
> Both NFS and SMB support a copy offload that allows a client to produce a
> new copy of a file without bringing data over the wire. Both, if I remember
> correctly, do a ranged copy within a file.
> 
> The key here is that since the data does not move over the wire from server
> to client, we can shift the performance bottleneck to the storage server.
> 
> If we have a slow (1 Gb/s) link between client and server, we should be able
> to do that copy as if it happened just on the server itself. For a single NFS
> server (not a clustered, scale-out server), that usually means we are as
> fast as a local file system copy.
> 
> Note that there are also servers that simply "reflink" that file, so only a
> very small amount of time is needed on the server to produce that copy. This
> can be a huge win for, say, a copy of a virtual machine guest image.
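> 
> As an illustration, on local file systems that support it (e.g. Btrfs), a
> reflink can be requested with the FICLONE ioctl (Linux 4.5+); note this is
> the local-FS mechanism, not a Gluster API:
> 
> #include <sys/ioctl.h>
> #include <linux/fs.h>
> 
> /* make dst_fd share src_fd's blocks; O(1) in the file size */
> if (ioctl(dst_fd, FICLONE, src_fd) == -1)
>     perror("FICLONE"); /* e.g. EOPNOTSUPP if the FS cannot reflink */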
> 
> Gluster and other distributed servers won't benefit as much as a local
> server would, I suspect, because of the need to move the data internally
> over the network between storage server nodes.
> 
> Hope that makes my thoughts clearer?
> 
> Here is a link to a brief overview of the new Linux system call:
> 
> https://kernelnewbies.org/Linux_4.5#head-6df3d298d8e0afa8e85e1125cc54d5f13b9a0d8c
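> 
> For reference, a minimal sketch of driving copy_file_range(2) from
> userspace (glibc 2.27+ ships a wrapper; older systems have to go through
> syscall(2)):
> 
> #define _GNU_SOURCE
> #include <unistd.h>
> 
> /* copy 'len' bytes from fd_in to fd_out; the data never passes
>  * through the application's address space */
> static int do_copy(int fd_in, int fd_out, size_t len)
> {
>     loff_t off_in = 0, off_out = 0;
> 
>     while (len > 0) {
>         ssize_t n = copy_file_range(fd_in, &off_in,
>                                     fd_out, &off_out, len, 0);
>         if (n <= 0)
>             return -1; /* error handling elided */
>         len -= n;
>     }
>     return 0;
> }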
> 
> Note that block devices or pseudo devices can also implement a copy offload.

Last week I shared an idea about doing server-side-copy with a few
developers. I plan to send out more details to the devel list later this
week. Feedback by email, or in person at the Gluster Summit next week,
would be welcome.

A first iteration would hand the server-side-copy off to a normal gfapi
application running as a service inside the storage environment. This
service would just do a read+write over 'localhost' to whatever bricks
contain the destination file. Further optimizations, such as reflinking
and other techniques, should be possible to add later on.
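
To make that concrete, here is a minimal sketch of the read+write loop
such a service could run; the glfs_* calls are the existing gfapi API,
but the volume name, port and chunk size are placeholders:

#include <fcntl.h>
#include <glusterfs/api/glfs.h>

/* copy src to dst within one volume, connecting over 'localhost' */
int copy_within_volume(const char *volname, const char *src,
                       const char *dst)
{
    glfs_t *fs = glfs_new(volname);
    if (!fs)
        return -1;
    glfs_set_volfile_server(fs, "tcp", "localhost", 24007);
    if (glfs_init(fs) != 0)
        return -1;

    glfs_fd_t *in = glfs_open(fs, src, O_RDONLY);
    glfs_fd_t *out = glfs_creat(fs, dst, O_WRONLY | O_TRUNC, 0644);
    if (!in || !out)
        return -1;

    char buf[128 * 1024]; /* 128 KiB chunks, tunable */
    ssize_t n;
    while ((n = glfs_read(in, buf, sizeof(buf), 0)) > 0)
        if (glfs_write(out, buf, n, 0) != n)
            return -1; /* short write, error handling elided */

    glfs_close(in);
    glfs_close(out);
    glfs_fini(fs);
    return (n < 0) ? -1 : 0;
}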

This is a preview of what I currently have in an etherpad:
  https://public.pad.fsfe.org/p/gluster-server-side-copy?useMonospaceFont=true

Niels


> 
> Regards,
> 
> Ric
> 
> > 
> > > Worth looking at how we can make use of it.
> > > 
> > > thanks!
> > > 
> > > Ric
> > > 
> > > > ----- Original Message -----
> > > > > From: "Saravanakumar Arumugam" <sarumuga at redhat.com>
> > > > > To: "Gluster Devel" <gluster-devel at gluster.org>
> > > > > Sent: Monday, September 26, 2016 7:18:26 PM
> > > > > Subject: [Gluster-devel] libgfapi zero copy write - application in samba,
> > > > > 	nfs-ganesha
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > I have carried out a "basic" performance measurement with the zero copy
> > > > > write APIs.
> > > > > Throughput of the zero copy write is 57 MB/sec vs. 43 MB/sec for the
> > > > > default write, roughly a 33% improvement.
> > > > > (I have modified Ben England's gfapi_perf_test.c for this; it is attached
> > > > > for reference.)
> > > > > 
> > > > > We would like to hear how Samba and NFS-Ganesha, as libgfapi users, can
> > > > > make use of this.
> > > > > Please provide your comments. Refer to the attached results.
> > > > > 
> > > > > Zero copy in write patch: http://review.gluster.org/#/c/14784/
> > > > > 
> > > > > Thanks,
> > > > > Saravana
> > > 