[Gluster-devel] Question about file copy through libgfapi

Niels de Vos ndevos at redhat.com
Fri Aug 22 13:27:39 UTC 2014


On Fri, Aug 22, 2014 at 01:26:02PM +0200, Giacomo Fazio wrote:
> Hi there,
> 
> Thanks to both Soumya and Prashanth. Actually you are both right. With the
> approach proposed by Soumya I would avoid the FUSE overhead but, as
> Prashanth says, the network transfer overhead would always be present. This
> is particularly important for me because I deal with very big files
> (usually around 100 GB and even more), so the network transfer has a big
> impact, while I don't think the impact of the FUSE overhead is that big.
> That's why what I would like to get is a "brick to brick" copy (just server
> side), so I would like to use the APIs to order the server to make a copy,
> so that the network transfer can be avoided.
> 
> As far as I understood, it is not currently possible with libgfapi. Do you
> think it would be difficult to implement? Are there any other ways?
> Thank you and best regards,

"Difficult" is always relative, it depends on many factors :) But 
I think implementing server-side copy is quite doable. You should start 
by thinking up and proposing a design. Some ideas that would work:

a.
    Have a server-side daemon (maybe glusterd) handle the copy. Some new 
    libgfapi or gluster-cli function can then connect to the daemon and 
    pass the instruction on (src brick + dst volume + src+dst filename).  
    This daemon can then connect to its instance on the server hosting 
    the source-brick, and initiate the copy.

b.
    Add a new file operation to the GlusterFS protocol, something like 
    copy-to-brick. This operation would receive the request from the 
    client (the client talks to the src-brick hosting the src-file as 
    usual), and the brick process needs to learn how to connect to 
    another brick (from a different volume) and create/write the file 
    there. The client application should be smart enough to pass the 
    path to the dst-brick that should contain the dst-file.

While writing this, I have convinced myself that (a) would surely be 
easier to do.  GlusterD could spawn a special copy process (like 
a libgfapi client) that connects to the source and destination volumes, 
does the copy, and exits.

This also makes it much easier to start contributing!

1.
    A relatively simple libgfapi binary that implements "cp" with 
    volume:/path/to/file as parameters should not be too difficult. Of 
    course, you may need to mkdir the (parent) structure on the 
    destination too, possibly adding a "-r" option for recursive 
    copying.

2.
    A second step could then integrate this cp/libgfapi implementation 
    in some gluster-cli/glusterd procedures.

3.
    Making it smarter, so that the copy is initiated from one of the 
    source bricks, can then be another step.
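
As a rough illustration of step (1), here is a minimal Python sketch of the
two pieces such a tool needs: parsing a volume:/path/to/file argument and a
chunked copy loop. The names parse_spec, GlusterPath and copy_stream are
hypothetical, and plain Python file objects stand in for libgfapi file
handles; in a real tool the reads and writes would go through glfs_read and
glfs_write. Note that this alone does not avoid the network transfer: the
data still passes through whichever host runs the tool.

```python
import io
from typing import BinaryIO, NamedTuple


class GlusterPath(NamedTuple):
    """Hypothetical holder for a parsed 'volume:/path/to/file' argument."""
    volume: str
    path: str


def parse_spec(spec: str) -> GlusterPath:
    """Split 'volume:/path/to/file' into its volume name and absolute path."""
    volume, sep, path = spec.partition(":")
    if not sep or not volume or not path.startswith("/"):
        raise ValueError(f"expected volume:/path/to/file, got {spec!r}")
    return GlusterPath(volume, path)


def copy_stream(src: BinaryIO, dst: BinaryIO, chunk_size: int = 1 << 20) -> int:
    """Copy src to dst in fixed-size chunks; return the number of bytes copied.

    src and dst are stand-ins for open libgfapi file handles (assumption).
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:              # an empty read means end of file
            break
        view = memoryview(chunk)
        while view:                # a write may be short, so loop until done
            written = dst.write(view)
            view = view[written:]
            total += written
    return total


if __name__ == "__main__":
    # In-memory buffers replace real volumes for this demonstration.
    source = parse_spec("vol1:/dir1/file1")
    target = parse_spec("vol2:/dir2/file2")
    data_in, data_out = io.BytesIO(b"example payload"), io.BytesIO()
    copied = copy_stream(data_in, data_out, chunk_size=4)
    print(f"{source.volume}:{source.path} -> {target.volume}:{target.path}: "
          f"{copied} bytes")
```

Reading in chunks keeps memory use constant, which matters for files of
100 GB and more; the inner loop is there because a single write is not
guaranteed to accept the whole buffer.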


For (1), it could be easier to extend an existing copy tool. 
Something like rsync already supports different protocols. Maybe it is 
possible to teach rsync which libgfapi functions to call, and how.  
rsync supports many useful options already; writing a new cp/libgfapi 
from scratch that matches even a subset of the features that rsync 
has would be a major project.

The above are just some ideas, thinking out loud... But, starting with 
integrating libgfapi in rsync or similar sounds like a major usability 
improvement for many Gluster users.

Niels


> 
> *Giacomo Fazio*
> IT Engineer
> 
> Tel. +41 91 910 7690
> E-mail: giacomo.fazio at wcpmediaservices.com  |  Web: www.wcpmediaservices.com
> 
> Europe Office: Via Zurigo 35, 6900 Lugano, Switzerland
> USA Office: 7083 Hollywood Boulevard Los Angeles, CA 90028
> 
> 
> On Fri, Aug 22, 2014 at 9:36 AM, Prashanth Pai <ppai at redhat.com> wrote:
> 
> > Hi,
> >
> > Even with that approach, data would still be read (over the n/w) at the
> > client (the app using libgfapi). I think what he is looking for is a server
> > side copy (brick to brick) or within same brick _without_ the need for data
> > to go through client.
> >
> > Swift has this feature[1] and it would be really cool for glusterfs to
> > have it (maybe as an external tool or as an API in libgfapi) :)
> >
> > # gluster-copy <src> <dest>
> > or
> > glfs_copy(src,dest)
> >
> > [1]
> > http://programmerthoughts.com/openstack/server-side-object-copy-in-openstack-storage/
> >
> >
> >
> > Regards,
> >  -Prashanth Pai
> >
> > ----- Original Message -----
> > From: "Soumya Koduri" <skoduri at redhat.com>
> > To: "Giacomo Fazio" <giacomo.fazio at wcpmediaservices.com>, "John Mark
> > Walker" <johnmark at gluster.org>
> > Cc: gluster-devel at gluster.org, "Giovanni Contri" <
> > giovanni.contri at wcpmediaservices.com>, forge-admin at gluster.org
> > Sent: Friday, August 22, 2014 12:40:01 PM
> > Subject: Re: [Gluster-devel] Question about file copy through libgfapi
> >
> > Hi Giacomo,
> >
> > If your requirement is to do away with fuse/protocol clients and do
> > server-side operations, I think it's doable by writing a simple libgfapi
> > application. But since there is no libgfapi API equivalent to the "cp"
> > command, you may need to implement that functionality using the "glfs_open,
> > glfs_read & glfs_write" APIs.
> >
> > Here are a few links where Humble has documented how to use
> > libgfapi and the different APIs it supports:
> >
> > http://humblec.com/libgfapi-interface-glusterfs/
> > https://github.com/gluster/glusterfs/blob/master/doc/features/libgfapi.md
> >
> >
> > A few examples (written in C and Python) have been copied to:
> > https://github.com/gluster/glusterfs/tree/master/api/examples
> >
> >
> > Thanks,
> > Soumya
> >
> >
> >
> > On 08/21/2014 08:45 PM, Giacomo Fazio wrote:
> > > Hi John,
> > >
> > > Thanks for your quick answer. Do you mean that my question can be
> > > summarized as "can we do server-only operations?"? Yes, I think so.
> > > Please let me know as soon as you receive any answer, or send me a
> > > link where I can follow this case directly.
> > > Thanks in advance and best regards,
> > >
> > > *Giacomo Fazio*
> > > IT Engineer
> > >
> > > Tel. +41 91 910 7690
> > > E-mail: giacomo.fazio at wcpmediaservices.com  |  Web: www.wcpmediaservices.com
> > >
> > > Europe Office: Via Zurigo 35, 6900 Lugano, Switzerland
> > > USA Office: 7083 Hollywood Boulevard Los Angeles, CA 90028
> > >
> > >
> > > On Thu, Aug 21, 2014 at 5:04 PM, John Mark Walker <johnmark at gluster.org> wrote:
> > >
> > >     Thanks, Giacomo. I'm sending this to the gluster-devel list - it's
> > >     an interesting question. Basically, can we do server-only operations?
> > >
> > >     -JM
> > >
> > >
> > >
> >  ------------------------------------------------------------------------
> > >
> > >         Hello,
> > >
> > >         I am currently using GlusterFS version 3.5 with two bricks. What
> > >         I currently do is mount the whole storage on some Linux
> > >         clients (RedHat) through fuse.glusterfs that (I think) uses NFS
> > >         in the background.
> > >         What I would like to do is copy a file from one directory to
> > >         another in the storage in the quickest way. Using a "cp
> > >         file1 file2" from my RedHat client is not the best option
> > >         because the data flows from the storage to my RedHat client
> > >         through the network and then back to the storage. I would like
> > >         instead to avoid this waste of time and copy the file directly
> > >         from the 1st directory to the 2nd one. So, in a nutshell, I
> > >         would like to have file1 -> file2, instead of file1 ->
> > >         RedHatclient -> file2.
> > >         Do you think it is possible, for example using libgfapi? Is there
> > >         any example you could show me?
> > >         Thank you in advance and best regards,
> > >
> > >         *Giacomo Fazio*
> > >         IT Engineer
> > >
> > >         Tel. +41 91 910 7690
> > >         E-mail: giacomo.fazio at wcpmediaservices.com  |  Web: www.wcpmediaservices.com
> > >
> > >         Europe Office: Via Zurigo 35, 6900 Lugano, Switzerland
> > >         USA Office: 7083 Hollywood Boulevard Los Angeles, CA 90028
> > >
> > >
> > >
> > >
> > >
> > > _______________________________________________
> > > Gluster-devel mailing list
> > > Gluster-devel at gluster.org
> > > http://supercolony.gluster.org/mailman/listinfo/gluster-devel
> > >


