[Gluster-devel] Question about file copy through libgfapi

Brad Hubbard bhubbard at redhat.com
Sat Aug 23 01:24:26 UTC 2014


On 08/22/2014 11:27 PM, Niels de Vos wrote:
> On Fri, Aug 22, 2014 at 01:26:02PM +0200, Giacomo Fazio wrote:
>> Hi there,
>>
>> Thanks to both Soumya and Prashanth. Actually you are both right. With the
>> approach proposed by Soumya I would avoid the FUSE overhead but, as
>> Prashanth says, the network transfer overhead would always be present. This
>> is particularly important for me because I deal with very big files
>> (usually around 100 GB and even more), so the network transfer has a big
>> impact, while I don't think the impact of the FUSE overhead is that big.
>> That's why what I would like to get is a "brick to brick" copy (just server
>> side), so I would like to use the APIs to order the server to make a copy,
>> so that the network transfer can be avoided.
>>
>> As far as I understood, it is not currently possible with libgfapi. Do you
>> think it would be difficult to implement? Are there any other ways?
>> Thank you and best regards,
>
> "Difficult" is always relative, it depends on many factors :) But
> I think implementing server-side copy is quite doable. You should start
> by thinking about and proposing a design. Some ideas that would work:
>
> a.
>      Have a server-side daemon (maybe glusterd) handle the copy. Some new
>      libgfapi or gluster-cli function can then connect to the daemon and
>      pass the instruction on (src brick + dst volume + src+dst filename).
>      This daemon can then connect to its instance on the server hosting
>      the source-brick, and initiate the copy.
>
> b.
>      Add a new file operation to the GlusterFS protocol, something like
>      copy-to-brick. This operation would receive the request from the
>      client (the client talks to the src-brick hosting the src-file as
>      usual), and the brick process needs to learn how to connect to
>      another brick (from a different volume) and create/write the file
>      there. The client application should be smart enough to pass the
>      path to the dst-brick that should contain the dst-file.

Might there not be a 'c' option: implement this as a translator that
intercepts a standard call such as getxattr (what getfattr issues) and
treats a special attribute name as the trigger, as has been done in the
past (getfattr -n "trusted.glusterfs.pathinfo")? This is a bit of a
"hack", but it is probably easier to implement and use with gluster as
it is now.
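
For reference, this is roughly how the existing pathinfo trigger looks
from a client. A minimal Python sketch, assuming a FUSE mount and root
privileges (trusted.* attributes are not readable otherwise). The regular
expression encodes the value format as I understand it, parse_pathinfo
and read_pathinfo are names I made up, and the sample string is
illustrative rather than captured from a real volume.

```python
import os
import re

# Assumed shape of one POSIX entry inside a pathinfo value:
#   <POSIX(/brick/root):hostname:/brick/root/path/to/file>
_POSIX_ENTRY = re.compile(r"<POSIX\(([^)]*)\):([^:>]+):([^>]+)>")


def parse_pathinfo(value):
    """Return a list of (host, brick_root, backend_path) tuples."""
    return [
        (host, brick_root, backend_path)
        for brick_root, host, backend_path in _POSIX_ENTRY.findall(value)
    ]


def read_pathinfo(path):
    """Read the virtual xattr for a file on a FUSE mount (Linux only)."""
    raw = os.getxattr(path, "trusted.glusterfs.pathinfo")
    return parse_pathinfo(raw.decode())


# Illustrative value only, not taken from a real volume.
sample = ("(<DISTRIBUTE:myvol-dht> "
          "<POSIX(/bricks/b1):server1:/bricks/b1/dir/file1>)")
print(parse_pathinfo(sample))
```

A copy trigger would work the same way from the client's point of view:
a translator would recognise some new attribute name and act on it,
instead of passing the call down to the brick's filesystem.
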

>
> While writing this, I have convinced myself that (a) would surely be
> easier to do.  GlusterD could spawn a special copy process (like
> a libgfapi client) that connects to the source and destination volumes,
> does the copy, and exits.
>
> This also makes it much easier to start contributing!
>
> 1.
>      A relatively simple libgfapi binary that implements "cp" with
>      volume:/path/to/file as parameters should not be too difficult. Of
>      course, you may need to mkdir the (parent) structure on the
>      destination too, possibly adding a "-r" option for recursive
>      copying.
>
> 2.
>      A second step could then integrate this cp/libgfapi implementation
>      in some gluster-cli/glusterd procedures.
>
> 3.
>      Making it smarter, so that it initiates the copy from one of the
>      source bricks, can then be another step.
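
The argument handling for step 1 could be sketched like this. It is only
a sketch in Python: parse_spec and parent_dirs are hypothetical helper
names, and the actual mkdir and copy would still have to be done through
libgfapi, which is not shown here.

```python
import posixpath


def parse_spec(spec):
    """Split 'volume:/path/to/file' into (volume, normalised path)."""
    volume, sep, path = spec.partition(":")
    if not sep or not volume or not path.startswith("/"):
        raise ValueError("expected volume:/path/to/file, got %r" % spec)
    # Collapse '.', '..' and repeated slashes; keep one leading slash.
    return volume, "/" + posixpath.normpath(path).lstrip("/")


def parent_dirs(path):
    """Directories that must exist before 'path', shallowest first."""
    parents = []
    current = posixpath.dirname(path)
    while current.strip("/"):  # stop once only the root is left
        parents.append(current)
        current = posixpath.dirname(current)
    return parents[::-1]


volume, path = parse_spec("dstvol:/projects/2014/master.mov")
print(volume, path, parent_dirs(path))
```

Returning the parents shallowest first means the tool can create them in
order on the destination volume before opening the destination file.
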
>
>
> For (1), it could be easier to extend some available copy-tool.
> Something like rsync already supports different protocols. Maybe it is
> possible to teach rsync how and what functions to call from libgfapi.
> rsync supports many useful options already; writing a new cp/libgfapi
> from scratch that matches only a subset of the features that rsync
> has would be a major project.
>
> The above are just some ideas, thinking out loud... But, starting with
> integrating libgfapi in rsync or similar sounds like a major usability
> improvement for many Gluster users.
>
> Niels
>
>
>>
>> *Giacomo Fazio*
>> IT Engineer
>>
>> Tel. +41 91 910 7690
>> E-mail: giacomo.fazio at wcpmediaservices.com  |  Web: www.wcpmediaservices.com
>>
>> Europe Office: Via Zurigo 35, 6900 Lugano, Switzerland
>> USA Office: 7083 Hollywood Boulevard Los Angeles, CA 90028
>>
>>
>> On Fri, Aug 22, 2014 at 9:36 AM, Prashanth Pai <ppai at redhat.com> wrote:
>>
>>> Hi,
>>>
>>> Even with that approach, data would still be read (over the n/w) at the
>>> client (the app using libgfapi). I think what he is looking for is a server
>>> side copy (brick to brick) or within the same brick _without_ the need for data
>>> to go through client.
>>>
>>> Swift has this feature[1] and it would be really cool for glusterfs to
>>> have it (maybe as an external tool or as an API in libgfapi) :)
>>>
>>> # gluster-copy <src> <dest>
>>> or
>>> glfs_copy(src,dest)
>>>
>>> [1]
>>> http://programmerthoughts.com/openstack/server-side-object-copy-in-openstack-storage/
>>>
>>>
>>>
>>> Regards,
>>>   -Prashanth Pai
>>>
>>> ----- Original Message -----
>>> From: "Soumya Koduri" <skoduri at redhat.com>
>>> To: "Giacomo Fazio" <giacomo.fazio at wcpmediaservices.com>, "John Mark
>>> Walker" <johnmark at gluster.org>
>>> Cc: gluster-devel at gluster.org, "Giovanni Contri" <
>>> giovanni.contri at wcpmediaservices.com>, forge-admin at gluster.org
>>> Sent: Friday, August 22, 2014 12:40:01 PM
>>> Subject: Re: [Gluster-devel] Question about file copy through libgfapi
>>>
>>> Hi Giacomo,
>>>
>>> If your requirement is to do away with fuse/protocol clients and do
>>> server-side operations, I think it's doable by writing a simple libgfapi
>>> application. But since there is no libgfapi API equivalent to the "cp"
>>> command, you may need to implement that functionality using the "glfs_open,
>>> glfs_read & glfs_write" APIs.
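
The read/write loop such an application needs has roughly this shape. A
Python sketch only, assuming read(n) returns b"" at end of file and
write(buf) returns the number of bytes it accepted; copy_stream is a
made-up name, and in-memory buffers stand in for the glfs_read and
glfs_write calls a real libgfapi program would make.

```python
import io


def copy_stream(read, write, chunk_size=1 << 20):
    """Copy until read() returns b''; return the number of bytes copied."""
    total = 0
    while True:
        chunk = read(chunk_size)
        if not chunk:
            break
        view = memoryview(chunk)
        # A write may accept fewer bytes than offered, so retry the rest.
        while len(view) > 0:
            written = write(view)
            view = view[written:]
        total += len(chunk)
    return total


src = io.BytesIO(b"x" * 3000000)
dst = io.BytesIO()
print(copy_stream(src.read, dst.write))
```

Every chunk passes through the process running the loop, so for 100 GB
files the network cost stays the same unless that process runs on the
server hosting the brick.
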
>>>
>>> Here are a few links in which Humble has documented how to use
>>> libgfapi and the different APIs it supports:
>>>
>>> http://humblec.com/libgfapi-interface-glusterfs/
>>> https://github.com/gluster/glusterfs/blob/master/doc/features/libgfapi.md
>>>
>>>
>>> A few sample programs (written in C and Python) are available at:
>>> https://github.com/gluster/glusterfs/tree/master/api/examples
>>>
>>>
>>> Thanks,
>>> Soumya
>>>
>>>
>>>
>>> On 08/21/2014 08:45 PM, Giacomo Fazio wrote:
>>>> Hi John,
>>>>
>>>> Thanks for your quick answer. Do you mean that my question can be
>>>> summarized in "can we do server-only operations?"? Yes, I think so.
>>>> Please let me know as soon as you receive any answer, or provide me
>>>> with a link where I can follow this case directly.
>>>> Thanks in advance and best regards,
>>>>
>>>> *Giacomo Fazio*
>>>> IT Engineer
>>>>
>>>> Tel. +41 91 910 7690
>>>> E-mail: giacomo.fazio at wcpmediaservices.com  |  Web: www.wcpmediaservices.com
>>>>
>>>> Europe Office: Via Zurigo 35, 6900 Lugano, Switzerland
>>>> USA Office: 7083 Hollywood Boulevard Los Angeles, CA 90028
>>>>
>>>>
>>>> On Thu, Aug 21, 2014 at 5:04 PM, John Mark Walker <johnmark at gluster.org> wrote:
>>>>
>>>>      Thanks, Giacomo. I'm sending this to the gluster-devel list - it's
>>>>      an interesting question. Basically, can we do server-only operations?
>>>>
>>>>      -JM
>>>>
>>>>
>>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>>          Hello,
>>>>
>>>>          I am currently using GlusterFS version 3.5 with two bricks. What
>>>>          I currently do is mount the whole storage on some Linux
>>>>          clients (RedHat) through fuse.glusterfs, which (I think) uses NFS
>>>>          in the background.
>>>>          What I would like to do is copy a file from one directory to
>>>>          another in the storage in the quickest way. Using a "cp
>>>>          file1 file2" from my RedHat client is not the best option
>>>>          because the data flows from the storage to my RedHat client
>>>>          through the network and then back to the storage. I would like
>>>>          instead to avoid this waste of time and copy the file directly
>>>>          from the 1st directory to the 2nd one. So, in a nutshell, I
>>>>          would like to have file1 -> file2, instead of file1 ->
>>>>          RedHatclient -> file2.
>>>>          Do you think it is possible, for example using libgfapi? Is
>>>>          there any example you can show me?
>>>>          Thank you in advance and best regards,
>>>>
>>>>          *Giacomo Fazio*
>>>>          IT Engineer
>>>>
>>>>          Tel. +41 91 910 7690
>>>>          E-mail: giacomo.fazio at wcpmediaservices.com  |  Web: www.wcpmediaservices.com
>>>>
>>>>          Europe Office: Via Zurigo 35, 6900 Lugano, Switzerland
>>>>          USA Office: 7083 Hollywood Boulevard Los Angeles, CA 90028
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Gluster-devel mailing list
>>>> Gluster-devel at gluster.org
>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
>>>>


-- 

Kindest Regards,

Brad Hubbard
Senior Software Maintenance Engineer
Red Hat Global Support Services
Asia Pacific Region

