[Gluster-devel] Feature help

Wed Nov 12 11:57:40 UTC 2014

The past couple of days have been slow for me and I'm new to the FOP stuff,
so I will try to get something finished and available soon, potentially
with some benchmarks. I will also try to post the feature page and make
some of the code available.

The API can always be expanded to have atomic stream writes - it
ends up being an overwrite/append operation, so suggestions are very
welcome.
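
For reference, the create/write/sync/close-plus-rename pattern that the
quoted thread mentions for atomic puts can be sketched as follows. This is
only an illustration using plain POSIX-style calls from Python on an
ordinary filesystem; it is not libgfapi and not the proposed object API,
and the name atomic_put is hypothetical.

```python
import os
import tempfile


def atomic_put(path, chunks):
    """Write an iterable of byte chunks to `path` so that readers see either
    the old content or the complete new content, never a partial object.
    Hypothetical helper for illustration only."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the final rename stays
    # within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".put-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:      # streamed: the whole object is never held
                tmp.write(chunk)
            tmp.flush()
            os.fsync(tmp.fileno())    # push data to disk before it becomes visible
        os.replace(tmp_path, path)    # rename over the target in one step
    except BaseException:
        # On any failure, drop the partial temp file and leave the target as is.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The caller can pass a generator, so a large object never has to be
accumulated in memory; the cost is the extra create and rename per object,
which is what a single combined operation would aim to avoid.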

On Tue, Nov 11, 2014 at 1:59 PM, Shyam <srangana at redhat.com> wrote:
> On 11/11/2014 06:40 AM, Rudra Siva wrote:
>>
>> Responses inline ... (removed some of my older parts of post).
>>
>> On Mon, Nov 10, 2014 at 2:11 PM, Shyam <srangana at redhat.com> wrote:
>>>
>>> On 11/01/2014 10:20 AM, Rudra Siva wrote:
>>>>
>>>>
>>>
>>> The response below is based on reading into this mail and the other mail
>>> that you sent titled "libgfapi object api", which I believe expands on
>>> the actual APIs that you are thinking of. (The following commentary is
>>> to glean more information, as this is something that can help small file
>>> performance, and could be the result of my own misunderstanding :) )
>>>
>>> - Who are the consumers of such an API?
>>> The way I see it, FUSE does not have a direct way to use this
>>> enhancement, unless we think of ways, like the ones that Ben proposed,
>>> to defer and detect small file creates.
>>>
>>> Neither do the NFS or SMB protocol implementations.
>>>
>>> Swift has a use case here, as it needs to put/get objects atomically and
>>> can benefit from having a single API rather than ploughing through
>>> multiple ones and ensuring atomicity using renames (again stated by Ben
>>> in the other mail). BUT, we cannot hold the entire object that we need
>>> to write and then invoke the API (consider an object 1GB in size). So
>>> instead Swift would have to use this when it has an entire object, and
>>> do some optimization like the ones Ben suggested for FUSE.
>>>
>>> Hence the question, who are the consumers of this API?
>>>
>> It should help applications and developers that are reading/writing
>> files which they know to be small to begin with. They may
>> know they are dealing with small files (by way of size, location
>> etc.) - Swift is one example application but in the real world there
>> are probably many others - and they should benefit (at least that's
>> the intent).
>>
>
> Agreed, do you have an example of such an application in mind? It helps to
> design the API if we have multiple examples, so that it can be as generic as
> needed.
>
> For example, Swift *put* streams data to the object store (I use terms
> loosely here), so the caller of this API would not want to wait to
> accumulate the write stream just to get the atomic nature of the currently
> suggested write. Rather, I would want the stream write (if possible) to be
> atomic, so that it can handle the create/open/write/sync/close FOPs in
> general when the file is non-trivial in size.
>
>>> - Do these interfaces create the files if absent on writes?
>>>
>> Yes, at the present time the code I'm working with creates files if
>> they are absent.
>>
>>> IOW, is this for existing objects/files or to extend the use case into
>>> creating and writing files as objects?
>>>
>> Nothing really stops one from reading/writing an existing file as an
>> object and vice versa at this time - reading/writing a bunch of small
>> files, e.g. 300-400 bytes each, should be faster with the object API
>> than reading as files - if something is 1 GB, it probably does not
>> fit the definition of small files, but one could work with it
>> atomically if desired - I think the real pain and gain is for
>> applications dealing with lots of small files on Gluster.
>
>
> My concern here is that I see this work as having to add to the FOP list
> (as you suggested in another mail), so I am thinking about how to make the
> best use of it before expanding the list and implementing don't-cares or the
> FOP itself in each xlator. Did you have an alternative way of getting this
> done?
>
>>
>>>>
>>>> The following is what I was thinking - please feel free to correct me
>>>> or guide me if someone has already done some ground work on this.
>>>>
>>>> For read, multiple objects can be provided and they should be
>>>> separated for read from appropriate brick based on the DHT flag - this
>>>> will help avoid multiple lookups from all servers. In the absence of
>>>> DHT they would be sent to all but only the ones that contain the
>>>> object respond (it's more like a multiple file lookup request).
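
The idea of splitting a multi-object read so that each brick receives only
the names it holds can be sketched as below. This is a rough illustration
only: the CRC32-modulo placement rule is a stand-in assumption and is not
Gluster's actual DHT hash or layout, and pick_brick and group_by_brick are
hypothetical names.

```python
import zlib
from collections import defaultdict


def pick_brick(name, bricks):
    # Stand-in placement rule (assumption): CRC32 of the object name modulo
    # the number of bricks. Real DHT placement works differently.
    return bricks[zlib.crc32(name.encode("utf-8")) % len(bricks)]


def group_by_brick(names, bricks):
    """Split one multi-object read into one batch of names per brick."""
    batches = defaultdict(list)
    for name in names:
        batches[pick_brick(name, bricks)].append(name)
    return dict(batches)
```

Each batch could then go out as a single request to its brick, rather than
every name being looked up on every server.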
>>>
>>>
>>>
>>> The above section is sketchy in details for me, but the following
>>> questions do crop up:
>>> - What do you mean by "separated for read from appropriate brick based on
>>> the DHT flag"?
>>>
>> Haven't given this much thought since then - presently trying
>> to get the writes working with just 1 brick.
>>>
>>>
>>>
>>> There was a mention of writing a feature page for this enhancement. I
>>> would suggest doing that, even if premature, so that details are better
>>> elaborated and understood (by me at least).
>>>
>> If someone can tell me how to get content into the feature page for
>> the libgfapi object API - I would be happy to post details, get
>> feedback and put WIP status.
>
>
> Hmmm... create an account at http://www.gluster.org
> and check the feature template at
> http://www.gluster.org/community/documentation/index.php/Planning37#Proposing_New_Features
> to help with this.
>
> Other forms, like Google Docs or text files sent to the devel list, are also
> viable options.
>
> Shyam

-- 
-Siva