[Gluster-devel] Feature help

Shyam srangana at redhat.com
Tue Nov 11 18:59:31 UTC 2014


On 11/11/2014 06:40 AM, Rudra Siva wrote:
> Responses inline ... (removed some of my older parts of post).
>
> On Mon, Nov 10, 2014 at 2:11 PM, Shyam <srangana at redhat.com> wrote:
>> On 11/01/2014 10:20 AM, Rudra Siva wrote:
>>>
>>
>> The response below is based on my reading of this mail and the other mail
>> that you sent, titled "libgfapi object api", which I believe expands on the
>> actual APIs that you are thinking of. (The following commentary is meant to
>> glean more information, as this is something that can help small file
>> performance, and could be the result of my own misunderstanding :) )
>>
>> - Who are the consumers of such an API?
>> The way I see it, FUSE does not have a direct way to use this enhancement,
>> unless we think of ways, like the ones that Ben proposed to defer and detect
>> small file creates.
>>
>> Neither do the NFS or SMB protocol implementations.
>>
>> Swift has a use case here, as they need to put/get objects atomically and
>> can benefit from having a single API rather than ploughing through
>> multiple ones and ensuring atomicity using renames (again stated by Ben in
>> the other mail). BUT, we cannot always have the entire object that we need
>> to write in hand and then invoke the API (consider an object 1GB in size).
>> So instead Swift would have to use this only when it has an entire object,
>> and do some optimization like the ones Ben suggested for FUSE.
>>
>> Hence the question, who are the consumers of this API?
>>
> It should help applications and developers that are reading/writing
> small files which they know are small files to begin with. They may
> know they are dealing with small files (by way of size, location
> etc.) - swift is one example application but in the real world there
> are probably many others - and they should benefit (at least that's
> the intent).
>

Agreed, do you have an example of such an application in mind? It helps 
to design the API if we have multiple examples, so that it can be as 
generic as needed.

For example, Swift *put* streams data to the object store (I use the 
terms loosely here), so the caller of this API would not want to 
accumulate the whole write stream just to get the atomicity of the 
currently suggested write. Rather, I would want the streamed write 
itself (if possible) to be atomic, in a way that handles the 
create/open/write/sync/close FOPs in general, when the file is 
non-trivial in size.
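
To illustrate the sequence I mean, here is a rough sketch of the
"several calls plus a rename" pattern that a single put API would
collapse, for an object that is already fully in memory. It uses plain
POSIX-style calls from Python against a local path, not libgfapi, and
the name put_object is made up for the example.

```python
import os
import tempfile


def put_object(path, data):
    """Write data to path so that a reader sees the old or the new
    content, never a partially written file (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # create: temp file in the same directory, so the final rename
    # does not cross a filesystem boundary
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".put-")
    try:
        try:
            # write: os.write may write fewer bytes than asked, so loop
            offset = 0
            while offset < len(data):
                offset += os.write(fd, data[offset:])
            # sync: flush the data to stable storage before publishing it
            os.fsync(fd)
        finally:
            # close
            os.close(fd)
        # rename: atomically replace the destination with the new file
        os.replace(tmp_path, path)
    except BaseException:
        # on any failure, do not leave the temp file behind
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Each step here is a separate call (and on Gluster a separate round
trip), which is the cost a combined API would avoid for small objects.
It also shows the limitation above: the caller must already hold all of
data before the first call is made.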

>> - Do these interfaces create the files if absent on writes?
>>
> Yes, at the present time the code I'm working with creates files if
> they are absent.
>
>> IOW, is this for existing objects/files or to extend the use case into
>> creating and writing files as objects?
>>
> Nothing really stops one from reading/writing an existing file as an
> object and vice-versa at this time - reading/writing a bunch of small
> files, e.g. 300-400 bytes each, should be faster with the object API
> than reading as files - if something is 1 GB - it probably does not
> fit the definition of small files but one could work with it
> atomically if desired - I think the real pain and gain is for
> applications dealing with lots of small files on Gluster.

My concern here is that I see this work as having to add to the FOP 
list (as you suggested in another mail), so I am thinking about how to 
make the best use of it before expanding the list and implementing 
don't-cares or the FOP itself in each xlator. Did you have an 
alternative way of getting this done?

>
>>>
>>> The following is what I was thinking - please feel free to correct me
>>> or guide me if someone has already done some ground work on this.
>>>
>>> For read, multiple objects can be provided and they should be
>>> separated for read from appropriate brick based on the DHT flag - this
>>> will help avoid multiple lookups from all servers. In the absence of
>>> DHT they would be sent to all but only the ones that contain the
>>> object respond (it's more like a multiple file lookup request).
>>
>>
>> The above section for me is sketchy in details, but the following questions
>> do crop up,
>> - What do you mean by "separated for read from appropriate brick based on
>> the DHT flag"?
>>
> Haven't given this much thought since then - presently trying
> to get the writes working with just 1 brick.
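
To make the question concrete, one possible reading of "separated per
brick" is to hash each object name on the client and batch the names by
the brick that owns that hash range, so each brick gets a single
multi-object request. The sketch below only illustrates that reading.
The crc32 hash, the equal-sized ranges and the names brick_for and
group_by_brick are assumptions for the example; the real DHT xlator uses
its own hash function and per-directory layouts.

```python
import zlib
from collections import defaultdict


def brick_for(name, bricks):
    """Pick a brick for an object name (hypothetical helper).

    Stand-in scheme: crc32 of the name, mapped onto equal slices of the
    32-bit hash space, one slice per brick.
    """
    h = zlib.crc32(name.encode("utf-8")) & 0xFFFFFFFF
    span = (1 << 32) // len(bricks)
    # clamp so the top of the hash space lands on the last brick
    index = min(h // span, len(bricks) - 1)
    return bricks[index]


def group_by_brick(names, bricks):
    """Split a multi-object read into one batch of names per brick."""
    batches = defaultdict(list)
    for name in names:
        batches[brick_for(name, bricks)].append(name)
    return dict(batches)
```

With something like this, a read of N objects turns into at most one
request per brick instead of N lookups, which is how I understand the
"avoid multiple lookups" part of the proposal.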
>>
>>
>> There was a mention of writing a feature page for this enhancement, I would
>> suggest doing that, even if premature, so that details are better elaborated
>> and understood (by me at least).
>>
> If someone can tell me how to get content into the feature page for
> the libgfapi object API - I would be happy to post details, get
> feedback and put WIP status.

Hmmm... create an account at http://www.gluster.org
and check the feature template at 
http://www.gluster.org/community/documentation/index.php/Planning37#Proposing_New_Features 
to help with this.

Other forms, like Google Docs or text files sent to the devel list, are 
also viable options.

Shyam

