[Gluster-devel] RFC/Review: libgfapi object handle based extensions

Mon Sep 30 05:08:59 UTC 2013

Also note that the same glfs_object must be re-used in readdirplus (once we
have a _h_ equivalent of the API)

Avati

On Sun, Sep 29, 2013 at 10:05 PM, Anand Avati <avati at gluster.org> wrote:

> I see a pretty core issue - lifecycle management of 'struct glfs_object'.
> What is the structure representing? When is it created? When is it
> destroyed? How does it relate to inode_t?
>
> Looks like for every lookup() we are creating a new glfs_object, even if
> the looked up inode was already looked up before (in the cache) and had a
> glfs_object created for it in the recent past.
>
> We need a stronger relationship between the two with a clearer
> relationship. It is probably necessary for a glfs_object to represent
> mulitple inode_t's at different points in time depending on graph switches,
> but for a given inode_t we need only one glfs_object. We definitely must
> NOT have a new glfs_object per lookup call.
>
> Avati
>
> On Thu, Sep 19, 2013 at 5:13 AM, Shyamsundar Ranganathan <
> srangana at redhat.com> wrote:
>
>> Avati,
>>
>> Please find the updated patch set for review at gerrit.
>> http://review.gluster.org/#/c/5936/
>>
>> Changes made to address the points (1) (2) and (3) below. By the usage of
>> the suggested glfs_resolve_inode approach.
>>
>> I have not yet changes glfs_h_unlink to use the glfs_resolve_at. (more on
>> this a little later).
>>
>> So currently, the review request is for all APIs other than,
>> glfs_h_unlink, glfs_h_extract_gfid, glfs_h_create_from_gfid
>>
>> glfs_resolve_at: Using this function the terminal name will be a force
>> look up anyway (as force_lookup will be passed as 1 based on
>> !next_component). We need to avoid this _extra_ lookup in the unlink case,
>> which is why all the inode_grep(s) etc. were added to the glfs_h_lookup in
>> the first place.
>>
>> Having said the above, we should still leverage glfs_resolve_at anyway,
>> as there seem to be other corner cases where the resolved inode and subvol
>> maybe from different graphs. So I think I want to modify glfs_resolve_at to
>> make a conditional force_lookup, based on iatt being NULL or not. IOW,
>> change the call to glfs_resolve_component with the conditional as, (reval
>> || (!next_component && iatt)). So that callers that do not want the iatt
>> filled, can skip the syncop_lookup.
>>
>> Request comments on the glfs_resolve_at proposal.
>>
>> Shyam.
>>
>> ----- Original Message -----
>>
>> > From: "Anand Avati" <avati at gluster.org>
>> > To: "Shyamsundar Ranganathan" <srangana at redhat.com>
>> > Cc: "Gluster Devel" <gluster-devel at nongnu.org>
>> > Sent: Wednesday, September 18, 2013 11:39:27 AM
>> > Subject: Re: RFC/Review: libgfapi object handle based extensions
>>
>> > Minor comments are made in gerrit. Here is a larger (more important)
>> comment
>> > for which email is probably more convenient.
>>
>> > There is a problem in the general pattern of the fops, for example
>> > glfs_h_setattrs() (and others too)
>>
>> > 1. glfs_validate_inode() has the assumption that object->inode deref is
>> a
>> > guarded operation, but here we are doing an unguarded deref in the
>> paramter
>> > glfs_resolve_base().
>>
>> > 2. A more important issue, glfs_active_subvol() and
>> glfs_validate_inode() are
>> > not atomic. glfs_active_subvol() can return an xlator from one graph,
>> but by
>> > the time glfs_validate_inode() is called, a graph switch could have
>> happened
>> > and inode can get resolved to a different graph. And in syncop_XXXXXX()
>> we
>> > end up calling on graph1 with inode belonging to graph2.
>>
>> > 3. ESTALE_RETRY is a fundamentally wrong thing to do with handle based
>> > operations. The ESTALE_RETRY macro exists for path based FOPs where the
>> > resolved handle could have turned stale by the time we perform the FOP
>> > (where resolution and FOP are non-atomic). Over here, the handle is
>> > predetermined, and it does not make sense to retry on ESTALE (notice
>> that FD
>> > based fops in glfs-fops.c also do not have ESTALE_RETRY for this same
>> > reason)
>>
>> > I think the pattern should be similar to FD based fops which
>> specifically
>> > address both the above problems. Here's an outline:
>>
>> > glfs_h_XXXX(struct glfs *fs, glfs_object *object, ...)
>> > {
>> > xlator_t *subvol = NULL;
>> > inode_t *inode = NULL;
>>
>> > __glfs_entry_fs (fs);
>>
>> > subvol = glfs_active_subvol (fs);
>> > if (!subvol) { errno = EIO; ... goto out; }
>>
>> > inode = glfs_resolve_inode (fs, object, subvol);
>> > if (!inode) { errno = ESTALE; ... goto out; }
>>
>> > loc.inode = inode;
>> > ret = syncop_XXXX(subvol, &loc, ...);
>>
>> > }
>>
>> > Notice the signature of glfs_resolve_inode(). What it does: given a
>> > glfs_object, and a subvol, it returns an inode_t which is resolved on
>> that
>> > subvol. This way the syncop_XXX() is performed with matching subvol and
>> > inode. Also it returns the inode pointer so that no unsafe object->inode
>> > deref is done by the caller. Again, this is the same pattern followed
>> by the
>> > fd based fops already.
>>
>> > Also, as mentioned in one of the comments, please consider using
>> > glfs_resolve_at() and avoiding manual construction of loc_t.
>>
>> > Thanks,
>> > Avati
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://supercolony.gluster.org/pipermail/gluster-devel/attachments/20130929/a34f9a6b/attachment-0001.html>