[Gluster-devel] RFC/Review: libgfapi object handle based extensions

Mon Sep 30 05:05:05 UTC 2013

I see a pretty core issue - lifecycle management of 'struct glfs_object'.
What is the structure representing? When is it created? When is it
destroyed? How does it relate to inode_t?

Looks like for every lookup() we are creating a new glfs_object, even if
the looked up inode was already looked up before (in the cache) and had a
glfs_object created for it in the recent past.

We need a stronger relationship between the two with a clearer
relationship. It is probably necessary for a glfs_object to represent
mulitple inode_t's at different points in time depending on graph switches,
but for a given inode_t we need only one glfs_object. We definitely must
NOT have a new glfs_object per lookup call.

Avati

On Thu, Sep 19, 2013 at 5:13 AM, Shyamsundar Ranganathan <
srangana at redhat.com> wrote:

> Avati,
>
> Please find the updated patch set for review at gerrit.
> http://review.gluster.org/#/c/5936/
>
> Changes made to address the points (1) (2) and (3) below. By the usage of
> the suggested glfs_resolve_inode approach.
>
> I have not yet changes glfs_h_unlink to use the glfs_resolve_at. (more on
> this a little later).
>
> So currently, the review request is for all APIs other than,
> glfs_h_unlink, glfs_h_extract_gfid, glfs_h_create_from_gfid
>
> glfs_resolve_at: Using this function the terminal name will be a force
> look up anyway (as force_lookup will be passed as 1 based on
> !next_component). We need to avoid this _extra_ lookup in the unlink case,
> which is why all the inode_grep(s) etc. were added to the glfs_h_lookup in
> the first place.
>
> Having said the above, we should still leverage glfs_resolve_at anyway, as
> there seem to be other corner cases where the resolved inode and subvol
> maybe from different graphs. So I think I want to modify glfs_resolve_at to
> make a conditional force_lookup, based on iatt being NULL or not. IOW,
> change the call to glfs_resolve_component with the conditional as, (reval
> || (!next_component && iatt)). So that callers that do not want the iatt
> filled, can skip the syncop_lookup.
>
> Request comments on the glfs_resolve_at proposal.
>
> Shyam.
>
> ----- Original Message -----
>
> > From: "Anand Avati" <avati at gluster.org>
> > To: "Shyamsundar Ranganathan" <srangana at redhat.com>
> > Cc: "Gluster Devel" <gluster-devel at nongnu.org>
> > Sent: Wednesday, September 18, 2013 11:39:27 AM
> > Subject: Re: RFC/Review: libgfapi object handle based extensions
>
> > Minor comments are made in gerrit. Here is a larger (more important)
> comment
> > for which email is probably more convenient.
>
> > There is a problem in the general pattern of the fops, for example
> > glfs_h_setattrs() (and others too)
>
> > 1. glfs_validate_inode() has the assumption that object->inode deref is a
> > guarded operation, but here we are doing an unguarded deref in the
> paramter
> > glfs_resolve_base().
>
> > 2. A more important issue, glfs_active_subvol() and
> glfs_validate_inode() are
> > not atomic. glfs_active_subvol() can return an xlator from one graph,
> but by
> > the time glfs_validate_inode() is called, a graph switch could have
> happened
> > and inode can get resolved to a different graph. And in syncop_XXXXXX()
> we
> > end up calling on graph1 with inode belonging to graph2.
>
> > 3. ESTALE_RETRY is a fundamentally wrong thing to do with handle based
> > operations. The ESTALE_RETRY macro exists for path based FOPs where the
> > resolved handle could have turned stale by the time we perform the FOP
> > (where resolution and FOP are non-atomic). Over here, the handle is
> > predetermined, and it does not make sense to retry on ESTALE (notice
> that FD
> > based fops in glfs-fops.c also do not have ESTALE_RETRY for this same
> > reason)
>
> > I think the pattern should be similar to FD based fops which specifically
> > address both the above problems. Here's an outline:
>
> > glfs_h_XXXX(struct glfs *fs, glfs_object *object, ...)
> > {
> > xlator_t *subvol = NULL;
> > inode_t *inode = NULL;
>
> > __glfs_entry_fs (fs);
>
> > subvol = glfs_active_subvol (fs);
> > if (!subvol) { errno = EIO; ... goto out; }
>
> > inode = glfs_resolve_inode (fs, object, subvol);
> > if (!inode) { errno = ESTALE; ... goto out; }
>
> > loc.inode = inode;
> > ret = syncop_XXXX(subvol, &loc, ...);
>
> > }
>
> > Notice the signature of glfs_resolve_inode(). What it does: given a
> > glfs_object, and a subvol, it returns an inode_t which is resolved on
> that
> > subvol. This way the syncop_XXX() is performed with matching subvol and
> > inode. Also it returns the inode pointer so that no unsafe object->inode
> > deref is done by the caller. Again, this is the same pattern followed by
> the
> > fd based fops already.
>
> > Also, as mentioned in one of the comments, please consider using
> > glfs_resolve_at() and avoiding manual construction of loc_t.
>
> > Thanks,
> > Avati
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://supercolony.gluster.org/pipermail/gluster-devel/attachments/20130929/5ff0562a/attachment-0001.html>