[Gluster-devel] compound fop design first cut

Poornima Gurusiddaiah pgurusid at redhat.com
Wed Dec 9 14:33:00 UTC 2015


libgfapi compound fops added inline.

----- Original Message -----
> From: "Kotresh Hiremath Ravishankar" <khiremat at redhat.com>
> To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> Sent: Wednesday, December 9, 2015 2:18:47 PM
> Subject: Re: [Gluster-devel] compound fop design first cut
> 
> Geo-rep requirements inline.
> 
> Thanks and Regards,
> Kotresh H R
> 
> ----- Original Message -----
> > From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
> > To: "Vijay Bellur" <vbellur at redhat.com>, "Jeff Darcy" <jdarcy at redhat.com>,
> > "Raghavendra Gowdappa"
> > <rgowdapp at redhat.com>, "Ira Cooper" <ira at redhat.com>
> > Cc: "Gluster Devel" <gluster-devel at gluster.org>
> > Sent: Wednesday, December 9, 2015 11:44:52 AM
> > Subject: Re: [Gluster-devel] compound fop design first cut
> > 
> > 
> > 
> > On 12/09/2015 06:37 AM, Vijay Bellur wrote:
> > > On 12/08/2015 03:45 PM, Jeff Darcy wrote:
> > >>
> > >>
> > >>
> > >> On December 8, 2015 at 12:53:04 PM, Ira Cooper (ira at redhat.com) wrote:
> > >>> Raghavendra Gowdappa writes:
> > >>> I propose that we define a "compound op" that contains ops.
> > >>>
> > >>> Within each op, there are fields that can be "inherited" from the
> > >>> previous op, via use of a sentinel value.
> > >>>
> > >>> Sentinel is -1, for all of these examples.
> > >>>
> > >>> So:
> > >>>
> > >>> LOOKUP (1, "foo") (Sets the gfid value to be picked up by compounding,
> > >>> 1 is the root directory, as a gfid, by convention.)
> > >>> OPEN(-1, O_RDWR) (Uses the gfid value, sets the glfd compound value.)
> > >>> WRITE(-1, "foo", 3) (Uses the glfd compound value.)
> > >>> CLOSE(-1) (Uses the glfd compound value)
> > >>
> > >> So, basically, what the programming-language types would call futures
> > >> and promises.  It’s a good and well studied concept, which is necessary
> > >> to solve the second-order problem of how to specify an argument in
> > >> sub-operation N+1 that’s not known until sub-operation N completes.
> > >>
> > >> To be honest, some of the highly general approaches suggested here scare
> > >> me too.  Wrapping up the arguments for one sub-operation in xdata for
> > >> another would get pretty hairy if we ever try to go beyond two
> > >> sub-operations and have to nest sub-operation #3’s args within
> > >> sub-operation #2’s xdata which is itself encoded within sub-operation
> > >> #1’s xdata.  There’s also not much clarity about how to handle errors in
> > >> that model.  Encoding N sub-operations’ arguments in a linear structure
> > >> as Shyam proposes seems a bit cleaner that way.  If I were to continue
> > >> down that route I’d suggest just having start_compound and end_compound
> > >> fops, plus an extra field (or by-convention xdata key) that either the
> > >> client-side or server-side translator could use to build whatever
> > >> structure it wants and schedule sub-operations however it wants.
> > >>
> > >> However, I’d be even more comfortable with an even simpler approach that
> > >> avoids the need to solve what the database folks (who have dealt with
> > >> complex transactions for years) would tell us is a really hard problem.
> > >> Instead of designing for every case we can imagine, let’s design for the
> > >> cases that we know would be useful for improving performance. Open plus
> > >> read/write plus close is an obvious one.  Raghavendra mentions
> > >> create+inodelk as well.  For each of those, we can easily define a
> > >> structure that contains the necessary fields, we don’t need a
> > >> client-side translator, and the server-side translator can take care of
> > >> “forwarding” results from one sub-operation to the next.  We could even
> > >> use GF_FOP_IPC to prototype this.  If we later find that the number of
> > >> “one-off” compound requests is growing too large, then at least we’ll
> > >> have some experience to guide our design of a more general alternative.
> > >> Right now, I think we’re trying to look further ahead than we can see
> > >> clearly.
> > Yes, agreed. This makes implementation on the client side simpler as well,
> > so it is welcome.
> > 
> > Just updating the solution.
> > 1) New RPCs are going to be implemented.
> > 2) The client stack will use these new fops.
> > 3) On the server side, the server xlator implements these new fops: it
> > decodes the RPC request, does resolve_resume, and hands over to a
> > compound-op-receiver (a better name for this is welcome), which sends one
> > op after the other and then sends the compound fop response.
> > 
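The server-side flow in step 3 can be sketched roughly as below. This is only a toy model in Python, not Gluster code: `SubOpResult` and `compound_receiver` are hypothetical names, and stopping at the first failed sub-operation is an assumption, since error handling for compound fops has not been settled in this thread.

```python
import errno
from dataclasses import dataclass


@dataclass
class SubOpResult:
    fop: str       # name of the sub-operation, e.g. "mkdir"
    ret: int       # return value on success, -1 on failure
    op_errno: int  # 0 on success, errno value on failure


def compound_receiver(sub_ops):
    """Run sub-operations one after the other and collect their results.

    sub_ops is a list of (name, callable) pairs. Assumed policy: the first
    sub-operation that raises OSError ends the compound request, and the
    remaining sub-operations are not attempted.
    """
    results = []
    for fop, fn in sub_ops:
        try:
            results.append(SubOpResult(fop, fn(), 0))
        except OSError as exc:
            results.append(SubOpResult(fop, -1, exc.errno))
            break
    return results


def failing_mkdir():
    # Stand-in for a mkdir that fails because the directory already exists.
    raise OSError(errno.EEXIST, "directory exists")


def inodelk():
    # Stand-in for a successful inodelk.
    return 0
```

With the Dht pair from the list below, `compound_receiver([("mkdir", failing_mkdir), ("inodelk", inodelk)])` would return a single failed result for mkdir and never attempt the inodelk. The compound response then carries one result per attempted sub-operation.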
> > List of compound fops identified so far:
> > Swift/S3:
> > PUT: creat(), write()s, setxattr(), fsync(), close(), rename()
> > 
> > Dht:
> > mkdir + inodelk
> > 
> > Afr:
> > xattrop+writev, xattrop+unlock to begin with.
> 
>   Geo-rep:
>   mknod, entrylk, stat (on backend gfid)
>   mkdir, entrylk, stat (on backend gfid)
>   symlink, entrylk, stat (on backend gfid)
>   
libgfapi :
    glfs_setfsuid, glfs_setfsgid, glfs_setfsgroups, glfs_set_lkowner and
    leaseid - these are not network fops, so they mostly affect the gfapi
    interface for compound fops.
    open/create + lease + lk
    readdir + stat + getxattrs => replacing this with readdirplus is already
    being discussed
    Multiple writes

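For reference, the sentinel idea from Ira's LOOKUP/OPEN/WRITE/CLOSE example quoted above could look something like this. It is a minimal Python sketch against a toy in-memory store, not libgfapi or any real Gluster interface: `FakeBrick`, `run_compound`, the gfid 42 for "foo" and the fd numbering are all made up for illustration. Only the sentinel value -1 and the root gfid 1 come from the example.

```python
import os
from dataclasses import dataclass, field

SENTINEL = -1  # "inherit from the previous sub-operation" (from the example)
ROOT_GFID = 1  # root directory gfid, by convention (from the example)


@dataclass
class FakeBrick:
    """Toy in-memory stand-in for a brick; hypothetical, not a Gluster API."""
    names: dict = field(default_factory=lambda: {(ROOT_GFID, "foo"): 42})
    data: dict = field(default_factory=dict)      # gfid -> bytearray
    open_fds: dict = field(default_factory=dict)  # fd -> gfid
    next_fd: int = 3


def run_compound(brick, ops):
    """Run ops in order, replacing SENTINEL with the value set by an earlier op."""
    ctx = {"gfid": None, "fd": None}  # compound values carried between ops
    results = []
    for name, *args in ops:
        if name == "LOOKUP":
            parent, basename = args
            ctx["gfid"] = brick.names[(parent, basename)]
            results.append(ctx["gfid"])
        elif name == "OPEN":
            gfid, _flags = args
            gfid = ctx["gfid"] if gfid == SENTINEL else gfid
            fd = brick.next_fd
            brick.next_fd += 1
            brick.open_fds[fd] = gfid
            ctx["fd"] = fd
            results.append(fd)
        elif name == "WRITE":
            fd, payload, size = args
            fd = ctx["fd"] if fd == SENTINEL else fd
            buf = brick.data.setdefault(brick.open_fds[fd], bytearray())
            buf += payload[:size]  # append in place to the stored bytearray
            results.append(size)
        elif name == "CLOSE":
            (fd,) = args
            fd = ctx["fd"] if fd == SENTINEL else fd
            del brick.open_fds[fd]
            results.append(0)
        else:
            raise ValueError(f"unsupported sub-operation: {name}")
    return results


EXAMPLE_OPS = [
    ("LOOKUP", ROOT_GFID, "foo"),
    ("OPEN", SENTINEL, os.O_RDWR),
    ("WRITE", SENTINEL, b"foo", 3),
    ("CLOSE", SENTINEL),
]
```

Calling `run_compound(FakeBrick(), EXAMPLE_OPS)` walks the four sub-operations in one request. The caller never sees the gfid or the fd before sending, which is the point of the sentinel: each op names only what it knows up front.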
> > 
> > Could everyone who needs compound fops add to this list?
> > 
> > I see that Niels is back on 14th. Does anyone else know the list of
> > compound fops he has in mind?
> > 
> > Pranith.
> > >
> > > Starting with a well defined set of operations for compounding has its
> > > advantages. It would be easier to understand and maintain correctness
> > > across the stack. Some of our translators perform transactions &
> > > create/update internal metadata for certain fops. It would be easier
> > > for such translators if the compound operations are well defined and
> > > do not entail deep introspection of a generic representation to
> > > ensure that the right behavior gets reflected at the end of a compound
> > > operation.
> > >
> > > -Vijay
> > >
> > >
> > >
> > 
> > _______________________________________________
> > Gluster-devel mailing list
> > Gluster-devel at gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel

