[Gluster-devel] compound fop design first cut

Anuradha Talur atalur at redhat.com
Wed Jan 6 12:29:06 UTC 2016


Hi,

After discussions with Pranith and Soumya, here is the design for compound fops:

1) Fops will be compounded per inode, meaning two fops on different inodes can't be compounded (not because of the design, just to reduce the scope of the problem).
2) Each xlator that wants a compound fop packs the arguments by itself.
3) On the server side, a de-compounder placed below the server xlator unpacks the arguments and performs the necessary operations.
4) Arguments for compound fops will be passed as an array of unions of structures, where each structure is associated with a fop.
5) Each xlator will have <xlator>_compound_fop (), which receives the fop and does any additional processing that the xlator itself requires.
6) The response will also be an array of unions of response structures, where each structure is associated with a fop's response.
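
To make points 3, 4 and 6 concrete, here is a rough sketch of what the argument and response arrays could look like. Every name in it (cfop_req, cfop_rsp, toy_inode, decompound, the three fop types and their fields) is made up for illustration and is not existing GlusterFS code. The toy de-compounder stops at the first failing fop, which is an assumption borrowed from the discussion quoted below, not something this design fixes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical fop identifiers; the real set is not fixed by this design. */
enum cfop_type { CFOP_WRITE, CFOP_FSYNC, CFOP_TRUNCATE };

/* Per-fop argument structures (illustrative fields only). */
struct cfop_write_args    { const char *buf; size_t len; off_t offset; };
struct cfop_fsync_args    { int datasync; };
struct cfop_truncate_args { off_t size; };

/* One request slot: a tag plus a union of per-fop structures (point 4). */
struct cfop_req {
    enum cfop_type type;
    union {
        struct cfop_write_args    write;
        struct cfop_fsync_args    fsync;
        struct cfop_truncate_args truncate;
    } u;
};

/* One response slot per request (point 6). */
struct cfop_rsp {
    int op_ret;     /* 0 on success, -1 on failure */
    int op_errno;
    union { size_t written; off_t new_size; } u;
};

/* Toy inode; every fop in one compound targets the same inode (point 1). */
struct toy_inode { off_t size; int synced; };

/* Stand-in for the de-compounder (point 3): unpacks each slot in order and
 * fills the matching response. Returns the number of fops completed.
 * Assumption: processing stops at the first failure. */
static size_t decompound(struct toy_inode *ino, const struct cfop_req *reqs,
                         struct cfop_rsp *rsps, size_t n)
{
    size_t i;

    assert(ino != NULL && reqs != NULL && rsps != NULL);
    for (i = 0; i < n; i++) {
        rsps[i].op_ret = 0;
        rsps[i].op_errno = 0;
        switch (reqs[i].type) {
        case CFOP_WRITE: {
            off_t end = reqs[i].u.write.offset + (off_t)reqs[i].u.write.len;
            if (end > ino->size)
                ino->size = end;
            ino->synced = 0;
            rsps[i].u.written = reqs[i].u.write.len;
            break;
        }
        case CFOP_FSYNC:
            ino->synced = 1;
            break;
        case CFOP_TRUNCATE:
            if (reqs[i].u.truncate.size < 0) {
                rsps[i].op_ret = -1;
                rsps[i].op_errno = EINVAL;
                return i;
            }
            ino->size = reqs[i].u.truncate.size;
            ino->synced = 0;
            rsps[i].u.new_size = ino->size;
            break;
        default:
            rsps[i].op_ret = -1;
            rsps[i].op_errno = EINVAL;
            return i;
        }
    }
    return n;
}
```

A caller would fill an array of cfop_req (say, a write followed by an fsync), pass it with an equally sized cfop_rsp array, and read the per-fop results back by index.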

Comments welcome!

----- Original Message -----
> From: "Milind Changire" <milindchangire at gmail.com>
> To: "Jeff Darcy" <jdarcy at redhat.com>
> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> Sent: Friday, December 11, 2015 9:25:38 PM
> Subject: Re: [Gluster-devel] compound fop design first cut
> 
> 
> 
> On Wed, Dec 9, 2015 at 8:02 PM, Jeff Darcy < jdarcy at redhat.com > wrote:
> 
> 
> 
> 
> 
> On December 9, 2015 at 7:07:06 AM, Ira Cooper ( ira at redhat.com ) wrote:
> > A simple "abort on failure" and let the higher levels clean it up is
> > probably right for the type of compounding I propose. It is what SMB2
> > does. So, if you get an error return value, cancel the rest of the
> > request, and have it return ECOMPOUND as the errno.
> 
> This is exactly the part that worries me. If a compound operation
> fails, some parts of it will often need to be undone. “Let the higher
> levels clean it up” means that rollback code will be scattered among all
> of the translators that use compound operations. Some of them will do
> it right. Others . . . less so. ;) All will have to be tested
> separately. If we centralize dispatch of compound operations into one
> piece of code, we can centralize error detection and recovery likewise.
> That ensures uniformity of implementation, and facilitates focused
> testing (or even formal proof) of that implementation.
> 
> Can we gain the same benefits with a more generic design? Perhaps. It
> would require that the compounding translator know how to reverse each
> type of operation, so that it can do so after an error. That’s
> feasible, though it does mean maintaining a stack of undo actions
> instead of a simple state. It might also mean testing combinations and
> scenarios that will actually never occur in other components’ usage of
> the compounding feature. More likely it means that people will *think*
> they can use the facility in unanticipated ways, until their
> unanticipated usage creates a combination or scenario that was never
> tested and doesn’t work. Those are going to be hard problems to debug.
> I think it’s better to be explicit about which permutations we actually
> expect to work, and have those working earlier.
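
For illustration, the "stack of undo actions" idea above could look roughly like the sketch below. It is only a toy: the op structure, run_compound and the integer "state" are invented names, and a real dispatcher would have to know how to reverse actual fops rather than additions.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OPS 8

/* Hypothetical operation: a forward action plus its inverse. */
struct op {
    int  (*apply)(int *state, int arg);   /* 0 on success, -1 on failure */
    void (*undo)(int *state, int arg);
    int  arg;
};

static int  add_apply(int *s, int a)  { *s += a; return 0; }
static void add_undo(int *s, int a)   { *s -= a; }
static int  fail_apply(int *s, int a) { (void)s; (void)a; return -1; }
static void noop_undo(int *s, int a)  { (void)s; (void)a; }

/* Central dispatcher: runs ops in order and, on the first failure, undoes
 * the already completed ops in reverse order. Returns 0 if every op
 * succeeded; otherwise returns -1 and stores the failing index. */
static int run_compound(int *state, const struct op *ops, size_t n,
                        size_t *failed_idx)
{
    const struct op *undo_stack[MAX_OPS];
    size_t depth = 0, i;

    assert(state != NULL && ops != NULL && n <= MAX_OPS);
    for (i = 0; i < n; i++) {
        if (ops[i].apply(state, ops[i].arg) != 0) {
            while (depth > 0) {
                const struct op *done = undo_stack[--depth];
                done->undo(state, done->arg);
            }
            if (failed_idx != NULL)
                *failed_idx = i;
            return -1;
        }
        undo_stack[depth++] = &ops[i];
    }
    return 0;
}
```

Because the rollback lives in one function, it is the only place that needs testing for error recovery, which is the centralization argument made above.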
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
> 
> 
> Could we have a dry-run phase and a commit phase for the compound operation?
> The dry-run phase could test the validity of the transaction and the
> commit phase can actually perform the operation.
> 
> If any of the operations in the dry-run sequence returns an error, the
> compound operation can be aborted immediately without the complexity of an
> undo ... scattered or centralized.
> 
> But if the subsequent operations depend on the changed state of the system
> from earlier operations, then we'll have to introduce a system state object
> for such transactions ... and maybe serialize such operations. The system
> state object can be passed through the operation sequence. How well this
> idea would work in a multi-threaded world is not clear to me either.
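
A small sketch of the dry-run/commit idea with a state object passed through the sequence might look like this. The names (sys_state, step, step_apply, compound_two_phase) and the size/quota model are assumptions made up for the example. It also assumes nothing else touches the live state between the two phases, which is exactly the serialization concern raised above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state object threaded through the operation sequence. */
struct sys_state { long size; long quota; };

/* Hypothetical step: grow the file by `grow` bytes (negative shrinks). */
struct step { long grow; };

/* Validates one step against *st and applies it. Returns 0 or -1. */
static int step_apply(struct sys_state *st, const struct step *s)
{
    long next = st->size + s->grow;

    if (next < 0 || next > st->quota)
        return -1;
    st->size = next;
    return 0;
}

/* Dry-run every step on a scratch copy of the state; commit to the live
 * state only if the whole sequence validated. Later steps see the changes
 * made by earlier ones because the scratch copy is carried along. */
static int compound_two_phase(struct sys_state *live,
                              const struct step *steps, size_t n)
{
    struct sys_state scratch;
    size_t i;

    assert(live != NULL && steps != NULL);
    scratch = *live;
    for (i = 0; i < n; i++)
        if (step_apply(&scratch, &steps[i]) != 0)
            return -1;              /* live state is untouched */

    for (i = 0; i < n; i++) {
        int rc = step_apply(live, &steps[i]);
        assert(rc == 0);            /* holds only if live was not modified
                                       concurrently since the dry run */
        (void)rc;
    }
    return 0;
}
```

With a quota of 100 and a current size of 90, the sequence {+5, +20} fails in the dry run at the second step and leaves the live state at 90, so no undo is needed.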
> 
> 
> 

-- 
Thanks,
Anuradha.

