[Gluster-devel] Some questions about requisites of translators

Amar Tumballi amarts at redhat.com
Wed Jun 6 08:08:36 UTC 2012


Below are my thoughts about arguments in case of 'op_ret == -1':

>>     2. Are translators required to propagate callback arguments even
>>     if the result of the operation is an error ? and if an internal
>>     translator error occurs ?
>> Usually no. If op_ret is -1, only op_errno is expected to be a usable
>> value. Rest of the callback parameters are junk.

This is the behavior we have followed until now.
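
To make the current convention concrete, here is a minimal sketch in C. It does not use the real GlusterFS headers: mock_dict_t, cbk_result_t and example_cbk are hypothetical stand-ins invented for illustration, and the real _cbk() signatures carry more arguments (frame, cookie, this, and so on). The only point is that on op_ret == -1 nothing except op_errno is read.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a dictionary type; NOT the real dict_t. */
typedef struct mock_dict {
    const char *key;
    const char *value;
} mock_dict_t;

/* Hypothetical record of what the callback observed. */
typedef struct cbk_result {
    int failed;     /* 1 if the fop failed */
    int error;      /* copy of op_errno on failure, 0 otherwise */
    int used_xdata; /* 1 only if xdata was actually inspected */
} cbk_result_t;

/*
 * Simplified callback following the current convention: when
 * op_ret == -1, op_errno is the only argument that is read.
 */
static void
example_cbk(cbk_result_t *out, int op_ret, int op_errno, mock_dict_t *xdata)
{
    assert(out != NULL);
    out->failed = 0;
    out->error = 0;
    out->used_xdata = 0;

    if (op_ret == -1) {
        /* Error path: xdata may be junk, so it is never dereferenced. */
        out->failed = 1;
        out->error = op_errno;
        return;
    }

    /* Success path: the payload arguments are meaningful. */
    if (xdata != NULL)
        out->used_xdata = 1;
}
```

Calling example_cbk(&r, -1, EIO, ptr) with any value of ptr, even an invalid one, is safe in this sketch because the error branch returns before xdata is touched.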

>>     When a translator has multiple subvolumes, I've seen that some
>>     arguments, such as xdata, are replaced with NULL. This can be
>>     understood, but are regular translators (those that only have one
>>     subvolume) allowed to do that or must they preserve the value of
>>     xdata, even in the case of an internal error ?
>> It is best to preserve the arguments unless you know specifically what
>> you are doing. In case of error, all the non-op_{ret,errno} arguments
>> are typically junk, including xdata.
>>     If this is not a requisite, xdata loses its function of
>>     delivering back extra information.
>> Can you explain? Are you seeing a use case for having a valid xdata in
>> the callback even with op_ret == -1?
> As a part of a translator that I'm developing that works with multiple
> subvolumes, I need to implement some healing support to maintain data
> coherency (similar to AFR). After some thought, I decided that it could
> be advantageous to use a dedicated healing translator located near the
> bottom of the translators stack on the servers. This translator won't
> work by itself, it only adds support to be used by a higher level
> translator, which has to manage the logic of the healing and decide
> when a node needs to be healed.
> To do this, sometimes I need to return an error because an operation
> cannot be completed due to some condition related to healing itself
> (not with the underlying storage). However I need to send some specific
> healing information to let the upper translator know how it has to
> handle the detected condition.
> I cannot send a success answer because intermediate translators could
> take the fake data as valid and they could begin to operate incorrectly
> or even create inconsistencies. The other alternative is to use op_errno
> to encode the extra data, but this will also be difficult, even
> impossible in some cases, due to the amount of data and the complexity
> to combine it with an error code without misleading intermediate
> translators with strange or invalid error codes.
> I talked with John Mark about this translator and he suggested that I
> discuss it on the list. Therefore I'll start another thread to
> explain in more detail how it works, and I would very much appreciate your
> opinion, and that of the other developers, about it. Especially whether it
> can really be faster/safer than other solutions, or if you find
> any problem or have any suggestion to improve it. I think it could also
> be used by AFR and any future translator that may need some healing
> capabilities.

As of today, xdata is not used in all the ways it could be. In a few
fops (lookup, create, etc.) it simply replaced the 'dict' we used to pass
on the wire. Hence the fop APIs still treat xdata as 'junk' in the error
case.

To make full use of what xdata can offer, it makes sense to require that
it always be valid (NULL is valid, junk is not) in the _cbk() APIs.

One of the use cases I can think of for a valid xdata when op_ret == -1
is a message-id framework: if op_ret is set to -1 by an internal
translator, we can get the exact reason for it along with the errno.
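
As a rough sketch of that idea (not an existing GlusterFS API), the C snippet below shows a caller that trusts xdata on the error path because it is guaranteed to be either NULL or valid. Everything here is an assumption made up for illustration: mock_dict_t, mock_dict_get, describe_failure, the "error-reason" key and the "heal-pending" value do not exist in the code base.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical one-entry dictionary; NOT the real dict_t. */
typedef struct mock_dict {
    const char *key;
    const char *value;
} mock_dict_t;

/* Hypothetical key under which a translator would store its reason. */
#define REASON_KEY "error-reason"

/* Return the value stored under key, or NULL if absent or d is NULL. */
static const char *
mock_dict_get(const mock_dict_t *d, const char *key)
{
    if (d == NULL || d->key == NULL || strcmp(d->key, key) != 0)
        return NULL;
    return d->value;
}

/*
 * Under the proposed rule, xdata on error is NULL or a valid dictionary,
 * so it can be inspected safely. Returns 1 and fills buf when the fop
 * failed, 0 (buf untouched) when it succeeded.
 */
static int
describe_failure(char *buf, size_t len, int op_ret, int op_errno,
                 const mock_dict_t *xdata)
{
    const char *reason;

    assert(buf != NULL && len > 0);
    if (op_ret != -1)
        return 0;

    reason = mock_dict_get(xdata, REASON_KEY);
    snprintf(buf, len, "errno=%d reason=%s", op_errno,
             reason != NULL ? reason : "(none)");
    return 1;
}
```

With xdata = { "error-reason", "heal-pending" } the caller gets the errno plus the translator's own reason; with xdata == NULL it falls back to the errno alone, which is why NULL has to count as valid.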

Avati/Jeff/Vijay, what do you think? This could become one of the
guidelines on what the arguments to fop functions and their _cbk() are,
and what their values should be when op_ret is -1 (something similar to
the valid postparent in lookup_cbk even when op_ret is -1).

In terms of development effort, this needs a full code review of all the
STACK_UNWIND arguments to make sure no one is sending 'junk' in case of
errors, plus some changes in the server and client protocol.

