[Gluster-devel] Possible bug in the communications layer ?

Xavier Hernandez xhernandez at datalab.es
Mon May 9 11:07:04 UTC 2016


I've uploaded a patch for this problem:

http://review.gluster.org/14270

Any review would be much appreciated :)

Thanks,

Xavi

On 09/05/16 12:35, Raghavendra Gowdappa wrote:
>
>
> ----- Original Message -----
>> From: "Xavier Hernandez" <xhernandez at datalab.es>
>> To: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>> Cc: "Gluster Devel" <gluster-devel at gluster.org>
>> Sent: Monday, May 9, 2016 3:07:16 PM
>> Subject: Re: [Gluster-devel] Possible bug in the communications layer ?
>>
>> Hi Raghavendra,
>>
>> I've finally found the bug. It was obvious but I didn't see it.
>
> Same here :).
>
>>
>>   1561         case SP_STATE_ACCEPTED_SUCCESS_REPLY_INIT:
>>   1562                 default_read_size = xdr_sizeof ((xdrproc_t) xdr_gfs3_read_rsp,
>>   1563                                                 &read_rsp);
>>   1564
>>   1565                 proghdr_buf = frag->fragcurrent;
>>   1566
>>   1567                 __socket_proto_init_pending (priv, default_read_size);
>>   1568
>>   1569                 frag->call_body.reply.accepted_success_state
>>   1570                         = SP_STATE_READING_PROC_HEADER;
>>   1571
>>   1572                 /* fall through */
>>   1573
>>   1574         case SP_STATE_READING_PROC_HEADER:
>>   1575                 __socket_proto_read (priv, ret);
>>   1576
>>   1577                 gf_trace_add("xdrmem_create", default_read_size, (uintptr_t)proghdr_buf);
>>   1578                 /* there can be 'xdata' in read response, figure it out */
>>   1579                 xdrmem_create (&xdr, proghdr_buf, default_read_size,
>>   1580                                XDR_DECODE);
>>   1581
>>   1582                 /* This will fail if there is xdata sent from server, if not,
>>   1583                    well and good, we don't need to worry about  */
>>   1584                 xdr_gfs3_read_rsp (&xdr, &read_rsp);
>>   1585
>>   1586                 free (read_rsp.xdata.xdata_val);
>>   1587
>>   1588                 /* need to round off to proper roof (%4), as XDR packing pads
>>   1589                    the end of opaque object with '0' */
>>   1590                 size = roof (read_rsp.xdata.xdata_len, 4);
>>   1591
>>   1592                 if (!size) {
>>   1593                         frag->call_body.reply.accepted_success_state
>>   1594                                 = SP_STATE_READ_PROC_OPAQUE;
>>   1595                         goto read_proc_opaque;
>>   1596                 }
>>   1597
>>   1598                 __socket_proto_init_pending (priv, size);
>>   1599
>>   1600                 frag->call_body.reply.accepted_success_state
>>   1601                         = SP_STATE_READING_PROC_OPAQUE;
>>
>> The main problem here is that we are using two local variables
>> (proghdr_buf and default_read_size) in two distinct states that can be
>> executed in different invocations of the function.
>>
>> The particular case that is failing is the following:
>>
>> 1. In state SP_STATE_ACCEPTED_SUCCESS_REPLY_INIT, everything is prepared
>> to read 116 bytes. default_read_size is set to 116 and proghdr_buf
>> points to the buffer where data will be written.
>>
>> 2. In state SP_STATE_READING_PROC_HEADER, a partial read of 88 bytes is
>> done. At this point the function returns and proghdr_buf and
>> default_read_size are lost.
>>
>> 3. When more data is available, this function is called again and it
>> starts executing at state SP_STATE_READING_PROC_HEADER.
>>
>> 4. The remaining 28 bytes are read.
>>
>> 5. When it checks the buffer and tries to decode it to see if there's
>> xdata present, it uses the default values of proghdr_buf and
>> default_read_size, which are 0. This causes the decode to leave
>> read_rsp.xdata.xdata_len set to 0.
>>
>> 6. The program interprets xdata_len being 0 as meaning there's no
>> xdata, so it continues reading the remainder of the RPC packet into the
>> payload buffer.
>>
>> If you want, I can send a patch for this.
>
> Yes. That would be helpful. The analysis is correct and moving the initialization of proghdr_buf to line 1578 will fix the issue. If you are too busy, please let me know and I can patch it up too :).
>
> Thanks for debugging the issue :).
>
> regards,
> Raghavendra.
>
>>
>> Xavi
>>
>> On 05/05/16 10:21, Xavier Hernandez wrote:
>>> I've undone all changes and now I'm unable to reproduce the problem, so
>>> the modification I did is probably incorrect and not the root cause, as
>>> you described.
>>>
>>> I'll continue investigating...
>>>
>>> Xavi
>>>
>>> On 04/05/16 15:01, Xavier Hernandez wrote:
>>>> On 04/05/16 14:47, Raghavendra Gowdappa wrote:
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>>> From: "Xavier Hernandez" <xhernandez at datalab.es>
>>>>>> To: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>>>>>> Cc: "Gluster Devel" <gluster-devel at gluster.org>
>>>>>> Sent: Wednesday, May 4, 2016 5:37:56 PM
>>>>>> Subject: Re: [Gluster-devel] Possible bug in the communications layer ?
>>>>>>
>>>>>> I think I've found the problem.
>>>>>>
>>>>>>   1567         case SP_STATE_READING_PROC_HEADER:
>>>>>>   1568                 __socket_proto_read (priv, ret);
>>>>>>   1569
>>>>>>   1570                 /* there can be 'xdata' in read response, figure it out */
>>>>>>   1571                 xdrmem_create (&xdr, proghdr_buf, default_read_size,
>>>>>>   1572                                XDR_DECODE);
>>>>>>   1573
>>>>>>   1574                 /* This will fail if there is xdata sent from server, if not,
>>>>>>   1575                    well and good, we don't need to worry about  */
>>>>>>   1576                 xdr_gfs3_read_rsp (&xdr, &read_rsp);
>>>>>>   1577
>>>>>>   1578                 free (read_rsp.xdata.xdata_val);
>>>>>>   1579
>>>>>>   1580                 /* need to round off to proper roof (%4), as XDR packing pads
>>>>>>   1581                    the end of opaque object with '0' */
>>>>>>   1582                 size = roof (read_rsp.xdata.xdata_len, 4);
>>>>>>   1583
>>>>>>   1584                 if (!size) {
>>>>>>   1585                         frag->call_body.reply.accepted_success_state
>>>>>>   1586                                 = SP_STATE_READ_PROC_OPAQUE;
>>>>>>   1587                         goto read_proc_opaque;
>>>>>>   1588                 }
>>>>>>   1589
>>>>>>   1590                 __socket_proto_init_pending (priv, size);
>>>>>>   1591
>>>>>>   1592                 frag->call_body.reply.accepted_success_state
>>>>>>   1593                         = SP_STATE_READING_PROC_OPAQUE;
>>>>>>   1594
>>>>>>   1595         case SP_STATE_READING_PROC_OPAQUE:
>>>>>>   1596                 __socket_proto_read (priv, ret);
>>>>>>   1597
>>>>>>   1598                 frag->call_body.reply.accepted_success_state
>>>>>>   1599                         = SP_STATE_READ_PROC_OPAQUE;
>>>>>>
>>>>>> On line 1568 we read, at most, 116 bytes because we calculate the size
>>>>>> of a read response without xdata. Then we detect that we really need
>>>>>> more data for xdata (BTW, will read_rsp.xdata.xdata_val always be
>>>>>> allocated even if xdr_gfs3_read_rsp() fails ?)
>>>>>
>>>>> No. It need not be. It's guaranteed that it is allocated only on a
>>>>> successful completion. However, _if_ decoding fails only because the
>>>>> xdr stream doesn't include the xdata bits, but xdata_len is zero (by
>>>>> initializing it to default_read_size), then the xdr library would've
>>>>> filled read_rsp.xdata.xdata_len (read_rsp.xdata.xdata_val can still be
>>>>> NULL).
>>>>
>>>> The question is: is it guaranteed that after an unsuccessful completion
>>>> xdata_val will be NULL (i.e. not touched by the function, even if
>>>> xdata_len is != 0)? Otherwise the free() could corrupt memory.
>>>>
>>>>>
>>>>>>
>>>>>> So we get into line 1596 with the pending info initialized to read the
>>>>>> remaining data. This is the __socket_proto_read macro:
>>>>>>
>>>>>>    166 /* This will be used in a switch case and breaks from the switch case if all
>>>>>>    167  * the pending data is not read.
>>>>>>    168  */
>>>>>>    169 #define __socket_proto_read(priv, ret)                                \
>>>>>>    170         {                                                             \
>>>>>>    171                 size_t bytes_read = 0;                                \
>>>>>>    172                 struct gf_sock_incoming *in;                          \
>>>>>>    173                 in = &priv->incoming;                                 \
>>>>>>    174                                                                       \
>>>>>>    175                 __socket_proto_update_pending (priv);                 \
>>>>>>    176                                                                       \
>>>>>>    177                 ret = __socket_readv (this,                           \
>>>>>>    178                                       in->pending_vector, 1,          \
>>>>>>    179                                       &in->pending_vector,            \
>>>>>>    180                                       &in->pending_count,             \
>>>>>>    181                                       &bytes_read);                   \
>>>>>>    182                 if (ret == -1)                                        \
>>>>>>    183                         break;                                        \
>>>>>>    184                 __socket_proto_update_priv_after_read (priv, ret, bytes_read); \
>>>>>>    185         }
>>>>>>
>>>>>> We read from the socket using __socket_readv(). If it fails, we quit.
>>>>>> However, if the socket doesn't have more data to read, this function
>>>>>> does not return -1:
>>>>>>
>>>>>>    555                         ret = __socket_cached_read (this, opvector, opcount);
>>>>>>    556
>>>>>>    557                         if (ret == 0) {
>>>>>>    558                                 gf_log(this->name,GF_LOG_DEBUG,"EOF on socket");
>>>>>>    559                                 errno = ENODATA;
>>>>>>    560                                 ret = -1;
>>>>>>    561                         }
>>>>>>    562                         if (ret == -1 && errno == EAGAIN) {
>>>>>>    563                                 /* done for now */
>>>>>>    564                                 break;
>>>>>>    565                         }
>>>>>>    566                         this->total_bytes_read += ret;
>>>>>>
>>>>>> If __socket_cached_read() fails with errno == EAGAIN, we break and
>>>>>> return opcount, which is >= 0, causing the process to continue instead
>>>>>> of waiting for more data.
>>>>>
>>>>> No. If you observe, there is a call to another macro,
>>>>> __socket_proto_update_priv_after_read, at line 184. As can be seen in
>>>>> the definition below, if (ret > 0) - which is the case for a short
>>>>> read - we do break. I hope I am not missing anything here :).
>>>>>
>>>>> #define __socket_proto_update_priv_after_read(priv, ret, bytes_read)   \
>>>>>         {                                                               \
>>>>>                 struct gf_sock_incoming_frag *frag;                     \
>>>>>                 frag = &priv->incoming.frag;                            \
>>>>>                                                                         \
>>>>>                 frag->fragcurrent += bytes_read;                        \
>>>>>                 frag->bytes_read += bytes_read;                         \
>>>>>                                                                         \
>>>>>                 if ((ret > 0) || (frag->remaining_size != 0)) {         \
>>>>>                         if (frag->remaining_size != 0 && ret == 0) {    \
>>>>>                                 __socket_proto_reset_pending (priv);    \
>>>>>                         }                                               \
>>>>>                                                                         \
>>>>>                         gf_log (this->name, GF_LOG_TRACE,               \
>>>>>                                 "partial read on non-blocking socket"); \
>>>>>                                                                         \
>>>>>                         break;                                          \
>>>>>                 }                                                       \
>>>>>         }
>>>>>
>>>>
>>>> I didn't see this. However, the fact is that with the change I
>>>> described, it seems to work.
>>>>
>>>> I'll analyze it a bit more.
>>>>
>>>> Xavi
>>>>
>>>>>>
>>>>>> As a side note, there's another problem here: if errno is not EAGAIN,
>>>>>> we'll update this->total_bytes_read by subtracting one. This shouldn't
>>>>>> be done when ret < 0.
>>>>>>
>>>>>> There are other places where ret is set to -1 but opcount is returned.
>>>>>> I guess that we should also set opcount = -1 in these places, but I
>>>>>> don't have deep knowledge of this implementation.
>>>>>>
>>>>>> I've done a quick test checking for (ret != 0) instead of (ret == -1)
>>>>>> in __socket_proto_read() and it seemed to work.
>>>>>>
>>>>>> Could anyone with more knowledge about the communications layer verify
>>>>>> this and explain what would be the best solution ?
>>>>>>
>>>>>> Xavi
>>>>>>
>>>>>> On 29/04/16 14:52, Xavier Hernandez wrote:
>>>>>>> With your patch applied, it seems that the bug is not hit.
>>>>>>>
>>>>>>> I guess it's a timing issue that the new logging hides. Maybe no more
>>>>>>> data is available after reading the partial readv header? (it will
>>>>>>> arrive later).
>>>>>>>
>>>>>>> I'll continue testing...
>>>>>>>
>>>>>>> Xavi
>>>>>>>
>>>>>>> On 29/04/16 13:48, Raghavendra Gowdappa wrote:
>>>>>>>> Attaching the patch.
>>>>>>>>
>>>>>>>> ----- Original Message -----
>>>>>>>>> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>>>>>>>>> To: "Xavier Hernandez" <xhernandez at datalab.es>
>>>>>>>>> Cc: "Gluster Devel" <gluster-devel at gluster.org>
>>>>>>>>> Sent: Friday, April 29, 2016 5:14:02 PM
>>>>>>>>> Subject: Re: [Gluster-devel] Possible bug in the communications
>>>>>>>>> layer ?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ----- Original Message -----
>>>>>>>>>> From: "Xavier Hernandez" <xhernandez at datalab.es>
>>>>>>>>>> To: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>>>>>>>>>> Cc: "Gluster Devel" <gluster-devel at gluster.org>
>>>>>>>>>> Sent: Friday, April 29, 2016 1:21:57 PM
>>>>>>>>>> Subject: Re: [Gluster-devel] Possible bug in the communications
>>>>>>>>>> layer ?
>>>>>>>>>>
>>>>>>>>>> Hi Raghavendra,
>>>>>>>>>>
>>>>>>>>>> yes, the readv response contains xdata. The dict length is 38
>>>>>>>>>> (0x26)
>>>>>>>>>> and, at the moment of failure, rsp.xdata.xdata_len already contains
>>>>>>>>>> 0x26.
>>>>>>>>>
>>>>>>>>> rsp.xdata.xdata_len having 0x26 even when decoding failed indicates
>>>>>>>>> that the approach used in socket.c to get the length of xdata is
>>>>>>>>> correct. However, I cannot find any way for xdata to end up in the
>>>>>>>>> payload vector other than xdata_len being zero. Just to be doubly
>>>>>>>>> sure, I have a patch containing a debug message that prints xdata_len
>>>>>>>>> when decoding fails in socket.c. Can you please apply the patch, run
>>>>>>>>> the tests and get back with the results?
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Xavi
>>>>>>>>>>
>>>>>>>>>> On 29/04/16 09:10, Raghavendra Gowdappa wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>>>>>>>>>>>> To: "Xavier Hernandez" <xhernandez at datalab.es>
>>>>>>>>>>>> Cc: "Gluster Devel" <gluster-devel at gluster.org>
>>>>>>>>>>>> Sent: Friday, April 29, 2016 12:36:43 PM
>>>>>>>>>>>> Subject: Re: [Gluster-devel] Possible bug in the communications
>>>>>>>>>>>> layer ?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>>>>>>>>>>>>> To: "Xavier Hernandez" <xhernandez at datalab.es>
>>>>>>>>>>>>> Cc: "Jeff Darcy" <jdarcy at redhat.com>, "Gluster Devel"
>>>>>>>>>>>>> <gluster-devel at gluster.org>
>>>>>>>>>>>>> Sent: Friday, April 29, 2016 12:07:59 PM
>>>>>>>>>>>>> Subject: Re: [Gluster-devel] Possible bug in the communications
>>>>>>>>>>>>> layer ?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>> From: "Xavier Hernandez" <xhernandez at datalab.es>
>>>>>>>>>>>>>> To: "Jeff Darcy" <jdarcy at redhat.com>
>>>>>>>>>>>>>> Cc: "Gluster Devel" <gluster-devel at gluster.org>
>>>>>>>>>>>>>> Sent: Thursday, April 28, 2016 8:15:36 PM
>>>>>>>>>>>>>> Subject: Re: [Gluster-devel] Possible bug in the communications
>>>>>>>>>>>>>> layer
>>>>>>>>>>>>>> ?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Jeff,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 28.04.2016 15:20, Jeff Darcy wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This happens with Gluster 3.7.11 accessed through Ganesha and
>>>>>>>>>>>>>> gfapi. The volume is a distributed-disperse 4*(4+2). I'm able
>>>>>>>>>>>>>> to reproduce the problem easily doing the following test:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   iozone -t2 -s10g -r1024k -i0 -w -F/iozone{1..2}.dat
>>>>>>>>>>>>>>   echo 3 >/proc/sys/vm/drop_caches
>>>>>>>>>>>>>>   iozone -t2 -s10g -r1024k -i1 -w -F/iozone{1..2}.dat
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The error happens soon after starting the read test. As can be
>>>>>>>>>>>>>> seen in the data below, client3_3_readv_cbk() is processing an
>>>>>>>>>>>>>> iovec of 116 bytes, however it should be 154 bytes (the buffer
>>>>>>>>>>>>>> in memory really seems to contain 154 bytes). The data on the
>>>>>>>>>>>>>> network seems ok (at least I haven't been able to identify any
>>>>>>>>>>>>>> problem), so this must be a processing error on the client
>>>>>>>>>>>>>> side. The last field in the cut buffer of the serialized data
>>>>>>>>>>>>>> corresponds to the length of the xdata field: 0x26. So at
>>>>>>>>>>>>>> least 38 more bytes should be present.
>>>>>>>>>>>>>> Nice detective work, Xavi.  It would be *very* interesting to
>>>>>>>>>>>>>> see what the value of the "count" parameter is (it's
>>>>>>>>>>>>>> unfortunately optimized out).  I'll bet it's two, and
>>>>>>>>>>>>>> iov[1].iov_len is 38.  I have a weak memory of some problems
>>>>>>>>>>>>>> with how this iov is put together, a couple of years ago, and
>>>>>>>>>>>>>> it looks like you might have tripped over one more.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It seems you are right. The count is 2 and the first 38 bytes
>>>>>>>>>>>>>> of the second vector contain the remaining data of the xdata
>>>>>>>>>>>>>> field.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is the bug. client3_3_readv_cbk (and for that matter all
>>>>>>>>>>>>> the actors/cbks) expects the response in at most two vectors:
>>>>>>>>>>>>> 1. Program header containing request or response. This is
>>>>>>>>>>>>>    subjected to decoding/encoding. This vector should point to
>>>>>>>>>>>>>    a buffer that contains the entire program header/response
>>>>>>>>>>>>>    contiguously.
>>>>>>>>>>>>> 2. If the procedure returns a payload (like a readv response or
>>>>>>>>>>>>>    a write request), the second vector contains the buffer
>>>>>>>>>>>>>    pointing to the entire (contiguous) payload. Note that this
>>>>>>>>>>>>>    payload is raw and is not subjected to encoding/decoding.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In your case, this _clean_ separation is broken, with part of
>>>>>>>>>>>>> the program header slipping into the 2nd vector supposed to
>>>>>>>>>>>>> contain read data (maybe because of rpc fragmentation). I think
>>>>>>>>>>>>> this is a bug in the socket layer. I'll update more on this.
>>>>>>>>>>>>
>>>>>>>>>>>> Does your read response include xdata too? I think the code
>>>>>>>>>>>> related to
>>>>>>>>>>>> reading xdata in readv response is a bit murky.
>>>>>>>>>>>>
>>>>>>>>>>>> <socket.c/__socket_read_accepted_successful_reply>
>>>>>>>>>>>>
>>>>>>>>>>>>         case SP_STATE_ACCEPTED_SUCCESS_REPLY_INIT:
>>>>>>>>>>>>                 default_read_size = xdr_sizeof ((xdrproc_t) xdr_gfs3_read_rsp,
>>>>>>>>>>>>                                                 &read_rsp);
>>>>>>>>>>>>
>>>>>>>>>>>>                 proghdr_buf = frag->fragcurrent;
>>>>>>>>>>>>
>>>>>>>>>>>>                 __socket_proto_init_pending (priv, default_read_size);
>>>>>>>>>>>>
>>>>>>>>>>>>                 frag->call_body.reply.accepted_success_state
>>>>>>>>>>>>                         = SP_STATE_READING_PROC_HEADER;
>>>>>>>>>>>>
>>>>>>>>>>>>                 /* fall through */
>>>>>>>>>>>>
>>>>>>>>>>>>         case SP_STATE_READING_PROC_HEADER:
>>>>>>>>>>>>                 __socket_proto_read (priv, ret);
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> By this time we've read the read response _minus_ the xdata
>>>>>>>>>>>
>>>>>>>>>>> I meant we have read the "readv response header"
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>                 /* there can be 'xdata' in read response, figure it out */
>>>>>>>>>>>>                 xdrmem_create (&xdr, proghdr_buf, default_read_size,
>>>>>>>>>>>>                                XDR_DECODE);
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> We created the xdr stream above with "default_read_size"
>>>>>>>>>>>>>>>>>> (this doesn't include xdata)
>>>>>>>>>>>>
>>>>>>>>>>>>                 /* This will fail if there is xdata sent from server, if not,
>>>>>>>>>>>>                    well and good, we don't need to worry about  */
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> What if xdata is present and decoding failed (as the length
>>>>>>>>>>>>>>>>>> of the xdr stream above - default_read_size - doesn't
>>>>>>>>>>>>>>>>>> include xdata)? Would we have a valid value in
>>>>>>>>>>>>>>>>>> read_rsp.xdata.xdata_len? This is the part I am confused
>>>>>>>>>>>>>>>>>> about. If read_rsp.xdata.xdata_len is not correct then
>>>>>>>>>>>>>>>>>> there is a possibility that xdata might not be entirely
>>>>>>>>>>>>>>>>>> present in the vector socket passes to higher layers as
>>>>>>>>>>>>>>>>>> progheader (with part or all of xdata spilling over to the
>>>>>>>>>>>>>>>>>> payload vector).
>>>>>>>>>>>>
>>>>>>>>>>>>                 xdr_gfs3_read_rsp (&xdr, &read_rsp);
>>>>>>>>>>>>
>>>>>>>>>>>>                 free (read_rsp.xdata.xdata_val);
>>>>>>>>>>>>
>>>>>>>>>>>>                 /* need to round off to proper roof (%4), as XDR packing pads
>>>>>>>>>>>>                    the end of opaque object with '0' */
>>>>>>>>>>>>                 size = roof (read_rsp.xdata.xdata_len, 4);
>>>>>>>>>>>>
>>>>>>>>>>>>                 if (!size) {
>>>>>>>>>>>>                         frag->call_body.reply.accepted_success_state
>>>>>>>>>>>>                                 = SP_STATE_READ_PROC_OPAQUE;
>>>>>>>>>>>>                         goto read_proc_opaque;
>>>>>>>>>>>>                 }
>>>>>>>>>>>>
>>>>>>>>>>>> </socket.c>
>>>>>>>>>>>>
>>>>>>>>>>>> Can you please confirm whether there was xdata in the readv
>>>>>>>>>>>> response (maybe by looking at the bricks) whose decoding failed?
>>>>>>>>>>>>
>>>>>>>>>>>> regards,
>>>>>>>>>>>> Raghavendra
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Gluster-devel mailing list
>>>>>>>>>>>> Gluster-devel at gluster.org
>>>>>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>

