[Gluster-devel] Possible bug in the communications layer ?
Raghavendra Gowdappa
rgowdapp at redhat.com
Mon May 9 10:35:54 UTC 2016
----- Original Message -----
> From: "Xavier Hernandez" <xhernandez at datalab.es>
> To: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> Sent: Monday, May 9, 2016 3:07:16 PM
> Subject: Re: [Gluster-devel] Possible bug in the communications layer ?
>
> Hi Raghavendra,
>
> I've finally found the bug. It was obvious but I didn't see it.
Same here :).
>
> 1561         case SP_STATE_ACCEPTED_SUCCESS_REPLY_INIT:
> 1562                 default_read_size = xdr_sizeof ((xdrproc_t) xdr_gfs3_read_rsp,
> 1563                                                 &read_rsp);
> 1564
> 1565                 proghdr_buf = frag->fragcurrent;
> 1566
> 1567                 __socket_proto_init_pending (priv, default_read_size);
> 1568
> 1569                 frag->call_body.reply.accepted_success_state
> 1570                         = SP_STATE_READING_PROC_HEADER;
> 1571
> 1572                 /* fall through */
> 1573
> 1574         case SP_STATE_READING_PROC_HEADER:
> 1575                 __socket_proto_read (priv, ret);
> 1576
> 1577                 gf_trace_add("xdrmem_create", default_read_size, (uintptr_t)proghdr_buf);
> 1578                 /* there can be 'xdata' in read response, figure it out */
> 1579                 xdrmem_create (&xdr, proghdr_buf, default_read_size,
> 1580                                XDR_DECODE);
> 1581
> 1582                 /* This will fail if there is xdata sent from server, if not,
> 1583                    well and good, we don't need to worry about */
> 1584                 xdr_gfs3_read_rsp (&xdr, &read_rsp);
> 1585
> 1586                 free (read_rsp.xdata.xdata_val);
> 1587
> 1588                 /* need to round off to proper roof (%4), as XDR packing pads
> 1589                    the end of opaque object with '0' */
> 1590                 size = roof (read_rsp.xdata.xdata_len, 4);
> 1591
> 1592                 if (!size) {
> 1593                         frag->call_body.reply.accepted_success_state
> 1594                                 = SP_STATE_READ_PROC_OPAQUE;
> 1595                         goto read_proc_opaque;
> 1596                 }
> 1597
> 1598                 __socket_proto_init_pending (priv, size);
> 1599
> 1600                 frag->call_body.reply.accepted_success_state
> 1601                         = SP_STATE_READING_PROC_OPAQUE;
>
> The main problem here is that we are using two local variables
> (proghdr_buf and default_read_size) in two distinct states that may be
> executed in different invocations of the function.
>
> The particular case that is failing is the following:
>
> 1. In state SP_STATE_ACCEPTED_SUCCESS_REPLY_INIT, everything is prepared
> to read 116 bytes. default_read_size is set to 116 and proghdr_buf
> points to the buffer where data will be written.
>
> 2. In state SP_STATE_READING_PROC_HEADER, a partial read of 88 bytes is
> done. At this point the function returns and proghdr_buf and
> default_read_size are lost.
>
> 3. When more data is available, this function is called again and it
> starts executing at state SP_STATE_READING_PROC_HEADER.
>
> 4. The remaining 28 bytes are read.
>
> 5. When it checks the buffer and tries to decode it to see if there's
> xdata present, it uses the default values of proghdr_buf and
> default_read_size, which are 0. This causes the decode to leave
> read_rsp.xdata.xdata_len set to 0.
>
> 6. The program interprets xdata_len being 0 as meaning that there's no
> xdata, so it continues reading the remainder of the RPC packet into the
> payload buffer.
>
> If you want, I can send a patch for this.
Yes, that would be helpful. The analysis is correct, and moving the initialization of proghdr_buf to line 1578 will fix the issue. If you are too busy, please let me know and I can patch it up too :).
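Roughly, something along these lines (just a sketch of the idea, not the final patch; it relies on the fact, visible in the __socket_proto_update_priv_after_read macro quoted further down, that frag->fragcurrent advances by the bytes read, so once the header has been fully read it points just past it):

        case SP_STATE_READING_PROC_HEADER:
                __socket_proto_read (priv, ret);

                /* Recompute these here instead of relying on the locals set
                   in SP_STATE_ACCEPTED_SUCCESS_REPLY_INIT: if we re-enter
                   the function in this state after a partial read, those
                   locals are 0 again. */
                default_read_size = xdr_sizeof ((xdrproc_t) xdr_gfs3_read_rsp,
                                                &read_rsp);
                proghdr_buf = frag->fragcurrent - default_read_size;

                /* there can be 'xdata' in read response, figure it out */
                xdrmem_create (&xdr, proghdr_buf, default_read_size,
                               XDR_DECODE);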
Thanks for debugging the issue :).
regards,
Raghavendra.
>
> Xavi
>
> On 05/05/16 10:21, Xavier Hernandez wrote:
> > I've undone all changes and now I'm unable to reproduce the problem, so
> > the modification I did is probably incorrect and not the root cause, as
> > you described.
> >
> > I'll continue investigating...
> >
> > Xavi
> >
> > On 04/05/16 15:01, Xavier Hernandez wrote:
> >> On 04/05/16 14:47, Raghavendra Gowdappa wrote:
> >>>
> >>>
> >>> ----- Original Message -----
> >>>> From: "Xavier Hernandez" <xhernandez at datalab.es>
> >>>> To: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> >>>> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> >>>> Sent: Wednesday, May 4, 2016 5:37:56 PM
> >>>> Subject: Re: [Gluster-devel] Possible bug in the communications layer ?
> >>>>
> >>>> I think I've found the problem.
> >>>>
> >>>> 1567         case SP_STATE_READING_PROC_HEADER:
> >>>> 1568                 __socket_proto_read (priv, ret);
> >>>> 1569
> >>>> 1570                 /* there can be 'xdata' in read response, figure it out */
> >>>> 1571                 xdrmem_create (&xdr, proghdr_buf, default_read_size,
> >>>> 1572                                XDR_DECODE);
> >>>> 1573
> >>>> 1574                 /* This will fail if there is xdata sent from server, if not,
> >>>> 1575                    well and good, we don't need to worry about */
> >>>> 1576                 xdr_gfs3_read_rsp (&xdr, &read_rsp);
> >>>> 1577
> >>>> 1578                 free (read_rsp.xdata.xdata_val);
> >>>> 1579
> >>>> 1580                 /* need to round off to proper roof (%4), as XDR packing pads
> >>>> 1581                    the end of opaque object with '0' */
> >>>> 1582                 size = roof (read_rsp.xdata.xdata_len, 4);
> >>>> 1583
> >>>> 1584                 if (!size) {
> >>>> 1585                         frag->call_body.reply.accepted_success_state
> >>>> 1586                                 = SP_STATE_READ_PROC_OPAQUE;
> >>>> 1587                         goto read_proc_opaque;
> >>>> 1588                 }
> >>>> 1589
> >>>> 1590                 __socket_proto_init_pending (priv, size);
> >>>> 1591
> >>>> 1592                 frag->call_body.reply.accepted_success_state
> >>>> 1593                         = SP_STATE_READING_PROC_OPAQUE;
> >>>> 1594
> >>>> 1595         case SP_STATE_READING_PROC_OPAQUE:
> >>>> 1596                 __socket_proto_read (priv, ret);
> >>>> 1597
> >>>> 1598                 frag->call_body.reply.accepted_success_state
> >>>> 1599                         = SP_STATE_READ_PROC_OPAQUE;
> >>>>
> >>>> On line 1568 we read, at most, 116 bytes because we calculate the size
> >>>> of a read response without xdata. Then we detect that we really need
> >>>> more data for xdata (BTW, will read_rsp.xdata.xdata_val always be
> >>>> allocated even if xdr_gfs3_read_rsp() fails?)
> >>>
> >>> No, it need not be. It's guaranteed to be allocated only on a successful
> >>> completion. However, _if_ decoding fails only because the xdr stream
> >>> (created with length default_read_size) doesn't include the xdata bytes,
> >>> then the xdr library would've already filled read_rsp.xdata.xdata_len
> >>> (while read_rsp.xdata.xdata_val can still be NULL).
> >>
> >> The question is: is it guaranteed that after an unsuccessful completion
> >> xdata_val will be NULL (i.e. not touched by the function, even if
> >> xdata_len is != 0)? Otherwise the free() could corrupt memory.
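As an aside, one way to make that free() unconditionally safe, whatever the xdr library does on a failed decode, is to zero-initialise the response struct before decoding. A sketch of that defensive pattern (an illustration, not a claim about what socket.c currently does):

        gfs3_read_rsp read_rsp = {0, };        /* xdata_val starts out NULL */
        XDR           xdr;

        xdrmem_create (&xdr, proghdr_buf, default_read_size, XDR_DECODE);

        /* Even if this fails part-way, read_rsp.xdata.xdata_val is either
           still NULL or points to memory the xdr library allocated, so the
           free() below never sees a garbage pointer. */
        xdr_gfs3_read_rsp (&xdr, &read_rsp);

        free (read_rsp.xdata.xdata_val);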
> >>
> >>>
> >>>>
> >>>> So we get into line 1596 with the pending info initialized to read the
> >>>> remaining data. This is the __socket_proto_read macro:
> >>>>
> >>>> 166 /* This will be used in a switch case and breaks from the switch case if all
> >>>> 167  * the pending data is not read.
> >>>> 168  */
> >>>> 169 #define __socket_proto_read(priv, ret)                                 \
> >>>> 170         {                                                              \
> >>>> 171                 size_t bytes_read = 0;                                 \
> >>>> 172                 struct gf_sock_incoming *in;                           \
> >>>> 173                 in = &priv->incoming;                                  \
> >>>> 174                                                                        \
> >>>> 175                 __socket_proto_update_pending (priv);                  \
> >>>> 176                                                                        \
> >>>> 177                 ret = __socket_readv (this,                            \
> >>>> 178                                       in->pending_vector, 1,           \
> >>>> 179                                       &in->pending_vector,             \
> >>>> 180                                       &in->pending_count,              \
> >>>> 181                                       &bytes_read);                    \
> >>>> 182                 if (ret == -1)                                         \
> >>>> 183                         break;                                         \
> >>>> 184                 __socket_proto_update_priv_after_read (priv, ret, bytes_read); \
> >>>> 185         }
> >>>>
> >>>> We read from the socket using __socket_readv(). If it fails, we quit.
> >>>> However, if the socket doesn't have more data to read, this function
> >>>> does not return -1:
> >>>>
> >>>> 555                         ret = __socket_cached_read (this, opvector, opcount);
> >>>> 556
> >>>> 557                         if (ret == 0) {
> >>>> 558                                 gf_log(this->name,GF_LOG_DEBUG,"EOF on socket");
> >>>> 559                                 errno = ENODATA;
> >>>> 560                                 ret = -1;
> >>>> 561                         }
> >>>> 562                         if (ret == -1 && errno == EAGAIN) {
> >>>> 563                                 /* done for now */
> >>>> 564                                 break;
> >>>> 565                         }
> >>>> 566                         this->total_bytes_read += ret;
> >>>>
> >>>> If __socket_cached_read() fails with errno == EAGAIN, we break and
> >>>> return opcount, which is >= 0, causing processing to continue instead
> >>>> of waiting for more data.
> >>>
> >>> No. If you observe, there is a call to another macro,
> >>> __socket_proto_update_priv_after_read, at line 184. As can be seen in
> >>> the definition below, if (ret > 0) - which is the case for a short
> >>> read - we do break. I hope I am not missing anything here :).
> >>>
> >>> #define __socket_proto_update_priv_after_read(priv, ret, bytes_read)   \
> >>>         {                                                               \
> >>>                 struct gf_sock_incoming_frag *frag;                     \
> >>>                 frag = &priv->incoming.frag;                            \
> >>>                                                                         \
> >>>                 frag->fragcurrent += bytes_read;                        \
> >>>                 frag->bytes_read += bytes_read;                         \
> >>>                                                                         \
> >>>                 if ((ret > 0) || (frag->remaining_size != 0)) {         \
> >>>                         if (frag->remaining_size != 0 && ret == 0) {    \
> >>>                                 __socket_proto_reset_pending (priv);    \
> >>>                         }                                               \
> >>>                                                                         \
> >>>                         gf_log (this->name, GF_LOG_TRACE,               \
> >>>                                 "partial read on non-blocking socket"); \
> >>>                                                                         \
> >>>                         break;                                          \
> >>>                 }                                                       \
> >>>         }
> >>>
> >>
> >> I didn't see this. However, the fact is that with the change I've
> >> described, it seems to work.
> >>
> >> I'll analyze it a bit more.
> >>
> >> Xavi
> >>
> >>>>
> >>>> As a side note, there's another problem here: if errno is not EAGAIN,
> >>>> we'll update this->total_bytes_read by subtracting one. This shouldn't
> >>>> be done when ret < 0.
> >>>>
> >>>> There are other places where ret is set to -1 but opcount is returned.
> >>>> I guess we should also set opcount = -1 in these places, but I don't
> >>>> have deep knowledge of this implementation.
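For illustration, the kind of guard being suggested here could look like this (a sketch against the snippet quoted above, not a reviewed patch; whether breaking out with opcount = -1 is the right recovery depends on the surrounding loop):

                        ret = __socket_cached_read (this, opvector, opcount);

                        if (ret == 0) {
                                gf_log (this->name, GF_LOG_DEBUG, "EOF on socket");
                                errno = ENODATA;
                                ret = -1;
                        }
                        if (ret == -1) {
                                if (errno == EAGAIN)
                                        break;          /* done for now */
                                /* hard error: don't add -1 to the byte
                                   counter, and let the caller see a failure
                                   instead of an opcount >= 0 */
                                opcount = -1;
                                break;
                        }
                        this->total_bytes_read += ret;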
> >>>>
> >>>> I've done a quick test checking for (ret != 0) instead of (ret ==
> >>>> -1) in
> >>>> __socket_proto_read() and it seemed to work.
> >>>>
> >>>> Could anyone with more knowledge about the communications layer verify
> >>>> this and explain what would be the best solution ?
> >>>>
> >>>> Xavi
> >>>>
> >>>> On 29/04/16 14:52, Xavier Hernandez wrote:
> >>>>> With your patch applied, it seems that the bug is not hit.
> >>>>>
> >>>>> I guess it's a timing issue that the new logging hides. Maybe there is
> >>>>> no more data available after reading the partial readv header? (it
> >>>>> will arrive later).
> >>>>>
> >>>>> I'll continue testing...
> >>>>>
> >>>>> Xavi
> >>>>>
> >>>>> On 29/04/16 13:48, Raghavendra Gowdappa wrote:
> >>>>>> Attaching the patch.
> >>>>>>
> >>>>>> ----- Original Message -----
> >>>>>>> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> >>>>>>> To: "Xavier Hernandez" <xhernandez at datalab.es>
> >>>>>>> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> >>>>>>> Sent: Friday, April 29, 2016 5:14:02 PM
> >>>>>>> Subject: Re: [Gluster-devel] Possible bug in the communications
> >>>>>>> layer ?
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> ----- Original Message -----
> >>>>>>>> From: "Xavier Hernandez" <xhernandez at datalab.es>
> >>>>>>>> To: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> >>>>>>>> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> >>>>>>>> Sent: Friday, April 29, 2016 1:21:57 PM
> >>>>>>>> Subject: Re: [Gluster-devel] Possible bug in the communications
> >>>>>>>> layer ?
> >>>>>>>>
> >>>>>>>> Hi Raghavendra,
> >>>>>>>>
> >>>>>>>> yes, the readv response contains xdata. The dict length is 38
> >>>>>>>> (0x26)
> >>>>>>>> and, at the moment of failure, rsp.xdata.xdata_len already contains
> >>>>>>>> 0x26.
> >>>>>>>
> >>>>>>> rsp.xdata.xdata_len having 0x26 even when decoding failed indicates
> >>>>>>> that the approach used in socket.c to get the length of xdata is
> >>>>>>> correct. However, I cannot find any way for xdata to end up in the
> >>>>>>> payload vector other than xdata_len being zero. Just to be doubly
> >>>>>>> sure, I have a patch containing a debug message that prints
> >>>>>>> xdata_len when decoding fails in socket.c. Can you please apply the
> >>>>>>> patch, run the tests and get back with the results?
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Xavi
> >>>>>>>>
> >>>>>>>> On 29/04/16 09:10, Raghavendra Gowdappa wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> ----- Original Message -----
> >>>>>>>>>> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> >>>>>>>>>> To: "Xavier Hernandez" <xhernandez at datalab.es>
> >>>>>>>>>> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> >>>>>>>>>> Sent: Friday, April 29, 2016 12:36:43 PM
> >>>>>>>>>> Subject: Re: [Gluster-devel] Possible bug in the communications
> >>>>>>>>>> layer ?
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> >>>>>>>>>>> To: "Xavier Hernandez" <xhernandez at datalab.es>
> >>>>>>>>>>> Cc: "Jeff Darcy" <jdarcy at redhat.com>, "Gluster Devel"
> >>>>>>>>>>> <gluster-devel at gluster.org>
> >>>>>>>>>>> Sent: Friday, April 29, 2016 12:07:59 PM
> >>>>>>>>>>> Subject: Re: [Gluster-devel] Possible bug in the communications
> >>>>>>>>>>> layer ?
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>>> From: "Xavier Hernandez" <xhernandez at datalab.es>
> >>>>>>>>>>>> To: "Jeff Darcy" <jdarcy at redhat.com>
> >>>>>>>>>>>> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> >>>>>>>>>>>> Sent: Thursday, April 28, 2016 8:15:36 PM
> >>>>>>>>>>>> Subject: Re: [Gluster-devel] Possible bug in the communications
> >>>>>>>>>>>> layer
> >>>>>>>>>>>> ?
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hi Jeff,
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 28.04.2016 15:20, Jeff Darcy wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> This happens with Gluster 3.7.11 accessed through Ganesha and
> >>>>>>>>>>>> gfapi. The volume is a distributed-disperse 4*(4+2). I'm able to
> >>>>>>>>>>>> reproduce the problem easily doing the following test:
> >>>>>>>>>>>>
> >>>>>>>>>>>>     iozone -t2 -s10g -r1024k -i0 -w -F /iozone{1..2}.dat
> >>>>>>>>>>>>     echo 3 > /proc/sys/vm/drop_caches
> >>>>>>>>>>>>     iozone -t2 -s10g -r1024k -i1 -w -F /iozone{1..2}.dat
> >>>>>>>>>>>>
> >>>>>>>>>>>> The error happens soon after starting the read test. As can be
> >>>>>>>>>>>> seen in the data below, client3_3_readv_cbk() is processing an
> >>>>>>>>>>>> iovec of 116 bytes, however it should be of 154 bytes (the
> >>>>>>>>>>>> buffer in memory really seems to contain 154 bytes). The data on
> >>>>>>>>>>>> the network seems ok (at least I haven't been able to identify
> >>>>>>>>>>>> any problem), so this must be a processing error on the client
> >>>>>>>>>>>> side. The last field in the cut buffer of the serialized data
> >>>>>>>>>>>> corresponds to the length of the xdata field: 0x26. So at least
> >>>>>>>>>>>> 38 more bytes should be present.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Nice detective work, Xavi. It would be *very* interesting to see
> >>>>>>>>>>>> what the value of the "count" parameter is (it's unfortunately
> >>>>>>>>>>>> optimized out). I'll bet it's two, and iov[1].iov_len is 38. I
> >>>>>>>>>>>> have a weak memory of some problems with how this iov is put
> >>>>>>>>>>>> together, a couple of years ago, and it looks like you might
> >>>>>>>>>>>> have tripped over one more.
> >>>>>>>>>>>>
> >>>>>>>>>>>> It seems you are right. The count is 2 and the first 38 bytes of
> >>>>>>>>>>>> the second vector contain the remaining data of the xdata field.
> >>>>>>>>>>>
> >>>>>>>>>>> This is the bug. client3_3_readv_cbk (and for that matter all
> >>>>>>>>>>> the actors/cbks) expects the response in at most two vectors:
> >>>>>>>>>>> 1. The program header containing the request or response. This is
> >>>>>>>>>>>    subjected to decoding/encoding. This vector should point to a
> >>>>>>>>>>>    buffer that contains the entire program header/response
> >>>>>>>>>>>    contiguously.
> >>>>>>>>>>> 2. If the procedure carries a payload (like a readv response or a
> >>>>>>>>>>>    write request), the second vector points to the entire
> >>>>>>>>>>>    (contiguous) payload. Note that this payload is raw and is not
> >>>>>>>>>>>    subjected to encoding/decoding.
> >>>>>>>>>>>
> >>>>>>>>>>> In your case, this _clean_ separation is broken, with part of the
> >>>>>>>>>>> program header slipping into the 2nd vector, which is supposed to
> >>>>>>>>>>> contain the read data (maybe because of rpc fragmentation). I
> >>>>>>>>>>> think this is a bug in the socket layer. I'll update more on this.
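To make the convention above concrete, here is a minimal sketch of a callback consuming the two vectors (an illustration only: the function name, header path and payload handling are assumptions, this is not the actual client3_3_readv_cbk code):

        #include <stdlib.h>             /* free */
        #include <sys/uio.h>            /* struct iovec */
        #include <rpc/xdr.h>            /* XDR, xdrmem_create */
        #include "glusterfs3-xdr.h"     /* gfs3_read_rsp, xdr_gfs3_read_rsp (path assumed) */

        /* Decode the header from iov[0]; hand back iov[1] as the raw payload. */
        static int
        consume_readv_reply (struct iovec *iov, int count, struct iovec *payload)
        {
                gfs3_read_rsp rsp = {0, };
                XDR           xdr;

                /* iov[0] must hold the whole gfs3_read_rsp (including xdata)
                   contiguously; it is the only part that goes through the
                   xdr decoder. */
                xdrmem_create (&xdr, iov[0].iov_base, iov[0].iov_len, XDR_DECODE);
                if (!xdr_gfs3_read_rsp (&xdr, &rsp))
                        return -1;

                /* iov[1], when present, is the raw read payload; it is used
                   as-is and never passed through the xdr decoder. */
                if (count > 1) {
                        *payload = iov[1];
                } else {
                        payload->iov_base = NULL;
                        payload->iov_len  = 0;
                }

                free (rsp.xdata.xdata_val);   /* allocated by the decoder, if any */
                return 0;
        }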
> >>>>>>>>>>
> >>>>>>>>>> Does your read response include xdata too? I think the code
> >>>>>>>>>> related to
> >>>>>>>>>> reading xdata in readv response is a bit murky.
> >>>>>>>>>>
> >>>>>>>>>> <socket.c/__socket_read_accepted_successful_reply>
> >>>>>>>>>>
> >>>>>>>>>>         case SP_STATE_ACCEPTED_SUCCESS_REPLY_INIT:
> >>>>>>>>>>                 default_read_size = xdr_sizeof ((xdrproc_t) xdr_gfs3_read_rsp,
> >>>>>>>>>>                                                 &read_rsp);
> >>>>>>>>>>
> >>>>>>>>>>                 proghdr_buf = frag->fragcurrent;
> >>>>>>>>>>
> >>>>>>>>>>                 __socket_proto_init_pending (priv, default_read_size);
> >>>>>>>>>>
> >>>>>>>>>>                 frag->call_body.reply.accepted_success_state
> >>>>>>>>>>                         = SP_STATE_READING_PROC_HEADER;
> >>>>>>>>>>
> >>>>>>>>>>                 /* fall through */
> >>>>>>>>>>
> >>>>>>>>>>         case SP_STATE_READING_PROC_HEADER:
> >>>>>>>>>>                 __socket_proto_read (priv, ret);
> >>>>>>>>>>
> >>>>>>>>>>>>>>> By this time we've read read response _minus_ the xdata
> >>>>>>>>>
> >>>>>>>>> I meant we have read "readv response header"
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>                 /* there can be 'xdata' in read response, figure it out */
> >>>>>>>>>>                 xdrmem_create (&xdr, proghdr_buf, default_read_size,
> >>>>>>>>>>                                XDR_DECODE);
> >>>>>>>>>>
> >>>>>>>>>>>>>>>> We created xdr stream above with "default_read_size" (this
> >>>>>>>>>>>>>>>> doesn't
> >>>>>>>>>>>>>>>> include xdata)
> >>>>>>>>>>
> >>>>>>>>>>                 /* This will fail if there is xdata sent from server, if not,
> >>>>>>>>>>                    well and good, we don't need to worry about */
> >>>>>>>>>>
> >>>>>>>>>>>>>>>> what if xdata is present and decoding failed (as length
> >>>>>>>>>>>>>>>> of xdr
> >>>>>>>>>>>>>>>> stream
> >>>>>>>>>>>>>>>> above - default_read_size - doesn't include xdata)?
> >>>>>>>>>>>>>>>> would we
> >>>>>>>>>>>>>>>> have
> >>>>>>>>>>>>>>>> a
> >>>>>>>>>>>>>>>> valid value in read_rsp.xdata.xdata_len? This is the part
> >>>>>>>>>>>>>>>> I am
> >>>>>>>>>>>>>>>> confused about. If read_rsp.xdata.xdata_len is not
> >>>>>>>>>>>>>>>> correct then
> >>>>>>>>>>>>>>>> there
> >>>>>>>>>>>>>>>> is a possibility that xdata might not be entirely
> >>>>>>>>>>>>>>>> present in
> >>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>> vector socket passes to higher layers as progheader (with
> >>>>>>>>>>>>>>>> part or
> >>>>>>>>>>>>>>>> entire xdata spilling over to payload vector).
> >>>>>>>>>>
> >>>>>>>>>> xdr_gfs3_read_rsp (&xdr, &read_rsp);
> >>>>>>>>>>
> >>>>>>>>>> free (read_rsp.xdata.xdata_val);
> >>>>>>>>>>
> >>>>>>>>>>                 /* need to round off to proper roof (%4), as XDR packing pads
> >>>>>>>>>>                    the end of opaque object with '0' */
> >>>>>>>>>>                 size = roof (read_rsp.xdata.xdata_len, 4);
> >>>>>>>>>>
> >>>>>>>>>>                 if (!size) {
> >>>>>>>>>>                         frag->call_body.reply.accepted_success_state
> >>>>>>>>>>                                 = SP_STATE_READ_PROC_OPAQUE;
> >>>>>>>>>>                         goto read_proc_opaque;
> >>>>>>>>>>                 }
> >>>>>>>>>>
> >>>>>>>>>> </socket.c>
> >>>>>>>>>>
> >>>>>>>>>> Can you please confirm whether there was xdata in the readv
> >>>>>>>>>> response (maybe by looking at the bricks) whose decoding failed?
> >>>>>>>>>>
> >>>>>>>>>> regards,
> >>>>>>>>>> Raghavendra
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>
>