[Gluster-devel] Possible bug in the communications layer ?

Raghavendra Gowdappa rgowdapp at redhat.com
Fri Apr 29 07:06:43 UTC 2016



----- Original Message -----
> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> To: "Xavier Hernandez" <xhernandez at datalab.es>
> Cc: "Jeff Darcy" <jdarcy at redhat.com>, "Gluster Devel" <gluster-devel at gluster.org>
> Sent: Friday, April 29, 2016 12:07:59 PM
> Subject: Re: [Gluster-devel] Possible bug in the communications layer ?
> 
> 
> 
> ----- Original Message -----
> > From: "Xavier Hernandez" <xhernandez at datalab.es>
> > To: "Jeff Darcy" <jdarcy at redhat.com>
> > Cc: "Gluster Devel" <gluster-devel at gluster.org>
> > Sent: Thursday, April 28, 2016 8:15:36 PM
> > Subject: Re: [Gluster-devel] Possible bug in the communications layer ?
> > 
> > 
> > 
> > Hi Jeff,
> > 
> > On 28.04.2016 15:20, Jeff Darcy wrote:
> > 
> > 
> > 
> > This happens with Gluster 3.7.11 accessed through Ganesha and gfapi. The
> > volume is a distributed-disperse 4*(4+2). I'm able to reproduce the problem
> > easily doing the following test: iozone -t2 -s10g -r1024k -i0 -w
> > -F/iozone{1..2}.dat echo 3 >/proc/sys/vm/drop_caches iozone -t2 -s10g
> > -r1024k -i1 -w -F/iozone{1..2}.dat The error happens soon after starting
> > the
> > read test. As can be seen in the data below, client3_3_readv_cbk() is
> > processing an iovec of 116 bytes, however it should be of 154 bytes (the
> > buffer in memory really seems to contain 154 bytes). The data on the
> > network
> > seems ok (at least I haven't been able to identify any problem), so this
> > must be a processing error on the client side. The last field in cut buffer
> > of the sequentialized data corresponds to the length of the xdata field:
> > 0x26. So at least 38 more byte should be present.
> > Nice detective work, Xavi.  It would be *very* interesting to see what
> > the value of the "count" parameter is (it's unfortunately optimized out).
> > I'll bet it's two, and iov[1].iov_len is 38.  I have a weak memory of
> > some problems with how this iov is put together, a couple of years ago,
> > and it looks like you might have tripped over one more.
> > It seems you are right. The count is 2 and the first 38 bytes of the second
> > vector contains the remaining data of xdata field.
> 
> This is the bug. client3_3_readv_cbk (and for that matter all the
> actors/cbks) expects response in utmost two vectors:
> 1. Program header containing request or response. This is subjected to
> decoding/encoding. This vector should point to a buffer that contains the
> entire program header/response contiguously.
> 2. If the procedure returns payload (like readv response or a write request),
> second vector contains the buffer pointing to the entire (contiguous)
> payload. Note that this payload is raw and is not subjected to
> encoding/decoding.
> 
> In your case, this _clean_ separation is broken with part of program header
> slipping into 2nd vector supposed to contain read data (may be because of
> rpc fragmentation). I think this is a bug in socket layer. I'll update more
> on this.

Does your read response include xdata too? I think the code related to reading xdata in readv response is a bit murky.

<socket.c/__socket_read_accepted_successful_reply>

        case SP_STATE_ACCEPTED_SUCCESS_REPLY_INIT:
                default_read_size = xdr_sizeof ((xdrproc_t) xdr_gfs3_read_rsp,
                                                &read_rsp);

                proghdr_buf = frag->fragcurrent;

                __socket_proto_init_pending (priv, default_read_size);

                frag->call_body.reply.accepted_success_state
                        = SP_STATE_READING_PROC_HEADER;

                /* fall through */

        case SP_STATE_READING_PROC_HEADER:
                __socket_proto_read (priv, ret);

>>>>> By this time we've read read response _minus_ the xdata

                /* there can be 'xdata' in read response, figure it out */
                xdrmem_create (&xdr, proghdr_buf, default_read_size,
                               XDR_DECODE);

>>>>>> We created xdr stream above with "default_read_size" (this doesn't include xdata)

                /* This will fail if there is xdata sent from server, if not,                                                                                           
                   well and good, we don't need to worry about  */

>>>>>> what if xdata is present and decoding failed (as length of xdr stream above - default_read_size - doesn't include xdata)? would we have a valid value in read_rsp.xdata.xdata_len? This is the part I am confused about. If read_rsp.xdata.xdata_len is not correct then there is a possibility that xdata might not be entirely present in the vector socket passes to higher layers as progheader (with part or entire xdata spilling over to payload vector).

                xdr_gfs3_read_rsp (&xdr, &read_rsp);

                free (read_rsp.xdata.xdata_val);

                /* need to round off to proper roof (%4), as XDR packing pads                                                                                           
                   the end of opaque object with '0' */
                size = roof (read_rsp.xdata.xdata_len, 4);

                if (!size) {
                        frag->call_body.reply.accepted_success_state
                                = SP_STATE_READ_PROC_OPAQUE;
                        goto read_proc_opaque;
                }

</socket.c>

Can you please confirm whether there was an xdata in the readv response (may be by looking in bricks) whose decoding failed?

regards,
Raghavendra


More information about the Gluster-devel mailing list