[Gluster-devel] Possible bug in the communications layer ?

Raghavendra Gowdappa rgowdapp at redhat.com
Fri Apr 29 11:48:47 UTC 2016


Attaching the patch.

----- Original Message -----
> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> To: "Xavier Hernandez" <xhernandez at datalab.es>
> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> Sent: Friday, April 29, 2016 5:14:02 PM
> Subject: Re: [Gluster-devel] Possible bug in the communications layer ?
> 
> 
> 
> ----- Original Message -----
> > From: "Xavier Hernandez" <xhernandez at datalab.es>
> > To: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> > Cc: "Gluster Devel" <gluster-devel at gluster.org>
> > Sent: Friday, April 29, 2016 1:21:57 PM
> > Subject: Re: [Gluster-devel] Possible bug in the communications layer ?
> > 
> > Hi Raghavendra,
> > 
> > yes, the readv response contains xdata. The dict length is 38 (0x26)
> > and, at the moment of failure, rsp.xdata.xdata_len already contains 0x26.
> 
> rsp.xdata.xdata_len having 0x26 even when decoding failed indicates that the
> approach used in socket.c to get the length of xdata is correct. However, I
> cannot find any other way of xdata going into payload vector other than
> xdata_len being zero. Just to be double sure, I've a patch containing debug
> message printing xdata_len when decoding fails in socket.c. Can you please
> apply the patch, run the tests and revert back with results?
> 
> > 
> > Xavi
> > 
> > On 29/04/16 09:10, Raghavendra Gowdappa wrote:
> > >
> > >
> > > ----- Original Message -----
> > >> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> > >> To: "Xavier Hernandez" <xhernandez at datalab.es>
> > >> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> > >> Sent: Friday, April 29, 2016 12:36:43 PM
> > >> Subject: Re: [Gluster-devel] Possible bug in the communications layer ?
> > >>
> > >>
> > >>
> > >> ----- Original Message -----
> > >>> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> > >>> To: "Xavier Hernandez" <xhernandez at datalab.es>
> > >>> Cc: "Jeff Darcy" <jdarcy at redhat.com>, "Gluster Devel"
> > >>> <gluster-devel at gluster.org>
> > >>> Sent: Friday, April 29, 2016 12:07:59 PM
> > >>> Subject: Re: [Gluster-devel] Possible bug in the communications layer ?
> > >>>
> > >>>
> > >>>
> > >>> ----- Original Message -----
> > >>>> From: "Xavier Hernandez" <xhernandez at datalab.es>
> > >>>> To: "Jeff Darcy" <jdarcy at redhat.com>
> > >>>> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> > >>>> Sent: Thursday, April 28, 2016 8:15:36 PM
> > >>>> Subject: Re: [Gluster-devel] Possible bug in the communications layer
> > >>>> ?
> > >>>>
> > >>>>
> > >>>>
> > >>>> Hi Jeff,
> > >>>>
> > >>>> On 28.04.2016 15:20, Jeff Darcy wrote:
> > >>>>
> > >>>>
> > >>>>
> > >>>> This happens with Gluster 3.7.11 accessed through Ganesha and gfapi.
> > >>>> The
> > >>>> volume is a distributed-disperse 4*(4+2). I'm able to reproduce the
> > >>>> problem
> > >>>> easily doing the following test: iozone -t2 -s10g -r1024k -i0 -w
> > >>>> -F/iozone{1..2}.dat echo 3 >/proc/sys/vm/drop_caches iozone -t2 -s10g
> > >>>> -r1024k -i1 -w -F/iozone{1..2}.dat The error happens soon after
> > >>>> starting
> > >>>> the
> > >>>> read test. As can be seen in the data below, client3_3_readv_cbk() is
> > >>>> processing an iovec of 116 bytes, however it should be of 154 bytes
> > >>>> (the
> > >>>> buffer in memory really seems to contain 154 bytes). The data on the
> > >>>> network
> > >>>> seems ok (at least I haven't been able to identify any problem), so
> > >>>> this
> > >>>> must be a processing error on the client side. The last field in cut
> > >>>> buffer
> > >>>> of the sequentialized data corresponds to the length of the xdata
> > >>>> field:
> > >>>> 0x26. So at least 38 more byte should be present.
> > >>>> Nice detective work, Xavi.  It would be *very* interesting to see what
> > >>>> the value of the "count" parameter is (it's unfortunately optimized
> > >>>> out).
> > >>>> I'll bet it's two, and iov[1].iov_len is 38.  I have a weak memory of
> > >>>> some problems with how this iov is put together, a couple of years
> > >>>> ago,
> > >>>> and it looks like you might have tripped over one more.
> > >>>> It seems you are right. The count is 2 and the first 38 bytes of the
> > >>>> second
> > >>>> vector contains the remaining data of xdata field.
> > >>>
> > >>> This is the bug. client3_3_readv_cbk (and for that matter all the
> > >>> actors/cbks) expects response in utmost two vectors:
> > >>> 1. Program header containing request or response. This is subjected to
> > >>> decoding/encoding. This vector should point to a buffer that contains
> > >>> the
> > >>> entire program header/response contiguously.
> > >>> 2. If the procedure returns payload (like readv response or a write
> > >>> request),
> > >>> second vector contains the buffer pointing to the entire (contiguous)
> > >>> payload. Note that this payload is raw and is not subjected to
> > >>> encoding/decoding.
> > >>>
> > >>> In your case, this _clean_ separation is broken with part of program
> > >>> header
> > >>> slipping into 2nd vector supposed to contain read data (may be because
> > >>> of
> > >>> rpc fragmentation). I think this is a bug in socket layer. I'll update
> > >>> more
> > >>> on this.
> > >>
> > >> Does your read response include xdata too? I think the code related to
> > >> reading xdata in readv response is a bit murky.
> > >>
> > >> <socket.c/__socket_read_accepted_successful_reply>
> > >>
> > >>         case SP_STATE_ACCEPTED_SUCCESS_REPLY_INIT:
> > >>                 default_read_size = xdr_sizeof ((xdrproc_t)
> > >>                 xdr_gfs3_read_rsp,
> > >>                                                 &read_rsp);
> > >>
> > >>                 proghdr_buf = frag->fragcurrent;
> > >>
> > >>                 __socket_proto_init_pending (priv, default_read_size);
> > >>
> > >>                 frag->call_body.reply.accepted_success_state
> > >>                         = SP_STATE_READING_PROC_HEADER;
> > >>
> > >>                 /* fall through */
> > >>
> > >>         case SP_STATE_READING_PROC_HEADER:
> > >>                 __socket_proto_read (priv, ret);
> > >>
> > >>>>>>> By this time we've read read response _minus_ the xdata
> > >
> > > I meant we have read "readv response header"
> > >
> > >>
> > >>                 /* there can be 'xdata' in read response, figure it out
> > >>                 */
> > >>                 xdrmem_create (&xdr, proghdr_buf, default_read_size,
> > >>                                XDR_DECODE);
> > >>
> > >>>>>>>> We created xdr stream above with "default_read_size" (this doesn't
> > >>>>>>>> include xdata)
> > >>
> > >>                 /* This will fail if there is xdata sent from server, if
> > >>                 not,
> > >>                    well and good, we don't need to worry about  */
> > >>
> > >>>>>>>> what if xdata is present and decoding failed (as length of xdr
> > >>>>>>>> stream
> > >>>>>>>> above - default_read_size - doesn't include xdata)? would we have
> > >>>>>>>> a
> > >>>>>>>> valid value in read_rsp.xdata.xdata_len? This is the part I am
> > >>>>>>>> confused about. If read_rsp.xdata.xdata_len is not correct then
> > >>>>>>>> there
> > >>>>>>>> is a possibility that xdata might not be entirely present in the
> > >>>>>>>> vector socket passes to higher layers as progheader (with part or
> > >>>>>>>> entire xdata spilling over to payload vector).
> > >>
> > >>                 xdr_gfs3_read_rsp (&xdr, &read_rsp);
> > >>
> > >>                 free (read_rsp.xdata.xdata_val);
> > >>
> > >>                 /* need to round off to proper roof (%4), as XDR packing
> > >>                 pads
> > >>                    the end of opaque object with '0' */
> > >>                 size = roof (read_rsp.xdata.xdata_len, 4);
> > >>
> > >>                 if (!size) {
> > >>                         frag->call_body.reply.accepted_success_state
> > >>                                 = SP_STATE_READ_PROC_OPAQUE;
> > >>                         goto read_proc_opaque;
> > >>                 }
> > >>
> > >> </socket.c>
> > >>
> > >> Can you please confirm whether there was an xdata in the readv response
> > >> (may
> > >> be by looking in bricks) whose decoding failed?
> > >>
> > >> regards,
> > >> Raghavendra
> > >> _______________________________________________
> > >> Gluster-devel mailing list
> > >> Gluster-devel at gluster.org
> > >> http://www.gluster.org/mailman/listinfo/gluster-devel
> > >>
> > 
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: read-xdata.diff
Type: text/x-patch
Size: 1485 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-devel/attachments/20160429/0a8eb82f/attachment-0001.bin>


More information about the Gluster-devel mailing list