[Gluster-devel] Possible bug in the communications layer ?

Fri Apr 29 12:52:41 UTC 2016

With your patch applied, it seems that the bug is not hit.

I guess it's a timing issue that the new logging hides. Maybe no more 
data available after reading the partial readv header ? (it will arrive 
later).

I'll continue testing...

Xavi

On 29/04/16 13:48, Raghavendra Gowdappa wrote:
> Attaching the patch.
>
> ----- Original Message -----
>> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>> To: "Xavier Hernandez" <xhernandez at datalab.es>
>> Cc: "Gluster Devel" <gluster-devel at gluster.org>
>> Sent: Friday, April 29, 2016 5:14:02 PM
>> Subject: Re: [Gluster-devel] Possible bug in the communications layer ?
>>
>>
>>
>> ----- Original Message -----
>>> From: "Xavier Hernandez" <xhernandez at datalab.es>
>>> To: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>>> Cc: "Gluster Devel" <gluster-devel at gluster.org>
>>> Sent: Friday, April 29, 2016 1:21:57 PM
>>> Subject: Re: [Gluster-devel] Possible bug in the communications layer ?
>>>
>>> Hi Raghavendra,
>>>
>>> yes, the readv response contains xdata. The dict length is 38 (0x26)
>>> and, at the moment of failure, rsp.xdata.xdata_len already contains 0x26.
>>
>> rsp.xdata.xdata_len having 0x26 even when decoding failed indicates that the
>> approach used in socket.c to get the length of xdata is correct. However, I
>> cannot find any other way of xdata going into payload vector other than
>> xdata_len being zero. Just to be double sure, I've a patch containing debug
>> message printing xdata_len when decoding fails in socket.c. Can you please
>> apply the patch, run the tests and revert back with results?
>>
>>>
>>> Xavi
>>>
>>> On 29/04/16 09:10, Raghavendra Gowdappa wrote:
>>>>
>>>>
>>>> ----- Original Message -----
>>>>> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>>>>> To: "Xavier Hernandez" <xhernandez at datalab.es>
>>>>> Cc: "Gluster Devel" <gluster-devel at gluster.org>
>>>>> Sent: Friday, April 29, 2016 12:36:43 PM
>>>>> Subject: Re: [Gluster-devel] Possible bug in the communications layer ?
>>>>>
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>>> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>>>>>> To: "Xavier Hernandez" <xhernandez at datalab.es>
>>>>>> Cc: "Jeff Darcy" <jdarcy at redhat.com>, "Gluster Devel"
>>>>>> <gluster-devel at gluster.org>
>>>>>> Sent: Friday, April 29, 2016 12:07:59 PM
>>>>>> Subject: Re: [Gluster-devel] Possible bug in the communications layer ?
>>>>>>
>>>>>>
>>>>>>
>>>>>> ----- Original Message -----
>>>>>>> From: "Xavier Hernandez" <xhernandez at datalab.es>
>>>>>>> To: "Jeff Darcy" <jdarcy at redhat.com>
>>>>>>> Cc: "Gluster Devel" <gluster-devel at gluster.org>
>>>>>>> Sent: Thursday, April 28, 2016 8:15:36 PM
>>>>>>> Subject: Re: [Gluster-devel] Possible bug in the communications layer
>>>>>>> ?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hi Jeff,
>>>>>>>
>>>>>>> On 28.04.2016 15:20, Jeff Darcy wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> This happens with Gluster 3.7.11 accessed through Ganesha and gfapi.
>>>>>>> The
>>>>>>> volume is a distributed-disperse 4*(4+2). I'm able to reproduce the
>>>>>>> problem
>>>>>>> easily doing the following test: iozone -t2 -s10g -r1024k -i0 -w
>>>>>>> -F/iozone{1..2}.dat echo 3 >/proc/sys/vm/drop_caches iozone -t2 -s10g
>>>>>>> -r1024k -i1 -w -F/iozone{1..2}.dat The error happens soon after
>>>>>>> starting
>>>>>>> the
>>>>>>> read test. As can be seen in the data below, client3_3_readv_cbk() is
>>>>>>> processing an iovec of 116 bytes, however it should be of 154 bytes
>>>>>>> (the
>>>>>>> buffer in memory really seems to contain 154 bytes). The data on the
>>>>>>> network
>>>>>>> seems ok (at least I haven't been able to identify any problem), so
>>>>>>> this
>>>>>>> must be a processing error on the client side. The last field in cut
>>>>>>> buffer
>>>>>>> of the sequentialized data corresponds to the length of the xdata
>>>>>>> field:
>>>>>>> 0x26. So at least 38 more byte should be present.
>>>>>>> Nice detective work, Xavi.  It would be *very* interesting to see what
>>>>>>> the value of the "count" parameter is (it's unfortunately optimized
>>>>>>> out).
>>>>>>> I'll bet it's two, and iov[1].iov_len is 38.  I have a weak memory of
>>>>>>> some problems with how this iov is put together, a couple of years
>>>>>>> ago,
>>>>>>> and it looks like you might have tripped over one more.
>>>>>>> It seems you are right. The count is 2 and the first 38 bytes of the
>>>>>>> second
>>>>>>> vector contains the remaining data of xdata field.
>>>>>>
>>>>>> This is the bug. client3_3_readv_cbk (and for that matter all the
>>>>>> actors/cbks) expects response in utmost two vectors:
>>>>>> 1. Program header containing request or response. This is subjected to
>>>>>> decoding/encoding. This vector should point to a buffer that contains
>>>>>> the
>>>>>> entire program header/response contiguously.
>>>>>> 2. If the procedure returns payload (like readv response or a write
>>>>>> request),
>>>>>> second vector contains the buffer pointing to the entire (contiguous)
>>>>>> payload. Note that this payload is raw and is not subjected to
>>>>>> encoding/decoding.
>>>>>>
>>>>>> In your case, this _clean_ separation is broken with part of program
>>>>>> header
>>>>>> slipping into 2nd vector supposed to contain read data (may be because
>>>>>> of
>>>>>> rpc fragmentation). I think this is a bug in socket layer. I'll update
>>>>>> more
>>>>>> on this.
>>>>>
>>>>> Does your read response include xdata too? I think the code related to
>>>>> reading xdata in readv response is a bit murky.
>>>>>
>>>>> <socket.c/__socket_read_accepted_successful_reply>
>>>>>
>>>>>         case SP_STATE_ACCEPTED_SUCCESS_REPLY_INIT:
>>>>>                 default_read_size = xdr_sizeof ((xdrproc_t)
>>>>>                 xdr_gfs3_read_rsp,
>>>>>                                                 &read_rsp);
>>>>>
>>>>>                 proghdr_buf = frag->fragcurrent;
>>>>>
>>>>>                 __socket_proto_init_pending (priv, default_read_size);
>>>>>
>>>>>                 frag->call_body.reply.accepted_success_state
>>>>>                         = SP_STATE_READING_PROC_HEADER;
>>>>>
>>>>>                 /* fall through */
>>>>>
>>>>>         case SP_STATE_READING_PROC_HEADER:
>>>>>                 __socket_proto_read (priv, ret);
>>>>>
>>>>>>>>>> By this time we've read read response _minus_ the xdata
>>>>
>>>> I meant we have read "readv response header"
>>>>
>>>>>
>>>>>                 /* there can be 'xdata' in read response, figure it out
>>>>>                 */
>>>>>                 xdrmem_create (&xdr, proghdr_buf, default_read_size,
>>>>>                                XDR_DECODE);
>>>>>
>>>>>>>>>>> We created xdr stream above with "default_read_size" (this doesn't
>>>>>>>>>>> include xdata)
>>>>>
>>>>>                 /* This will fail if there is xdata sent from server, if
>>>>>                 not,
>>>>>                    well and good, we don't need to worry about  */
>>>>>
>>>>>>>>>>> what if xdata is present and decoding failed (as length of xdr
>>>>>>>>>>> stream
>>>>>>>>>>> above - default_read_size - doesn't include xdata)? would we have
>>>>>>>>>>> a
>>>>>>>>>>> valid value in read_rsp.xdata.xdata_len? This is the part I am
>>>>>>>>>>> confused about. If read_rsp.xdata.xdata_len is not correct then
>>>>>>>>>>> there
>>>>>>>>>>> is a possibility that xdata might not be entirely present in the
>>>>>>>>>>> vector socket passes to higher layers as progheader (with part or
>>>>>>>>>>> entire xdata spilling over to payload vector).
>>>>>
>>>>>                 xdr_gfs3_read_rsp (&xdr, &read_rsp);
>>>>>
>>>>>                 free (read_rsp.xdata.xdata_val);
>>>>>
>>>>>                 /* need to round off to proper roof (%4), as XDR packing
>>>>>                 pads
>>>>>                    the end of opaque object with '0' */
>>>>>                 size = roof (read_rsp.xdata.xdata_len, 4);
>>>>>
>>>>>                 if (!size) {
>>>>>                         frag->call_body.reply.accepted_success_state
>>>>>                                 = SP_STATE_READ_PROC_OPAQUE;
>>>>>                         goto read_proc_opaque;
>>>>>                 }
>>>>>
>>>>> </socket.c>
>>>>>
>>>>> Can you please confirm whether there was an xdata in the readv response
>>>>> (may
>>>>> be by looking in bricks) whose decoding failed?
>>>>>
>>>>> regards,
>>>>> Raghavendra
>>>>> _______________________________________________
>>>>> Gluster-devel mailing list
>>>>> Gluster-devel at gluster.org
>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>>>
>>>
>>