[Gluster-devel] Possible bug in the communications layer ?

Jeff Darcy jdarcy at redhat.com
Thu Apr 28 13:20:38 UTC 2016


> This happens with Gluster 3.7.11 accessed through Ganesha and gfapi. The
> volume is a distributed-disperse 4*(4+2).
> 
> I'm able to reproduce the problem easily doing the following test:
> 
> iozone -t2 -s10g -r1024k -i0 -w -F <nfs mount>/iozone{1..2}.dat
> echo 3 >/proc/sys/vm/drop_caches
> iozone -t2 -s10g -r1024k -i1 -w -F <nfs mount>/iozone{1..2}.dat
> 
> The error happens soon after starting the read test.
> 
> As can be seen in the data below, client3_3_readv_cbk() is processing an
> iovec of 116 bytes, but it should be 154 bytes (the buffer in
> memory really seems to contain 154 bytes). The data on the network seems
> ok (at least I haven't been able to identify any problem), so this must
> be a processing error on the client side.
> 
> The last field in the truncated buffer of the serialized data corresponds to
> the length of the xdata field: 0x26. So at least 38 more bytes should be
> present.


Nice detective work, Xavi.  It would be *very* interesting to see what
the value of the "count" parameter is (it's unfortunately optimized out).
I'll bet it's two, and iov[1].iov_len is 38.  I have a weak memory of
some problems with how this iov is put together, a couple of years ago,
and it looks like you might have tripped over one more.
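
To make the guess concrete: if the reply really arrives as two segments, a
callback that only looks at iov[0] sees 116 bytes, while the whole vector
carries 116 + 38 = 154. Below is a minimal sketch of that arithmetic using
the plain POSIX struct iovec. The helper names total_iov_len() and
flatten_iov() are hypothetical, made up for illustration; this is not the
actual client3_3_readv_cbk() code, and the two-segment split is only my
assumption about what "count" would show.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical helper: total payload length across all iovec segments. */
static size_t total_iov_len(const struct iovec *iov, int count)
{
    size_t total = 0;
    for (int i = 0; i < count; i++)
        total += iov[i].iov_len;
    return total;
}

/*
 * Hypothetical helper: copy every segment into one contiguous buffer so a
 * decoder sees the full message rather than just iov[0].
 * Returns the number of bytes copied, or 0 if dst is too small.
 */
static size_t flatten_iov(const struct iovec *iov, int count,
                          void *dst, size_t dst_len)
{
    size_t need = total_iov_len(iov, count);
    if (need > dst_len)
        return 0;

    char *p = dst;
    for (int i = 0; i < count; i++) {
        memcpy(p, iov[i].iov_base, iov[i].iov_len);
        p += iov[i].iov_len;
    }
    return need;
}
```

With iov[0].iov_len = 116 and iov[1].iov_len = 38, total_iov_len(iov, 2)
gives 154 while total_iov_len(iov, 1) gives 116, which would match the
numbers in the report if decoding stops after the first segment.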

<evil> Maybe it's related to all that epoll stuff. </evil>

