[Gluster-devel] NetBSD regressions, memory corruption

Venky Shankar yknev.shankar at gmail.com
Wed Mar 25 09:51:28 UTC 2015


Looks like the iobref (and the iobuf) was allocated in protocol/server.

(gdb) x/16x (ie->ie_iobref->iobrefs - 8)
0xbb11a438:     0xbb18ba80      0x00000001      0x00000068      0x00000040
0xbb11a448:     0xbb1e2018      0xcafebabe      0x00000000      0x00000000
0xbb11a458:     0x00000003      0x00000003      0x00000008      0x00000003
0xbb11a468:     0x0000000c      0x00000003      0x0000000e      0x00000003

Right before the magic header (0xcafebabe) lives a pointer to the xlator
("this") that invoked GF_MALLOC (0xbb1e2018 at 0xbb11a448 above). Here it's:

(gdb) p *(xlator_t *)0xbb1e2018
$9 = {name = 0xbb1dbb08 "patchy-server", type = 0xbb1dbb38
"protocol/server", next = 0xbb1e1018, prev = 0x0, parents = 0x0,
  children = 0xbb1dbbc8, options = 0xbb18a028, dlhandle = 0xb9b7d000,
fops = 0xb9adf0e0 <fops>, cbks = 0xb9adc8cc <cbks>,
  dumpops = 0xb9ade460 <dumpops>, volume_options = {next = 0xbb1dbb68,
prev = 0xbb1dbbf8}, fini = 0xb9ab539d <fini>,
  init = 0xb9ab48a5 <init>, reconfigure = 0xb9ab418c <reconfigure>,
mem_acct_init = 0xb9ab3cb1 <mem_acct_init>,
  notify = 0xb9ab53a3 <notify>, loglevel = GF_LOG_NONE, latencies =
{{min = 0, max = 0, total = 0, std = 0, mean = 0,
      count = 0} <repeats 50 times>}, history = 0x0, ctx = 0xbb109000,
graph = 0xbb1c30f8, itable = 0x0,
  init_succeeded = 1 '\001', private = 0xbb1e3018, mem_acct =
{num_types = 144, rec = 0xbb1c6000}, winds = 0,
  switched = 0 '\000', local_pool = 0x0, is_autoloaded = _gf_false}

Looking into it more. If the above rings a bell for anyone, let us know.

-venky

On Tue, Mar 24, 2015 at 11:28 PM, Niels de Vos <ndevos at redhat.com> wrote:
> On Tue, Mar 24, 2015 at 05:18:44PM +0000, Emmanuel Dreyfus wrote:
>> Hi
>>
>> The merge of http://review.gluster.org/9953/ removed a few crashes from
>> NetBSD regression tests, but the thing remains utterly broken since the
>> merge of http://review.gluster.org/9708/, though I cannot tell if I have
>> bugs left over from this commit or if I face new problems.
>>
>> Here are the known problems so far:
>
> ...snip! I'll only give some info to your 2nd point.
>
>> 2) I still experience memory corruption, which usually crashes glusterfsd
>> because some pointer was replaced by the value 0x3. This strikes on iobref
>> most of the time, but it can happen elsewhere.
>>
>> I would be glad if someone could help here. On nbslave70:/autobuild I
>> added code to check for iobref/iobuf sanity at random places (by calling
>> iobref_sanity()). I do this in synctask_wrap and in STACK_WIND/UNWIND,
>> but I have not been able to spot the source of the problem yet.
>>
>> The weird thing is that memory seems to always be overwritten by the
>> same values, and the 0xcafebabe magic number before the buffer is preserved.
>> Here is an example, where iobref->iobrefs = 0xbb11a458:
>> 0xbb11a44c:     0xcafebabe      0x00000000      0x00000000      0x00000003
>> 0xbb11a45c:     0x00000003      0x00000008      0x00000003      0x0000000c
>> 0xbb11a46c:     0x00000003      0x0000000e      0x00000003      0x00000010
>> 0xbb11a47c:     0x00000003      0x00000009      0x00000003      0x0000000d
>> 0xbb11a48c:     0x00000003      0x00000015      0x00000003      0x00000016
>> 0xbb11a49c:     0x00000003      0x00000032      0x00000034      0xbb1e2018
>> 0xbb11a4ac:     0xcafebabe      0x00000000      0x00000000      0xbb11a5d8
>
> Recently I was looking into something that involved some more
> understanding of GF_MALLOC(). I did not really continue with it because
> other things got a higher priority. But, maybe this layout helps you a
> little:
>
>      :                      :
>      :                      :
>      +----------------------+
>      | GF_MEM_TRAILER_MAGIC |
>      +----------------------+
>      |                      |
>      |         ...          |
>      |                      |
>      +----------------------+
>      |       8 bytes        |
>      +----------------------+
>      | GF_MEM_HEADER_MAGIC  |
>      +----------------------+
>      |      *xlator_t       |
>      +----------------------+
>      |        size          |
>      +----------------------+
>      |        type          |
>      +----------------------+
>      :                      :
>      :                      :
>
>      #define GF_MEM_HEADER_MAGIC  0xCAFEBABE
>      #define GF_MEM_TRAILER_MAGIC 0xBAADF00D
>
>
> Because there is no 0xbaadf00d in your memory dump, I would assume that
> the memory has just been allocated, and the 0xcafebabe at 0xbb11a4ac is
> a leftover from a previous allocation.
>
> You could try to run a test with stricter memory checking. All the
> GF_ASSERT() calls will actually call abort() in that case, and it may
> make things a little easier to debug. You would pass --enable-debug to
> the configure command line:
>
>     $ ./configure --enable-debug
>
> I hope that we will be able to set up scheduled automated regression
> tests with binaries built with --enable-debug. It may be helpful to catch
> unintended NULL usage a little earlier.
>
> HTH,
> Niels
