[Gluster-devel] NetBSD regressions, memory corruption
Venky Shankar
yknev.shankar at gmail.com
Wed Mar 25 09:51:28 UTC 2015
Looks like the iobref (and the iobuf) was allocated in protocol/server.
(gdb) x/16x (ie->ie_iobref->iobrefs - 8)
0xbb11a438: 0xbb18ba80 0x00000001 0x00000068 0x00000040
0xbb11a448: 0xbb1e2018 0xcafebabe 0x00000000 0x00000000
0xbb11a458: 0x00000003 0x00000003 0x00000008 0x00000003
0xbb11a468: 0x0000000c 0x00000003 0x0000000e 0x00000003
Right before the magic header (0xcafebabe) lives the pointer to the xlator
("this") that invoked GF_MALLOC. Here it is:
(gdb) p *(xlator_t *)0xbb1e2018
$9 = {name = 0xbb1dbb08 "patchy-server", type = 0xbb1dbb38
"protocol/server", next = 0xbb1e1018, prev = 0x0, parents = 0x0,
children = 0xbb1dbbc8, options = 0xbb18a028, dlhandle = 0xb9b7d000,
fops = 0xb9adf0e0 <fops>, cbks = 0xb9adc8cc <cbks>,
dumpops = 0xb9ade460 <dumpops>, volume_options = {next = 0xbb1dbb68,
prev = 0xbb1dbbf8}, fini = 0xb9ab539d <fini>,
init = 0xb9ab48a5 <init>, reconfigure = 0xb9ab418c <reconfigure>,
mem_acct_init = 0xb9ab3cb1 <mem_acct_init>,
notify = 0xb9ab53a3 <notify>, loglevel = GF_LOG_NONE, latencies =
{{min = 0, max = 0, total = 0, std = 0, mean = 0,
count = 0} <repeats 50 times>}, history = 0x0, ctx = 0xbb109000,
graph = 0xbb1c30f8, itable = 0x0,
init_succeeded = 1 '\001', private = 0xbb1e3018, mem_acct =
{num_types = 144, rec = 0xbb1c6000}, winds = 0,
switched = 0 '\000', local_pool = 0x0, is_autoloaded = _gf_false}
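
To make the header reading above reproducible, here is a minimal sketch that decodes the words in front of the buffer from the x/16x output. It assumes a 32-bit build where type, size, xlator pointer and magic are 4 bytes each and are followed by 8 padding bytes; that layout is inferred from this dump and from the diagram quoted below, not taken from the GlusterFS headers. The names decode_header, word_at and struct decoded_header are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Words copied from the x/16x output above, starting at 0xbb11a438. */
#define DUMP_BASE  0xbb11a438u
#define DUMP_WORDS 16u

static const uint32_t dump[DUMP_WORDS] = {
    0xbb18ba80u, 0x00000001u, 0x00000068u, 0x00000040u,
    0xbb1e2018u, 0xcafebabeu, 0x00000000u, 0x00000000u,
    0x00000003u, 0x00000003u, 0x00000008u, 0x00000003u,
    0x0000000cu, 0x00000003u, 0x0000000eu, 0x00000003u,
};

/* Assumed 32-bit header fields, in increasing address order. */
struct decoded_header {
    uint32_t type;   /* allocation type            */
    uint32_t size;   /* requested size in bytes    */
    uint32_t xlator; /* address of the xlator_t    */
    uint32_t magic;  /* expected to be 0xcafebabe  */
};

/* Return the 32-bit word stored at a target address inside the dump. */
static uint32_t word_at(uint32_t addr)
{
    assert(addr >= DUMP_BASE);
    assert((addr - DUMP_BASE) % 4u == 0u);
    assert((addr - DUMP_BASE) / 4u < DUMP_WORDS);
    return dump[(addr - DUMP_BASE) / 4u];
}

/* Decode the header in front of a buffer address (assumed layout:
 * 8 padding bytes, then magic, xlator, size, type going downwards). */
static struct decoded_header decode_header(uint32_t user_ptr)
{
    struct decoded_header h;

    h.magic  = word_at(user_ptr - 12u);
    h.xlator = word_at(user_ptr - 16u);
    h.size   = word_at(user_ptr - 20u);
    h.type   = word_at(user_ptr - 24u);
    return h;
}
```

Calling decode_header(0xbb11a458) on this dump gives magic 0xcafebabe, xlator 0xbb1e2018, size 0x40 (64 bytes, room for sixteen 4-byte pointers) and type 0x68.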
Looking into it more. If the above rings a bell for anyone, let us know.
-venky
On Tue, Mar 24, 2015 at 11:28 PM, Niels de Vos <ndevos at redhat.com> wrote:
> On Tue, Mar 24, 2015 at 05:18:44PM +0000, Emmanuel Dreyfus wrote:
>> Hi
>>
>> The merge of http://review.gluster.org/9953/ removed a few crashes from
>> NetBSD regression tests, but the thing has remained utterly broken since the
>> merge of http://review.gluster.org/9708/, though I cannot tell whether I have
>> bugs left over from this commit or whether I face new problems.
>>
>> Here are the known problems so far:
>
> ...snip! I'll only give some info to your 2nd point.
>
>> 2) I still experience memory corruption, which usually crashes glusterfsd
>> because some pointer was replaced by the value 0x3. This strikes iobref
>> most of the time, but it can happen elsewhere.
>>
>> I would be glad if someone could help here. On nbslave70:/autobuild I
>> added code to check iobref/iobuf sanity at random places (by calling
>> iobref_sanity()). I do this in synctask_wrap and in STACK_WIND/UNWIND,
>> but I have not been able to spot the source of the problem yet.
>>
>> The weird thing is that memory always seems to be overwritten by the
>> same values, and the magic number 0xcafebabe before the buffer is preserved.
>> Here is an example, where iobref->iobrefs = 0xbb11a458:
>> 0xbb11a44c: 0xcafebabe 0x00000000 0x00000000 0x00000003
>> 0xbb11a45c: 0x00000003 0x00000008 0x00000003 0x0000000c
>> 0xbb11a46c: 0x00000003 0x0000000e 0x00000003 0x00000010
>> 0xbb11a47c: 0x00000003 0x00000009 0x00000003 0x0000000d
>> 0xbb11a48c: 0x00000003 0x00000015 0x00000003 0x00000016
>> 0xbb11a49c: 0x00000003 0x00000032 0x00000034 0xbb1e2018
>> 0xbb11a4ac: 0xcafebabe 0x00000000 0x00000000 0xbb11a5d8
>
> Recently I was looking into something that required a better
> understanding of GF_MALLOC(). I did not really continue with it because
> other things got a higher priority. But maybe this layout helps you a
> little:
>
> :                      :
> :                      :
> +----------------------+
> | GF_MEM_TRAILER_MAGIC |
> +----------------------+
> |                      |
> |         ...          |
> |                      |
> +----------------------+
> |       8 bytes        |
> +----------------------+
> | GF_MEM_HEADER_MAGIC  |
> +----------------------+
> |      *xlator_t       |
> +----------------------+
> |         size         |
> +----------------------+
> |         type         |
> +----------------------+
> :                      :
> :                      :
>
> #define GF_MEM_HEADER_MAGIC 0xCAFEBABE
> #define GF_MEM_TRAILER_MAGIC 0xBAADF00D
>
>
> Because there is no 0xbaadf00d in your memory dump, I would assume that
> the memory has just been allocated, and the 0xcafebabe at 0xbb11a4ac is
> a leftover from a previous allocation.
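
The layout above suggests a simple check: if the size field holds the requested size, the trailer magic should sit that many bytes after the pointer handed back to the caller. The following is a toy allocator that mimics that layout so the check can be tried in isolation. It is not the GlusterFS implementation; every toy_* name is made up for this sketch, and the field widths follow the host compiler rather than the NetBSD build.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_HEADER_MAGIC  0xCAFEBABEu
#define TOY_TRAILER_MAGIC 0xBAADF00Du

/* Toy header placed in front of the caller's data (hypothetical). */
struct toy_header {
    uint32_t type;    /* allocation type                      */
    size_t   size;    /* size requested by the caller         */
    void    *owner;   /* stands in for the xlator_t pointer   */
    uint32_t magic;   /* TOY_HEADER_MAGIC                     */
    char     pad[8];  /* the "8 bytes" box in the diagram     */
};

/* Allocate header + data + trailer and return a pointer to the data. */
static void *toy_malloc(void *owner, uint32_t type, size_t size)
{
    uint32_t trailer = TOY_TRAILER_MAGIC;
    struct toy_header *h = malloc(sizeof(*h) + size + sizeof(trailer));
    char *data;

    if (h == NULL)
        return NULL;
    h->type = type;
    h->size = size;
    h->owner = owner;
    h->magic = TOY_HEADER_MAGIC;
    memset(h->pad, 0, sizeof(h->pad));

    data = (char *)(h + 1);
    /* The trailer may be unaligned, so copy it byte-wise. */
    memcpy(data + size, &trailer, sizeof(trailer));
    return data;
}

/* Return 0 if both magics are intact, -1 if the header magic is
 * damaged, -2 if the trailer magic is damaged. */
static int toy_check(const void *ptr)
{
    const struct toy_header *h = (const struct toy_header *)ptr - 1;
    uint32_t trailer;

    if (h->magic != TOY_HEADER_MAGIC)
        return -1;
    memcpy(&trailer, (const char *)ptr + h->size, sizeof(trailer));
    return trailer == TOY_TRAILER_MAGIC ? 0 : -2;
}

/* Release a block obtained from toy_malloc(). */
static void toy_free(void *ptr)
{
    if (ptr != NULL)
        free((struct toy_header *)ptr - 1);
}
```

With a 16-slot array allocated through toy_malloc(), writing to slot 16 lands on the trailer, and toy_check() then returns -2 while the header magic in front of the buffer stays intact.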
>
> You could try to run a test with stricter memory checking. All the
> GF_ASSERT() calls will actually call abort() in that case, which may
> make things a little easier to debug. To do so, pass --enable-debug to
> the configure command line:
>
> $ ./configure --enable-debug
>
> I hope that we will be able to set up scheduled automated regression
> tests with binaries built with --enable-debug. That may help catch
> unintended NULL usage a little earlier.
>
> HTH,
> Niels