[Gluster-devel] in dict.c, this gets replace by environment

Emmanuel Dreyfus manu at netbsd.org
Tue Aug 19 08:47:26 UTC 2014


On Wed, Aug 13, 2014 at 09:45:30PM -0700, Anand Avati wrote:
> Can you post a bt full?

The previously posted backtrace was not helpful, since it showed where the
crash occurred, which happened long after the corruption. After adding a lot
of log messages, I can say that the corruption always happens, whether
I use volume start or volume stop, but it only crashes sometimes in volume
stop.

Using gdb and a watchpoint, I found the place where it gets overwritten.
The bad news is that the only explanation for strdup overrunning it is
heap corruption (I checked that the copied string was indeed NUL-terminated).

Breakpoint 1, cli_quotad_clnt_rpc_init () at cli.c:550
550             rpc = cli_quotad_clnt_init (THIS, rpc_opts);
(gdb) print rpc_opts
$1 = (dict_t *) 0xbb140628
(gdb) watch *0xbb140628
(gdb) c
Continuing.
Watchpoint 2: *0xbb140628

Old value = 0
New value = 1045712469
0xbb3b4ff4 in memcpy () from /usr/lib/libc.so.12
(gdb) bt
#0  0xbb3b4ff4 in memcpy () from /usr/lib/libc.so.12
#1  0xbb39f182 in strdup () from /usr/lib/libc.so.12
#2  0x08051201 in cli_cmd_tokens_fill (tokens=0xbb1980e0, 
    template=0x80907d4 "volume create <NEW-VOLNAME> [stripe <COUNT>] [replica <COUNT>] [disperse [<COUNT>]] [redundancy <COUNT>] [transport <tcp|rdma|tcp,rdma>] <NEW-BRICK>... [force]") at registry.c:207
#3  0x080512fd in cli_cmd_tokenize (
    template=0x80907d4 "volume create <NEW-VOLNAME> [stripe <COUNT>] [replica <COUNT>] [disperse [<COUNT>]] [redundancy <COUNT>] [transport <tcp|rdma|tcp,rdma>] <NEW-BRICK>... [force]") at registry.c:243
#4  0x08051619 in cli_cmd_register (tree=0xbf7feb98, cmd=0x809e554)
    at registry.c:390
#5  0x08057c3b in cli_cmd_volume_register (state=0xbf7feb88)
    at cli-cmd-volume.c:2502
#6  0x08051d5e in cli_cmds_register (state=0xbf7feb88) at cli-cmd.c:218
#7  0x08050df8 in main (argc=5, argv=0xbf7fec48) at cli.c:709
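
Since the watchpoint only shows the late symptom (strdup writing into a
live dict_t), one way to narrow down the original overrun could be to put
guard bytes behind each allocation and check them at intervals. The
following is only a minimal sketch under that assumption: guard_malloc,
guard_check and guard_free are hypothetical helpers, not GlusterFS or libc
functions, and they would only catch writes that run past the end of a
buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helpers for illustration; not part of GlusterFS or libc. */
#define GUARD_LEN  8
#define GUARD_BYTE 0xA5

/* Header kept in front of the user buffer; the union keeps the user
 * pointer suitably aligned for any type. */
typedef union {
    size_t      size;
    max_align_t align;
} guard_hdr;

/* Allocate `size` bytes followed by GUARD_LEN guard bytes. */
void *guard_malloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(guard_hdr) - GUARD_LEN)
        return NULL;
    guard_hdr *h = malloc(sizeof(*h) + size + GUARD_LEN);
    if (h == NULL)
        return NULL;
    h->size = size;
    unsigned char *user = (unsigned char *)(h + 1);
    memset(user + size, GUARD_BYTE, GUARD_LEN);
    return user;
}

/* Return 1 if the guard bytes behind the buffer are intact, 0 otherwise. */
int guard_check(const void *ptr)
{
    assert(ptr != NULL);
    const guard_hdr *h = (const guard_hdr *)ptr - 1;
    const unsigned char *tail = (const unsigned char *)ptr + h->size;
    for (size_t i = 0; i < GUARD_LEN; i++) {
        if (tail[i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}

/* Release a buffer obtained from guard_malloc. */
void guard_free(void *ptr)
{
    if (ptr != NULL)
        free((guard_hdr *)ptr - 1);
}
```

Calling guard_check on suspect buffers before and after each step (for
example around the option parsing) would show between which two points the
guard bytes get damaged, which is closer to the culprit than the eventual
crash.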

Another approach is to link with libefence. With it, the crash is very
reliable, but there is no hint about where the problem really is:

#0  memalign (alignment=4, userSize=120) at efence.c:509
#1  0xbb3d71d0 in malloc (size=120) at efence.c:833
#2  0xbb3d7738 in calloc (nelem=1, elsize=120) at efence.c:848
#3  0xbb799294 in __gf_calloc (nmemb=1, size=88, type=48, 
    typestr=0xbb7d8dcd "gf_common_mt_mem_pool") at mem-pool.c:114
#4  0xbb799cd0 in mem_get (mem_pool=0xbb239fa4) at mem-pool.c:434
#5  0xbb799b24 in mem_get0 (mem_pool=0xbb239fa4) at mem-pool.c:368
#6  0xbb75a500 in get_new_dict_full (size_hint=1) at dict.c:50
#7  0xbb75a60b in dict_new () at dict.c:101
#8  0x080508d1 in cli_quotad_clnt_rpc_init () at cli.c:531
#9  0x08050def in main (argc=4, argv=0xbf7fec58) at cli.c:705
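
For context, the general idea behind a guard-page allocator such as
Electric Fence is to place each buffer so that it ends right before an
inaccessible page, so an overrun faults on the offending write itself. The
sketch below illustrates that idea only; fence_alloc and fence_free are
hypothetical names, this is not the efence.c code, and alignment of the
returned pointer is ignored for simplicity.

```c
#define _DEFAULT_SOURCE /* glibc: expose MAP_ANON under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: return `size` bytes whose last byte sits directly
 * before a PROT_NONE page. The pointer is NOT aligned in general. */
void *fence_alloc(size_t size)
{
    assert(size > 0);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t data_pages = (size + page - 1) / page;
    size_t total = (data_pages + 1) * page;  /* data pages + one guard page */

    unsigned char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANON, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    unsigned char *guard = base + data_pages * page;
    if (mprotect(guard, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }
    return guard - size;  /* buffer ends exactly where the guard page begins */
}

/* Release a buffer from fence_alloc; the caller must pass the same size. */
void fence_free(void *ptr, size_t size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t data_pages = (size + page - 1) / page;
    unsigned char *guard = (unsigned char *)ptr + size;
    munmap(guard - data_pages * page, (data_pages + 1) * page);
}
```

With such a layout, writing to the byte at offset `size` of the returned
buffer would raise SIGSEGV immediately, so the backtrace would point at the
write that overruns rather than at a later victim.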




-- 
Emmanuel Dreyfus
manu at netbsd.org

