[Bugs] [Bug 1484709] [geo-rep+qr]: Crashes observed at slave from qr_lookup_sbk during rename/hardlink/rebalance cases
bugzilla at redhat.com
Tue Aug 29 14:48:50 UTC 2017
https://bugzilla.redhat.com/show_bug.cgi?id=1484709
Nithya Balachandran <nbalacha at redhat.com> changed:
What      |Removed                |Added
----------------------------------------------------------------------------
Component |quick-read             |distribute
Assignee  |nbalacha at redhat.com  |bugs at gluster.org
--- Comment #4 from Nithya Balachandran <nbalacha at redhat.com> ---
RCA:
(gdb) bt
#0 0x00007f2152cebaeb in __memcpy_sse2 () from /lib64/libc.so.6
#1 0x00007f2145b9b824 in memcpy (__len=<optimized out>, __src=<optimized out>,
__dest=<optimized out>) at /usr/include/bits/string3.h:51
#2 qr_content_extract (xdata=xdata at entry=0x7f21340658c0) at quick-read.c:279
#3 0x00007f2145b9bdaa in qr_lookup_cbk (frame=0x7f212c05fff0,
cookie=<optimized out>, this=0x7f214001e4b0, op_ret=0, op_errno=117,
inode_ret=0x7f212c05bb60,
buf=0x7f212c002b58, xdata=0x7f21340658c0, postparent=0x7f212c002d88) at
quick-read.c:423
#4 0x00007f2145da7de7 in ioc_lookup_cbk (frame=0x7f212c057f00,
cookie=<optimized out>, this=<optimized out>, op_ret=<optimized out>,
op_errno=<optimized out>,
inode=0x7f212c05bb60, stbuf=0x7f212c002b58, xdata=0x7f21340658c0,
postparent=0x7f212c002d88) at io-cache.c:267
#5 0x00007f2146614627 in dht_discover_complete
(this=this at entry=0x7f2140016f40,
discover_frame=discover_frame at entry=0x7f212c061500) at dht-common.c:577
#6 0x00007f214661534a in dht_discover_cbk (frame=0x7f212c061500,
cookie=0x7f21400135b0, this=0x7f2140016f40, op_ret=<optimized out>,
op_errno=117,
inode=0x7f212c05bb60, stbuf=0x7f212c05e648, xattr=0x7f21340658c0,
postparent=0x7f212c05e6b8) at dht-common.c:700
#7 0x00007f21468c4d10 in afr_discover_done (this=<optimized out>,
frame=0x7f212c0546c0) at afr-common.c:2624
#8 afr_discover_cbk (frame=frame at entry=0x7f212c0546c0, cookie=<optimized out>,
this=<optimized out>, op_ret=<optimized out>, op_errno=<optimized out>,
inode=inode at entry=0x7f212c05bb60, buf=buf at entry=0x7f2147537920,
xdata=0x7f21400adca0, postparent=postparent at entry=0x7f2147537990) at
afr-common.c:2669
#9 0x00007f2146b012c7 in client3_3_lookup_cbk (req=<optimized out>,
iov=<optimized out>, count=<optimized out>, myframe=0x7f212c061380)
at client-rpc-fops.c:2947
#10 0x00007f21543f2840 in rpc_clnt_handle_reply
(clnt=clnt at entry=0x7f2140080400, pollin=pollin at entry=0x7f21400ee830) at
rpc-clnt.c:794
#11 0x00007f21543f2b27 in rpc_clnt_notify (trans=<optimized out>,
mydata=0x7f2140080430, event=<optimized out>, data=0x7f21400ee830) at
rpc-clnt.c:987
#12 0x00007f21543ee9e3 in rpc_transport_notify (this=this at entry=0x7f21400805a0,
event=event at entry=RPC_TRANSPORT_MSG_RECEIVED, data=data at entry=0x7f21400ee830)
at rpc-transport.c:538
#13 0x00007f2148fef3d6 in socket_event_poll_in (this=this at entry=0x7f21400805a0,
notify_handled=<optimized out>) at socket.c:2306
#14 0x00007f2148ff197c in socket_event_handler (fd=13, idx=3, gen=1,
data=0x7f21400805a0, poll_in=1, poll_out=0, poll_err=0) at socket.c:2458
#15 0x00007f21546841e6 in event_dispatch_epoll_handler (event=0x7f2147537e80,
event_pool=0x56504ff3cee0) at event-epoll.c:572
#16 event_dispatch_epoll_worker (data=0x56504ff95d60) at event-epoll.c:648
#17 0x00007f2153488e25 in start_thread () from /lib64/libpthread.so.0
#18 0x00007f2152d5534d in clone () from /lib64/libc.so.6
(gdb) f 6
#6 0x00007f214661534a in dht_discover_cbk (frame=0x7f212c061500,
cookie=0x7f21400135b0, this=0x7f2140016f40, op_ret=<optimized out>,
op_errno=117,
inode=0x7f212c05bb60, stbuf=0x7f212c05e648, xattr=0x7f21340658c0,
postparent=0x7f212c05e6b8) at dht-common.c:700
700 dht_discover_complete (this, frame);
(gdb) l
695              * which calls STACK_DESTROY (frame). In the case of "attempt_unwind",
696              * this makes sure that the thread don't call dht_frame_return, till
697              * call to dht_discover_complete is done.
698              */
699             if (attempt_unwind) {
700                     dht_discover_complete (this, frame);      <--- Here
701             }
702
703             this_call_cnt = dht_frame_return (frame);
704
dht_discover_cbk has found a file, so it attempts to unwind before all subvols
have responded; the remaining subvols may well unwind into dht_discover_cbk
after that.
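For context, dht_discover winds the same lookup to every subvol in parallel,
and each brick's reply can be delivered on a different epoll worker thread (as
the backtrace above shows), so several instances of dht_discover_cbk can run
concurrently against one frame. The standalone sketch below shows that
fan-out / early-unwind pattern; the names (discover_local, discover_cbk,
discover_complete) are invented for illustration and this is not the GlusterFS
source.

/*
 * Sketch of the fan-out / early-unwind pattern: the first reply that finds a
 * real file unwinds to the parent immediately, while the remaining replies
 * are still being processed on other threads.
 */
#include <pthread.h>
#include <stdio.h>

struct discover_local {
        pthread_mutex_t lock;
        int call_cnt;   /* replies still outstanding         */
        int unwound;    /* set once the first reply unwinds  */
};

static void discover_complete(int subvol)
{
        printf("unwinding to the parent from subvol %d\n", subvol);
}

/* one reply per thread, like dht_discover_cbk */
static void *discover_cbk(void *arg)
{
        struct discover_local *local = arg;
        static int next_subvol;
        int subvol, attempt_unwind = 0, is_last;

        pthread_mutex_lock(&local->lock);
        subvol = next_subvol++;
        /* pretend this subvol returned a real (non-linkto) file */
        if (!local->unwound) {
                local->unwound = 1;
                attempt_unwind = 1;
        }
        pthread_mutex_unlock(&local->lock);

        if (attempt_unwind)
                discover_complete(subvol);  /* may run before other replies */

        pthread_mutex_lock(&local->lock);
        is_last = (--local->call_cnt == 0);
        pthread_mutex_unlock(&local->lock);

        if (is_last)
                printf("last reply (subvol %d): frame can be destroyed\n",
                       subvol);
        return NULL;
}

int main(void)
{
        struct discover_local local = { PTHREAD_MUTEX_INITIALIZER, 3, 0 };
        pthread_t t[3];

        for (int i = 0; i < 3; i++)
                pthread_create(&t[i], NULL, discover_cbk, &local);
        for (int i = 0; i < 3; i++)
                pthread_join(t[i], NULL);
        return 0;
}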
Checking the frame for dht_discover_cbk:
(gdb) f 6
#6 0x00007f214661534a in dht_discover_cbk (frame=0x7f212c061500,
cookie=0x7f21400135b0, this=0x7f2140016f40, op_ret=<optimized out>,
op_errno=117,
inode=0x7f212c05bb60, stbuf=0x7f212c05e648, xattr=0x7f21340658c0,
postparent=0x7f212c05e6b8) at dht-common.c:700
700 dht_discover_complete (this, frame);
(gdb) p local->file_count
$6 = 3
Strangely, a file with this gfid was found on _all_ 3 children. As
local->file_count does not distinguish between data files and linkto files, it
seems there was a stale (?) linkto file on at least one subvol.
So at this point all 3 subvols have at least unwound to dht_discover_cbk.
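For reference, a linkto file is the zero-byte placeholder DHT leaves behind
(e.g. during rename or rebalance): roughly, its permission bits carry only the
sticky bit and its xattrs contain the trusted.glusterfs.dht.linkto key naming
the cached subvol. The snippet below is only a rough, self-contained
illustration of that test (not the exact check_is_linkfile code), to make
clear why file_count == 3 says nothing about how many of the three were real
data files.

#include <stdbool.h>
#include <string.h>
#include <sys/stat.h>

#define DHT_LINKTO_XATTR "trusted.glusterfs.dht.linkto"

struct lookup_reply {
        struct stat buf;            /* iatt in the real code            */
        const char *xattr_keys[8];  /* keys present in the reply xdata  */
};

bool has_xattr(const struct lookup_reply *r, const char *key)
{
        for (int i = 0; i < 8 && r->xattr_keys[i]; i++)
                if (strcmp(r->xattr_keys[i], key) == 0)
                        return true;
        return false;
}

/* linkto file: only the sticky bit set in the permission bits, plus the
 * dht linkto xattr pointing at the cached subvol */
bool is_linkto_file(const struct lookup_reply *r)
{
        return ((r->buf.st_mode & ~S_IFMT) == S_ISVTX) &&
               has_xattr(r, DHT_LINKTO_XATTR);
}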
(gdb) p local->dir_count
$7 = 0
(gdb) p is_linkfile
$8 = <optimized out>
(gdb) p local->cached_subvol
$9 = (xlator_t *) 0x7f2140014d20
(gdb) p local->hashed_subvol
$10 = (xlator_t *) 0x0
(gdb) p *local->cached_subvol
$11 = {name = 0x7f21400158c0 "Slave-replicate-1", type = 0x7f2140015a40
"cluster/replicate", instance_name = 0x0, next = 0x7f21400135b0, prev =
0x7f2140015e30,
parents = 0x7f2140018630, children = 0x7f2140015d10, options =
0x7f2140015910, dlhandle = 0x7f2140014320, fops = 0x7f2146ad91a0 <fops>,
cbks = 0x7f2146ad90e0 <cbks>, dumpops = 0x7f2146ad9140 <dumpops>,
volume_options = {next = 0x7f2140015b40, prev = 0x7f2140015b40},
fini = 0x7f21468c6ac0 <fini>, init = 0x7f21468c5e10 <init>, reconfigure =
0x7f21468c5740 <reconfigure>, mem_acct_init = 0x7f21468c5670 <mem_acct_init>,
notify = 0x7f21468c5610 <notify>, loglevel = GF_LOG_NONE, latencies = {{min =
0, max = 0, total = 0, std = 0, mean = 0, count = 0} <repeats 55 times>},
history = 0x0, ctx = 0x56504ff06010, graph = 0x7f2140003f30, itable =
0x7f214005e0a0, init_succeeded = 1 '\001', private = 0x7f214005db00,
mem_acct = 0x7f214005a0b0, winds = 0, switched = 0 '\000', local_pool =
0x7f21548f2ab8 <pools+280>, is_autoloaded = _gf_false, volfile_id = 0x0}
(gdb) p *prev
$12 = {name = 0x7f2140012740 "Slave-replicate-0", type = 0x7f2140014260
"cluster/replicate", instance_name = 0x0, next = 0x7f21400118a0, prev =
0x7f2140014d20,
parents = 0x7f2140018480, children = 0x7f2140014c00, options =
0x7f2140014130, dlhandle = 0x7f2140014320, fops = 0x7f2146ad91a0 <fops>,
cbks = 0x7f2146ad90e0 <cbks>, dumpops = 0x7f2146ad9140 <dumpops>,
volume_options = {next = 0x7f2140014a30, prev = 0x7f2140014a30},
fini = 0x7f21468c6ac0 <fini>, init = 0x7f21468c5e10 <init>, reconfigure =
0x7f21468c5740 <reconfigure>, mem_acct_init = 0x7f21468c5670 <mem_acct_init>,
notify = 0x7f21468c5610 <notify>, loglevel = GF_LOG_NONE, latencies = {{min =
0, max = 0, total = 0, std = 0, mean = 0, count = 0} <repeats 55 times>},
history = 0x0, ctx = 0x56504ff06010, graph = 0x7f2140003f30, itable =
0x7f2140065080, init_succeeded = 1 '\001', private = 0x7f2140064ae0,
mem_acct = 0x7f2140061090, winds = 0, switched = 0 '\000', local_pool =
0x7f21548f2ab8 <pools+280>, is_autoloaded = _gf_false, volfile_id = 0x0}
As local->cached_subvol does not match prev, it is clear that
local->cached_subvol was updated after this thread proceeded to unwind to the
higher xlators. This means a data file was found on at least 2 subvols; this
could be because a linkto file was in the process of being created when the
lookup hit the brick.
In dht_discover_complete ():

        local = discover_frame->local;
        layout = local->layout;
        conf = this->private;

        LOCK(&discover_frame->lock);
        {
                main_frame = local->main_frame;
                local->main_frame = NULL;
        }
        UNLOCK(&discover_frame->lock);

        if (!main_frame)
                return 0;    <--- So far so good - only one thread will unwind here.
However, discover_frame->local is still valid and accessible to the other
threads unwinding into dht_discover_cbk.
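In other words, the LOCK/UNLOCK above only makes the claim on main_frame
single-shot; it does not freeze frame->local or the dict hanging off
local->xattr. A minimal sketch of that "take and NULL under the lock" idiom
(simplified, not GlusterFS code):

#include <pthread.h>
#include <stddef.h>

/* Exactly one caller gets main_frame back; every later caller gets NULL.
 * Nothing here protects the rest of *local (e.g. local->xattr), which stays
 * shared and mutable for as long as the discover frame lives. */
struct sketch_local {
        void *main_frame;   /* claimed once, under the lock     */
        void *xattr;        /* shared for the frame's lifetime  */
};

void *claim_main_frame(struct sketch_local *local, pthread_mutex_t *lock)
{
        void *main_frame;

        pthread_mutex_lock(lock);
        main_frame = local->main_frame;
        local->main_frame = NULL;
        pthread_mutex_unlock(lock);

        return main_frame;   /* NULL for everyone after the first caller */
}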
In dht_discover_cbk:

        ...
        local = frame->local;
        ...
        LOCK (&frame->lock);
        {
                ...
                if (is_dir) {
                        local->dir_count ++;
                } else {
                        local->file_count ++;

                        if (!is_linkfile) {
                                /* real file */
                                local->cached_subvol = prev;
                                attempt_unwind = 1;
                        } else {
                                goto unlock;
                        }
                }
        ...
                if (local->xattr == NULL) {
                        local->xattr = dict_ref (xattr);            <-- say thread 1 (the one that crashed) did this
                } else {
                        dht_aggregate_xattr (local->xattr, xattr);  <-- and thread 2 called this (this should _only_ be done for directories)
                }
In dht_aggregate_xattr () -> dht_aggregate ():

        ret = dict_set (dst, key, value);
        if (ret) {
                gf_msg ("dht", GF_LOG_WARNING, 0,
                        DHT_MSG_DICT_SET_FAILED,
                        "Failed to set dictionary value: key = %s",
                        key);
        }
dict_set _replaces_ the value stored against the key, which unrefs the old
data; its ref-count drops to 0 and it is destroyed.
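To make the lifetime problem concrete, here is a self-contained mock of that
ref-count behaviour. It does not use the real libglusterfs dict API; the
single-slot "dict", the data_t and the helper names below are simplified
stand-ins. The sequence, however, is the one qr_content_extract runs into: a
raw pointer is taken from the dict, the key's value is replaced, the old
data's last reference is dropped, and the memcpy then reads freed memory.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for dict_t/data_t; not the libglusterfs API. */
typedef struct {
        int    refcount;
        char  *data;
        size_t len;
} data_t;

typedef struct {            /* a single-slot "dict" is enough here */
        data_t *value;
} dict_t;

data_t *data_new(const char *s)
{
        data_t *d = calloc(1, sizeof (*d));
        d->refcount = 1;
        d->len = strlen(s) + 1;
        d->data = strdup(s);
        return d;
}

void data_unref(data_t *d)
{
        if (--d->refcount == 0) {      /* last ref gone: payload is freed */
                free(d->data);
                free(d);
        }
}

/* like dict_set: the new value replaces and unrefs the old one */
void dict_set(dict_t *dict, data_t *new_value)
{
        data_t *old = dict->value;

        new_value->refcount++;         /* the dict's own reference */
        dict->value = new_value;
        if (old)
                data_unref(old);
}

int main(void)
{
        dict_t  dict  = { NULL };
        data_t *data0 = data_new("file content from subvol-0");
        data_t *data1 = data_new("file content from subvol-1");
        char    buf[64];

        dict_set(&dict, data0);
        data_unref(data0);             /* creator drops its ref; the dict
                                          now holds the last reference */

        /* thread 1 (quick-read) grabs a raw pointer to the value ...     */
        data_t *content = dict.value;

        /* ... thread 2 (dht_aggregate_xattr) replaces the key ...        */
        dict_set(&dict, data1);        /* data0 ref-count hits 0: freed   */

        /* ... and thread 1's memcpy now reads freed memory (use-after-free) */
        memcpy(buf, content->data, content->len);

        printf("copied (possibly garbage): %s\n", buf);
        data_unref(data1);             /* drop the creator's ref */
        return 0;
}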
So here is probably what happened:

Time  subvol-0 (thread 1)                         subvol-1 (thread 2)
----  ------------------------------------------  ------------------------------------------
t0    dht_discover_cbk
        local->xattr = dict_ref (dict0);
        dict_set (dict0, GF_CONTENT_KEY, data0)
        attempt_unwind = 1
t1    dht_discover_complete
t2    qr_content_extract gets data0
t3                                                dht_discover_cbk
                                                    dht_aggregate_xattr (local->xattr, xattr)
                                                    |-> dict_set (dict0, GF_CONTENT_KEY, data1)
                                                        |-> data_unref (data0)
                                                            ref-count = 0, so data0 is destroyed
                                                    attempt_unwind = 1
t3    qr_content_extract accesses data0 and crashes
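Given the inline comment above ("this should _only_ be done for directories"),
one way to close the window would be to stop merging file replies into
local->xattr once the first reply has been stored, so the dict handed up to
quick-read is never modified behind its back. This is a sketch only, not the
actual upstream patch; it just adds an is_dir guard to the branch quoted from
dht_discover_cbk earlier:

        if (local->xattr == NULL) {
                local->xattr = dict_ref (xattr);
        } else if (is_dir) {
                /* aggregating per-subvol xattrs only makes sense for
                 * directories; for files, the first reply's dict must stay
                 * untouched once it can be unwound to the parent xlators */
                dht_aggregate_xattr (local->xattr, xattr);
        }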