[Gluster-users] glusterfs crash on frame->next->prev

chyd chyd at ihep.ac.cn
Mon Apr 9 04:36:58 UTC 2012


Hi all,

Over the past few days our glusterfs system has crashed many times. The coredumps show that the crash point is always in FRAME_DESTROY or STACK_WIND. Whenever frame->next->prev is referenced, the process crashes, even when the reference is only a value check such as 'if (frame->next->prev == NULL) {return}'. This mainly occurs when opendir is called. There are about 500 bricks in the system. The coredumps are listed below. Can anybody help me?
Coredump 1:

#0  0x00002aad0cd465f4 in FRAME_DESTROY (frame=<value optimized out>, cookie=<value optimized out>, 
    this=<value optimized out>, op_ret=<value optimized out>, op_errno=<value optimized out>, fd=<value optimized out>)
    at ../../../../libglusterfs/src/stack.h:143
143                     frame->next->prev = frame->prev;
(gdb) bt
#0  0x00002aad0cd465f4 in FRAME_DESTROY (frame=<value optimized out>, cookie=<value optimized out>, 
    this=<value optimized out>, op_ret=<value optimized out>, op_errno=<value optimized out>, fd=<value optimized out>)
    at ../../../../libglusterfs/src/stack.h:143
#1  STACK_DESTROY (frame=<value optimized out>, cookie=<value optimized out>, this=<value optimized out>, 
    op_ret=<value optimized out>, op_errno=<value optimized out>, fd=<value optimized out>)
    at ../../../../libglusterfs/src/stack.h:180
#2  fuse_fd_cbk (frame=<value optimized out>, cookie=<value optimized out>, this=<value optimized out>, 
    op_ret=<value optimized out>, op_errno=<value optimized out>, fd=<value optimized out>) at fuse-bridge.c:594
#3  0x00002aad0e34e7a9 in io_stats_opendir_cbk (frame=0x2aadc5b329c0, cookie=<value optimized out>, 
    this=<value optimized out>, op_ret=0, op_errno=117, fd=0x2aad67f98954) at io-stats.c:1492
#4  0x00002aad0e12eab2 in sp_fd_cbk (frame=0x2aadc5bc0fc0, cookie=<value optimized out>, this=<value optimized out>, 
    op_ret=0, op_errno=117, fd=0x2aad67f98954) at stat-prefetch.c:1506
#5  0x00002aad0df00238 in dht_fd_cbk (frame=0x2aadc5bc10c0, cookie=<value optimized out>, this=<value optimized out>, 
    op_ret=<value optimized out>, op_errno=<value optimized out>, fd=<value optimized out>) at dht-common.c:2615
#6  0x00002aad0dc8f941 in afr_examine_dir_readdir_cbk (frame=0x2aadc3a59ce0, cookie=<value optimized out>, 
    this=<value optimized out>, op_ret=<value optimized out>, op_errno=<value optimized out>, entries=<value optimized out>)
    at afr-dir-read.c:185
#7  0x00002aad0da6485d in client3_1_readdir_cbk (req=<value optimized out>, iov=<value optimized out>, 
    count=<value optimized out>, myframe=0x36a4600) at client3_1-fops.c:1883
#8  0x00002aad0a317315 in rpc_clnt_handle_reply (clnt=0x18a2b30, pollin=0x24054b0) at rpc-clnt.c:741
#9  0x00002aad0a317569 in rpc_clnt_notify (trans=<value optimized out>, mydata=0x18a2b60, event=<value optimized out>, 
    data=<value optimized out>) at rpc-clnt.c:854
#10 0x00002aad0a312418 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, 
    data=<value optimized out>) at rpc-transport.c:919
#11 0x00002aad0d41f254 in socket_event_poll_in (this=0x18a2c50) at socket.c:1647
#12 0x00002aad0d41f337 in socket_event_handler (fd=<value optimized out>, idx=405, data=0x18a2c50, poll_in=1, poll_out=0, 
    poll_err=<value optimized out>) at socket.c:1762
#13 0x00002aad0a0e3014 in event_dispatch_epoll_handler (event_pool=0x1602330) at event.c:794
#14 event_dispatch_epoll (event_pool=0x1602330) at event.c:856
#15 0x0000000000405e69 in main (argc=5, argv=0x7fff4b224d28) at glusterfsd.c:1462

Coredump 2:
#0  CHECK_FRAME (frame=0x2af752e9bba0, this=<value optimized out>, loc=0x2af7538acd38, fd=0x2af6b71f6b4c)
    at ../../../../libglusterfs/src/stack.h:205
205                     if  (frame->root->frames.next->prev == NULL){
(gdb) bt
#0  CHECK_FRAME (frame=0x2af752e9bba0, this=<value optimized out>, loc=0x2af7538acd38, fd=0x2af6b71f6b4c) 
    at ../../../../libglusterfs/src/stack.h:205
#1  afr_opendir (frame=0x2af752e9bba0, this=<value optimized out>, loc=0x2af7538acd38, fd=0x2af6b71f6b4c)
    at afr-dir-read.c:343
#2  0x00002af65d2d036a in dht_opendir (frame=0x2af751f91ec0, this=<value optimized out>, loc=0x2af7538acd38, 
    fd=0x2af6b71f6b4c) at dht-common.c:3092
#3  0x00002af65d4ff86a in sp_opendir (frame=<value optimized out>, this=0xa1a730, loc=0x2af7538acd38, fd=0x2af6b71f6b4c)
    at stat-prefetch.c:1854
#4  0x00002af65d719c4a in io_stats_opendir (frame=<value optimized out>, this=0xa1b940, loc=0x2af7538acd38, 
    fd=0x2af6b71f6b4c) at io-stats.c:2137
#5  0x00002af65c111540 in fuse_opendir_resume (state=0x2af7538acd20) at fuse-bridge.c:2011
#6  0x00002af65c0ff512 in fuse_resolve_and_resume (state=0x2af7538acd20, fn=0x2af65c1113a0 <fuse_opendir_resume>)
    at fuse-resolve.c:763
#7  0x00002af65c10c5bd in fuse_thread_proc (data=0x7d16d0) at fuse-bridge.c:3223
#8  0x0000003c394077e1 in start_thread () from /lib64/libpthread.so.0
#9  0x0000003c390e152d in clone () from /lib64/libc.so.6

Note: CHECK_FRAME is a function I added so that the coredump shows the crashing line.
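
One possible reason the added check faults as well: the expression frame->next->prev == NULL has to dereference frame->next before it can compare anything, so the check itself crashes if frame->next is not a valid pointer. A minimal sketch of an unlink that tests each neighbour pointer before using it follows. The struct frame and frame_unlink names are hypothetical stand-ins for illustration, not the real call_frame_t or FRAME_DESTROY from stack.h.

```c
#include <stddef.h>

/* Hypothetical, simplified frame: only the list links are modelled.
 * This is NOT the real glusterfs call_frame_t. */
struct frame {
    struct frame *next;
    struct frame *prev;
};

/* Unlink f from its doubly linked list.
 * Each neighbour pointer is tested before it is dereferenced, so a
 * NULL neighbour is skipped instead of causing a fault.
 * Returns the number of neighbours that were updated (0, 1 or 2). */
static int frame_unlink(struct frame *f)
{
    int updated = 0;

    if (f == NULL)
        return 0;

    if (f->next != NULL) {          /* test next itself, not next->prev */
        f->next->prev = f->prev;
        updated++;
    }
    if (f->prev != NULL) {
        f->prev->next = f->next;
        updated++;
    }

    /* Clear the links so a second unlink of f touches no neighbour. */
    f->next = NULL;
    f->prev = NULL;
    return updated;
}
```

A NULL test only catches NULL. If frame->next points to memory that has already been freed, no check of this kind can detect it, and the dereference may still crash.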
Thank you very much
2012-04-09