[Bugs] [Bug 1486134] glusterfsd (brick) process crashed

bugzilla at redhat.com
Tue Aug 29 07:07:11 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1486134



--- Comment #2 from Raghavendra G <rgowdapp at redhat.com> ---
Here is the verbatim posting of a mail from Nithya related to this issue:

<mail>

    Hi,


    I am not sure how correct this is, but I tried a few things with objdump
    and the stack trace in the BZ. Here is what I have:

    /lib64/libc.so.6(+0x2e1f2)[0x7fac3d50e1f2]
    /lib64/libglusterfs.so.0(+0x8a072)[0x7fac3ee8c072]
    /usr/lib64/glusterfs/3.7.1/rpc-transport/socket.so(+0x3814)[0x7fac39fbc814]
    /usr/lib64/glusterfs/3.7.1/rpc-transport/socket.so(+0xba18)[0x7fac39fc4a18]
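
Each frame above has the form module(+offset)[absolute address]. The offset is the return address inside the module, so the call instruction itself sits just before it. A minimal sketch of pulling that offset out of a frame string follows. The helper names frame_offset and call_site are hypothetical, and the 5-byte call length is an assumption that holds only for a `callq` with a 32-bit relative target.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: the call is a 5-byte "callq rel32" (opcode e8 + 4 bytes). */
#define CALLQ_REL32_LEN 5UL

/* Hypothetical helper: extract the module-relative offset from a frame such
 * as "/lib64/libglusterfs.so.0(+0x8a072)[0x7fac3ee8c072]".
 * Returns 0 on success, -1 if the frame has no "(+0x...)" part. */
static int
frame_offset (const char *frame, unsigned long *offset)
{
        const char *paren = NULL;
        char *end = NULL;

        assert (frame != NULL && offset != NULL);

        paren = strstr (frame, "(+0x");
        if (paren == NULL)
                return -1;

        /* Skip "(+" so strtoul sees "0x8a072)"; base 16 accepts the prefix. */
        *offset = strtoul (paren + 2, &end, 16);
        return (*end == ')') ? 0 : -1;
}

/* Hypothetical helper: offset of the call instruction that pushed the
 * given return address, under the 5-byte callq assumption above. */
static unsigned long
call_site (unsigned long return_offset)
{
        return return_offset - CALLQ_REL32_LEN;
}
```

For the libglusterfs frame this gives offset 0x8a072 and call site 0x8a06d, which is where objdump should be inspected.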


    Messing around with objdump and debuginfo I have come up with:

    /lib64/libc.so.6(best guess - assert)

    /lib64/libglusterfs.so.0(+0x8a072 = event_unregister_epoll_common + offset)

    static int                                                                  
    event_unregister_epoll_common (struct event_pool *event_pool, int fd,       
                                   int idx, int do_close)                       
    {                                                                           
    ...                                                                         
            slot = event_slot_get (event_pool, idx);                            

            assert (slot->fd == fd);                                            
       8a053:       48 8d 0d 86 09 02 00    lea    0x20986(%rip),%rcx        # aa9e0 <_fini@@Base+0xbd9c>
       8a05a:       48 8d 35 d3 06 02 00    lea    0x206d3(%rip),%rsi        # aa734 <_fini@@Base+0xbaf0>
       8a061:       48 8d 3d f2 06 02 00    lea    0x206f2(%rip),%rdi        # aa75a <_fini@@Base+0xbb16>
       8a068:       ba 9d 01 00 00          mov    $0x19d,%edx
       8a06d:       e8 ee 0e f9 ff          callq  1af60 <__assert_fail@plt>
       8a072:       0f 1f 40 00             nopl   0x0(%rax)                    
       8a076:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)         
       8a07d:       00 00 00  

    /usr/lib64/glusterfs/3.7.1/rpc-transport/socket.so(+0x3814 = __socket_reset + offset)

    static void                                                                 
    __socket_reset (rpc_transport_t *this)   

    ...

            memset (&priv->incoming, 0, sizeof (priv->incoming));               

            event_unregister_close (this->ctx->event_pool, priv->sock, priv->idx);
        37ff:       48 8b 45 60             mov    0x60(%rbp),%rax              
        3803:       8b 53 04                mov    0x4(%rbx),%edx               
        3806:       8b 33                   mov    (%rbx),%esi                  
        3808:       48 8b b8 b0 01 00 00    mov    0x1b0(%rax),%rdi             
        380f:       e8 3c f7 ff ff          callq  2f50 <event_unregister_close@plt>

            priv->sock = -1;                                                    
        3814:       c7 03 ff ff ff ff       movl   $0xffffffff,(%rbx)           



    /usr/lib64/glusterfs/3.7.1/rpc-transport/socket.so(+0xba18 = socket_event_handler + offset)

    static int                                                                  
    socket_event_handler (int fd, int idx, void *data,
                          int poll_in, int poll_out, int poll_err)

    ...

                    __socket_ioq_flush (this);                                  
        ba08:       48 89 df                mov    %rbx,%rdi
        ba0b:       e8 b0 a2 ff ff          callq  5cc0 <socket_connect_error_cbk@@Base+0x2980>
                    __socket_reset (this);
        ba10:       48 89 df                mov    %rbx,%rdi
        ba13:       e8 48 7d ff ff          callq  3760 <socket_connect_error_cbk@@Base+0x420>
            }                                                                   
            pthread_mutex_unlock (&priv->lock);    



    In event_unregister_epoll_common, the crash seems to come from a failed
    assert:

            slot = event_slot_get (event_pool, idx);                            

            assert (slot->fd == fd);  <---- here!!!


    So the stack would be:

    0 assert
    1 event_unregister_epoll_common
    2 event_unregister_close_epoll
    3 __socket_reset
    4 socket_event_handler
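
The assert fires when the slot found at idx no longer holds the fd the caller passed in. The mail does not say how that mismatch arises. The sketch below is only a simplified model of one hypothetical way it could happen: a slot is released and reused while a caller still holds the old idx/fd pair. The slot table and the names slot_register, slot_release and slot_matches are assumptions for illustration, not the GlusterFS event-pool code.

```c
#include <assert.h>
#include <stdbool.h>

#define SLOT_COUNT 4                    /* hypothetical table size */

struct slot {
        int fd;                         /* -1 means the slot is free */
};

static struct slot slots[SLOT_COUNT] = { {-1}, {-1}, {-1}, {-1} };

/* Put fd in the first free slot; returns its index, or -1 if full. */
static int
slot_register (int fd)
{
        for (int i = 0; i < SLOT_COUNT; i++) {
                if (slots[i].fd == -1) {
                        slots[i].fd = fd;
                        return i;
                }
        }
        return -1;
}

/* Mark the slot at idx as free again. */
static void
slot_release (int idx)
{
        assert (idx >= 0 && idx < SLOT_COUNT);
        slots[idx].fd = -1;
}

/* Models the condition checked by "assert (slot->fd == fd)": true only
 * if the slot at idx still belongs to fd. */
static bool
slot_matches (int idx, int fd)
{
        return idx >= 0 && idx < SLOT_COUNT && slots[idx].fd == fd;
}
```

In this model, registering fd 7, releasing its slot, and then registering fd 9 reuses the same index, so slot_matches with the stale pair (idx, 7) returns false. That is the state in which the real assert would abort the process.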

    Many thanks to Shyam for helping me with objdump.


    Hope this helps.

    Regards,
    Nithya

</mail>
