[Gluster-devel] regression: brick crashed because of changelog xlator init failure

Kotresh Hiremath Ravishankar khiremat at redhat.com
Sat May 9 10:53:01 UTC 2015


Hi,

There are a few regression failures in which the changelog translator's init fails and a core is generated,
as explained below.

1. Why did the changelog translator init fail?

    In the snapshot test cases, multiple virtual peers are set up on a single node,
    which causes the 'Address already in use' and 'Port is already in use' errors. Hence
    the changelog translator init failed.

2. Even if the changelog translator init fails, it should not dump core. Why is there a core?

   Well, the stack trace from the regression run didn't help much.
   I induced the error manually on a local system and traced it in gdb;
   the crash happens as shown below.

   There is some memory corruption in the cleanup_and_exit path when a translator's init fails.
   I suppose this could happen when any translator's init fails, not only
   changelog's. Could someone look into this?

#0  0x00007ffff6cb67e0 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x00007ffff7b70db5 in __gf_free (free_ptr=0x7fffe4031700) at mem-pool.c:303
#2  0x00007ffff7b7120c in mem_put (ptr=0x7fffe403171c) at mem-pool.c:570
#3  0x00007ffff7b43fb4 in log_buf_destroy (buf=buf@entry=0x7fffe403171c) at logging.c:357
#4  0x00007ffff7b47001 in gf_log_flush_list (copy=copy@entry=0x7fffeb80aa50, ctx=ctx@entry=0x614010) at logging.c:1711
#5  0x00007ffff7b4720d in gf_log_flush_extra_msgs (new=0, ctx=0x614010) at logging.c:1777
#6  gf_log_set_log_buf_size (buf_size=buf_size@entry=0) at logging.c:270
#7  0x00007ffff7b47267 in gf_log_disable_suppression_before_exit (ctx=0x614010) at logging.c:437
#8  0x00000000004080ec in cleanup_and_exit (signum=signum@entry=0) at glusterfsd.c:1217
#9  0x0000000000408a16 in glusterfs_process_volfp (ctx=ctx@entry=0x614010, fp=fp@entry=0x7fffe40014f0) at glusterfsd.c:2183
#10 0x000000000040ccf7 in mgmt_getspec_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=0x7fffe4000fa4) at glusterfsd-mgmt.c:1560
#11 0x00007ffff7915c70 in rpc_clnt_handle_reply (clnt=clnt@entry=0x66d280, pollin=pollin@entry=0x7fffe4002540) at rpc-clnt.c:766
#12 0x00007ffff7915ee4 in rpc_clnt_notify (trans=<optimized out>, mydata=0x66d2b0, event=<optimized out>, data=0x7fffe4002540) at rpc-clnt.c:894
#13 0x00007ffff79121f3 in rpc_transport_notify (this=this@entry=0x66d6f0, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7fffe4002540)
    at rpc-transport.c:543
#14 0x00007fffed2ca1f4 in socket_event_poll_in (this=this@entry=0x66d6f0) at socket.c:2290
#15 0x00007fffed2ccfb4 in socket_event_handler (fd=fd@entry=8, idx=idx@entry=1, data=0x66d6f0, poll_in=1, poll_out=0, poll_err=0) at socket.c:2403
#16 0x00007ffff7b9aaba in event_dispatch_epoll_handler (event=0x7fffeb80ae90, event_pool=0x632c80) at event-epoll.c:572
#17 event_dispatch_epoll_worker (data=0x66e8b0) at event-epoll.c:674
#18 0x00007ffff6cb1ee5 in start_thread () from /lib64/libpthread.so.0
#19 0x00007ffff65f8b8d in clone () from /lib64/libc.so.6
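
On point 1, the 'Address already in use' error itself is easy to reproduce outside gluster. Below is a minimal sketch, not gluster code: it assumes a plain TCP listener on the loopback address purely for illustration (the transport the changelog rpc server really uses may differ), and the helper names listen_on_loopback, bound_port and second_bind_errno are made up for this example. It binds one listener, then tries to bind a second one to the same address and port, which is roughly what can happen when several virtual peers share one node.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind and listen on 127.0.0.1:port.
 * Port 0 lets the kernel pick a free port.
 * Returns the fd, or -1 with errno preserved from the failing call. */
static int listen_on_loopback(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 1) < 0) {
        int saved = errno; /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Hypothetical helper: report the port the kernel assigned to fd (0 on error). */
static unsigned short bound_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return 0;
    return ntohs(addr.sin_port);
}

/* Bind one listener, then attempt a second bind on the same address and port.
 * Returns the errno of the second attempt, 0 if it unexpectedly succeeded,
 * or -1 if the first listener could not be set up. */
static int second_bind_errno(void)
{
    int first, second, err;

    first = listen_on_loopback(0);
    if (first < 0)
        return -1;

    second = listen_on_loopback(bound_port(first));
    err = (second < 0) ? errno : 0;

    if (second >= 0)
        close(second);
    close(first);
    return err;
}
```

On Linux I would expect second_bind_errno() to return EADDRINUSE. The point is only that the bind failure is an ordinary, recoverable error that the caller sees as a return code; it does not by itself explain the core, which comes from the cleanup path in the trace above.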

Thanks and Regards,
Kotresh H R

----- Original Message -----
> From: "Kotresh Hiremath Ravishankar" <khiremat at redhat.com>
> To: "Vijay Bellur" <vbellur at redhat.com>
> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> Sent: Saturday, May 9, 2015 1:06:07 PM
> Subject: Re: [Gluster-devel] regression: brick crashed because of changelog xlator init failure
> 
> It is crashing in libgcc!!!
> 
> Program terminated with signal 11, Segmentation fault.
> #0  0x00007ff5555a1867 in ?? () from ./lib64/libgcc_s.so.1
> Missing separate debuginfos, use: debuginfo-install
> glibc-2.12-1.149.el6_6.7.x86_64 keyutils-libs-1.4-5.el6.x86_64
> krb5-libs-1.10.3-37.el6_6.x86_64 libcom_err-1.41.12-21.el6.x86_64
> libgcc-4.4.7-11.el6.x86_64 libselinux-2.0.94-5.8.el6.x86_64
> openssl-1.0.1e-30.el6.8.x86_64 zlib-1.2.3-29.el6.x86_64
> (gdb) bt
> #0  0x00007ff5555a1867 in ?? () from ./lib64/libgcc_s.so.1
> #1  0x00007ff5555a2119 in _Unwind_Backtrace () from ./lib64/libgcc_s.so.1
> #2  0x00007ff56170b8f6 in backtrace () from ./lib64/libc.so.6
> #3  0x00007ff562826544 in _gf_msg_backtrace_nomem (level=GF_LOG_ALERT,
> stacksize=200)
>     at
>     /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/logging.c:1097
> #4  0x00007ff562845b82 in gf_print_trace (signum=11, ctx=0xabc010)
>     at
>     /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/common-utils.c:618
> #5  0x0000000000409646 in glusterfsd_print_trace (signum=11) at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd.c:2007
> #6  <signal handler called>
> #7  0x00007ff554484fa9 in ?? ()
> #8  0x00007ff561d8b9d1 in start_thread () from ./lib64/libpthread.so.0
> #9  0x00007ff5616f58fd in clone () from ./lib64/libc.so.6
> 
> 
> Thanks and Regards,
> Kotresh H R
> 
> ----- Original Message -----
> > From: "Vijay Bellur" <vbellur at redhat.com>
> > To: "Kotresh Hiremath Ravishankar" <khiremat at redhat.com>, "Pranith Kumar
> > Karampuri" <pkarampu at redhat.com>
> > Cc: "Gluster Devel" <gluster-devel at gluster.org>
> > Sent: Saturday, May 9, 2015 12:52:33 PM
> > Subject: Re: [Gluster-devel] regression: brick crashed because of changelog
> > xlator init failure
> > 
> > On 05/09/2015 12:49 PM, Kotresh Hiremath Ravishankar wrote:
> > > If you observe the logs below. Socket binding failed because of Address
> > > and
> > > port already in use ERROR.
> > > Because of that changelog failed to initiate rpc server, hence failed.
> > > Not sure why socket binding failed in this machine.
> > >
> > > [2015-05-08 21:34:47.747059] E [socket.c:823:__socket_server_bind]
> > > 0-socket.patchy-changelog: binding to  failed: Address already in use
> > > [2015-05-08 21:34:47.747078] E [socket.c:826:__socket_server_bind]
> > > 0-socket.patchy-changelog: Port is already in use
> > > [2015-05-08 21:34:47.747096] W [rpcsvc.c:1602:rpcsvc_transport_create]
> > > 0-rpc-service: listening on transport failed
> > > [2015-05-08 21:34:47.747197] I [mem-pool.c:587:mem_pool_destroy]
> > > 0-patchy-changelog: size=116 max=0 total=0
> > > [2015-05-08 21:34:47.750460] E [xlator.c:426:xlator_init]
> > > 0-patchy-changelog: Initialization of volume 'patchy-changelog' failed,
> > > review your volfile again
> > > [2015-05-08 21:34:47.750485] E [graph.c:322:glusterfs_graph_init]
> > > 0-patchy-changelog: initializing translator failed
> > > [2015-05-08 21:34:47.750497] E [graph.c:661:glusterfs_graph_activate]
> > > 0-graph: init failed
> > > [2015-05-08 21:34:47.749020] I
> > > [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread
> > > with index 2
> > 
> > Irrespective of a socket bind failing, we should not crash. any ideas
> > why glusterfsd crashed?
> > 
> > -Vijay
> > 
> > 
> > 