[Gluster-users] [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
Song
gluster at 163.com
Mon Dec 2 09:19:26 UTC 2013
Pranith,
Another kind of client crash happened. The gdb output is below for your reference:
Core was generated by `/usr/sbin/glusterfs --log-level=INFO --volfile-id=gfs6 --volfile-server=bj-nx-c'.
Program terminated with signal 11, Segmentation fault.
#0 afr_frame_return (frame=<value optimized out>) at afr-common.c:983
983 call_count = --local->call_count;
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.47.el6.x86_64 libgcc-4.4.6-3.el6.x86_64 openssl-1.0.0-20.el6.x86_64 zlib-1.2.3-27.el6.x86_64
(gdb) where
#0 afr_frame_return (frame=<value optimized out>) at afr-common.c:983
#1 0x00007f8aa1c1ebbc in afr_sh_entry_impunge_parent_setattr_cbk (setattr_frame=0x7f8aa525b248, cookie=<value optimized out>, this=0x1a82e00, op_ret=<value optimized out>, op_errno=<value optimized out>, preop=<value optimized out>, postop=0x0, xdata=0x0) at afr-self-heal-entry.c:970
#2 0x00007f8aa1e5fecb in client3_1_setattr (frame=0x7f8aa54ec634, this=<value optimized out>, data=<value optimized out>) at client3_1-fops.c:5801
#3 0x00007f8aa1e58b41 in client_setattr (frame=0x7f8aa54ec634, this=<value optimized out>, loc=<value optimized out>, stbuf=<value optimized out>, valid=<value optimized out>, xdata=<value optimized out>) at client.c:1915
#4 0x00007f8aa1c1f080 in afr_sh_entry_impunge_setattr (impunge_frame=0x7f8aa5454e10, this=<value optimized out>) at afr-self-heal-entry.c:1017
#5 0x00007f8aa1c1f5c0 in afr_sh_entry_impunge_xattrop_cbk (impunge_frame=0x7f8aa5454e10, cookie=0x1, this=0x1a82e00, op_ret=<value optimized out>, op_errno=22, xattr=<value optimized out>, xdata=0x0) at afr-self-heal-entry.c:1067
#6 0x00007f8aa1e6b34e in client3_1_xattrop_cbk (req=<value optimized out>, iov=<value optimized out>, count=<value optimized out>, myframe=0x7f8aa54ad5b8) at client3_1-fops.c:1715
#7 0x00000037eba0f4e5 in rpc_clnt_handle_reply (clnt=0x1eaccd0, pollin=0x2fba390) at rpc-clnt.c:786
#8 0x00000037eba0fce0 in rpc_clnt_notify (trans=<value optimized out>, mydata=0x1eacd00, event=<value optimized out>, data=<value optimized out>) at rpc-clnt.c:905
#9 0x00000037eba0aeb8 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:489
#10 0x00007f8aa2cb5764 in socket_event_poll_in (this=0x1ebc730) at socket.c:1677
#11 0x00007f8aa2cb5847 in socket_event_handler (fd=<value optimized out>, idx=127, data=0x1ebc730, poll_in=1, poll_out=0, poll_err=<value optimized out>) at socket.c:1792
#12 0x00000037eb63e464 in event_dispatch_epoll_handler (event_pool=0x19eddf0) at event.c:785
#13 event_dispatch_epoll (event_pool=0x19eddf0) at event.c:847
#14 0x000000000040736a in main (argc=<value optimized out>, argv=0x7fff26cdcd78) at glusterfsd.c:1689
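
The faulting line, `call_count = --local->call_count;`, dereferences `local`, so one possibility (an assumption, not something confirmed from this core) is that `local` was NULL or already released when the callback ran. Below is a minimal sketch of that kind of pending-reply counter with a defensive check. The names `demo_local`, `demo_frame`, `demo_expect` and `demo_frame_return` are hypothetical and are not the GlusterFS types or functions; locking is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for per-call state; NOT the GlusterFS structures. */
struct demo_local { int call_count; };
struct demo_frame { struct demo_local *local; };

/* Record how many replies are expected before the requests are sent out. */
void demo_expect(struct demo_frame *frame, int replies)
{
    assert(frame != NULL && frame->local != NULL);
    frame->local->call_count = replies;
}

/*
 * Called once per reply. Returns the number of replies still pending,
 * or -1 if the per-call state is missing, instead of dereferencing NULL.
 * A real implementation would take a lock around the decrement.
 */
int demo_frame_return(struct demo_frame *frame)
{
    if (frame == NULL || frame->local == NULL)
        return -1;
    return --frame->local->call_count;
}
```

A caller would treat a return value of 0 as "last reply, safe to clean up" and -1 as a bug to log. Note that a NULL check like this only catches a cleared pointer; it cannot detect a `local` that was freed but not set to NULL.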
-----Original Message-----
From: Song [mailto:gluster at 163.com]
Sent: Monday, October 28, 2013 11:25 AM
To: 'Pranith Kumar Karampuri'
Cc: 'John Mark Walker'; 'gluster-users at gluster.org'
Subject: RE: [Gluster-users] [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
Pranith,
Another similar client crash happened. Following are the glusterfs log and gdb output for your reference.
pending frames:
frame : type(1) op(STATFS)
frame : type(1) op(STATFS)
patchset: git://git.gluster.com/glusterfs.git
signal received: 6
time of crash: 2013-10-28 00:41:53
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.3.1
/lib64/libc.so.6[0x3a0c432900]
/lib64/libc.so.6(gsignal+0x35)[0x3a0c432885]
/lib64/libc.so.6(abort+0x175)[0x3a0c434065]
/lib64/libc.so.6[0x3a0c46f7a7]
/lib64/libc.so.6[0x3a0c4750c6]
/usr/lib/libglusterfs.so.0(gf_timer_call_cancel+0xb0)[0x328b42a180]
/usr/lib/glusterfs/3.3.1/xlator/protocol/client.so(client_ping_cbk+0x6d)[0x7f3514afe54d]
/usr/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x328b80f4e5]
/usr/lib/libgfrpc.so.0(rpc_clnt_notify+0x120)[0x328b80fce0]
/usr/lib/libgfrpc.so.0(rpc_transport_notify+0x28)[0x328b80aeb8]
/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_event_poll_in+0x34)[0x7f351593a764]
/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_event_handler+0xc7)[0x7f351593a847]
/usr/lib/libglusterfs.so.0[0x328b43e464]
/usr/sbin/glusterfs(main+0x58a)[0x40736a]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3a0c41ecdd]
/usr/sbin/glusterfs[0x4042d9]
---------
(gdb) where
#0 0x0000003a0c432885 in raise () from /lib64/libc.so.6
#1 0x0000003a0c434065 in abort () from /lib64/libc.so.6
#2 0x0000003a0c46f7a7 in __libc_message () from /lib64/libc.so.6
#3 0x0000003a0c4750c6 in malloc_printerr () from /lib64/libc.so.6
#4 0x000000328b42a180 in gf_timer_call_cancel (ctx=<value optimized out>, event=0x7f34f0001730) at timer.c:122
#5 0x00007f3514afe54d in client_ping_cbk (req=<value optimized out>, iov=<value optimized out>, count=<value optimized out>, myframe=0x7f3517f0751c) at client-handshake.c:285
#6 0x000000328b80f4e5 in rpc_clnt_handle_reply (clnt=0x1890aa0, pollin=0x1e7acb0) at rpc-clnt.c:786
#7 0x000000328b80fce0 in rpc_clnt_notify (trans=<value optimized out>, mydata=0x1890ad0, event=<value optimized out>, data=<value optimized out>) at rpc-clnt.c:905
#8 0x000000328b80aeb8 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:489
#9 0x00007f351593a764 in socket_event_poll_in (this=0x18a0500) at socket.c:1677
#10 0x00007f351593a847 in socket_event_handler (fd=<value optimized out>, idx=41, data=0x18a0500, poll_in=1, poll_out=0, poll_err=<value optimized out>) at socket.c:1792
#11 0x000000328b43e464 in event_dispatch_epoll_handler (event_pool=0x930df0) at event.c:785
#12 event_dispatch_epoll (event_pool=0x930df0) at event.c:847
#13 0x000000000040736a in main (argc=<value optimized out>, argv=0x7fff829eac78) at glusterfsd.c:1689
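
Frames #0 to #3 here (raise, abort, __libc_message, malloc_printerr) are the same glibc abort path as in the earlier mem_put backtrace quoted below; glibc takes this path when it detects an invalid or repeated free. As an illustration only, here is a sketch of how a release routine can refuse a second release of the same object by tracking an in-use flag. `demo_pool`, `demo_slot`, `demo_get` and `demo_put` are hypothetical names, and this is not the GlusterFS mem-pool or timer code.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SLOTS 4

/* Hypothetical fixed-size pool; NOT the GlusterFS mem-pool implementation. */
struct demo_slot { int in_use; int payload; };
struct demo_pool { struct demo_slot slots[DEMO_SLOTS]; };

/* Hand out the first free slot, or NULL when the pool is exhausted. */
struct demo_slot *demo_get(struct demo_pool *pool)
{
    assert(pool != NULL);
    for (size_t i = 0; i < DEMO_SLOTS; i++) {
        if (!pool->slots[i].in_use) {
            pool->slots[i].in_use = 1;
            return &pool->slots[i];
        }
    }
    return NULL;
}

/*
 * Give a slot back. Returns 0 on success and -1 when the slot is not
 * currently in use, which is what a second put of the same pointer
 * looks like. The caller can log the error instead of corrupting state.
 */
int demo_put(struct demo_slot *slot)
{
    if (slot == NULL || !slot->in_use)
        return -1;
    slot->in_use = 0;
    return 0;
}
```

With a check like this, the second release becomes a reportable error at the call site that made it, rather than an abort later inside the allocator.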
-----Original Message-----
From: Pranith Kumar Karampuri [mailto:pkarampu at redhat.com]
Sent: Friday, October 25, 2013 2:00 PM
To: Song
Cc: John Mark Walker; gluster-users at gluster.org
Subject: Re: [Gluster-users] [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
Thanks for this information. Let us see if we can re-create the issue in our environment. If that does not help, we shall do a detailed analysis of the code to figure this out.
Pranith
----- Original Message -----
> From: "Song" <gluster at 163.com>
> To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
> Cc: "John Mark Walker" <johnmark at gluster.org>,
> gluster-users at gluster.org
> Sent: Wednesday, October 23, 2013 2:53:03 PM
> Subject: RE: [Gluster-users] [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
>
> Pranith,
>
> Thanks for your detailed answer.
>
> Our workload includes CREATE/WRITE/READ/STAT/ACCESS, as well as
> chmod(filepath, 0). However, I don't know which kind of workload leads
> to the crash.
> We have analyzed the related code, such as dict, the lookup path of
> cluster/afr and the lookup path of protocol/client, but found nothing
> useful to help locate the issue.
>
> Song.
>
> -----Original Message-----
> From: Pranith Kumar Karampuri [mailto:pkarampu at redhat.com]
> Sent: Tuesday, October 22, 2013 5:25 PM
> To: Song
> Cc: John Mark Walker; gluster-users at gluster.org
> Subject: Re: [Gluster-users] [Gluster-devel] GlusterFS 3.3.1 client
> crash (signal received: 6)
>
> Song,
> The information printed in that function gf_print_trace has been useful
> in the sense that we know it happens when there is a double 'memput' of
> one of the data structures as part of 'lookup'. The problem is this
> issue seems to be happening only in some peculiar case, which
> unfortunately you are hitting every day on 1-2 clients. That is why I
> was trying to figure out what the workload is.
>
> Let me explain what I mean by 'workload'.
> For example:
> Websites that do some kind of image manipulation generally CREATE
> temporary files, do some transformations (i.e. READs/WRITEs), and then
> RENAME them to the actual files.
> So here the workload is CREATE/READ/WRITE/RENAME intensive.
>
> To give you one more example:
> VM image hosting (at least with the KVM images that I generally test)
> pretty much does WRITEs, READs and STATs on each VM image, so it is
> WRITE/STAT/READ intensive.
>
> I would really like to know what kind of workload happens on your
> setup, to figure out what peculiar thing may lead to this crash.
>
> Pranith.
>
> ----- Original Message -----
> > From: "Song" <gluster at 163.com>
> > To: "Song" <gluster at 163.com>, "John Mark Walker"
> > <johnmark at gluster.org>, "Pranith Kumar Karampuri"
> > <pkarampu at redhat.com>
> > Cc: gluster-users at gluster.org
> > Sent: Tuesday, October 22, 2013 1:56:48 PM
> > Subject: RE: [Gluster-users] [Gluster-devel] GlusterFS 3.3.1 client crash
> > (signal received: 6)
> >
> > To locate this issue, is it possible to print more useful
> > information in the backtrace?
> > When the client crashed, trace information was printed by the
> > function "gf_print_trace" in common-utils.c.
> > I hope some helpful debug information can be added to this
> > function, so that the next time a client crashes the data can help
> > us analyze the problem.
> >
> > Could you suggest what code would be useful to add?
> > Thanks!
> >
> > -----Original Message-----
> > From: gluster-users-bounces at gluster.org
> > [mailto:gluster-users-bounces at gluster.org] On Behalf Of Song
> > Sent: Friday, September 06, 2013 10:17 AM
> > To: 'John Mark Walker'; 'Pranith Kumar Karampuri'
> > Cc: gluster-users at gluster.org
> > Subject: Re: [Gluster-users] [Gluster-devel] GlusterFS 3.3.1 client
> > crash (signal received: 6)
> >
> > Unfortunately, I don't know how to re-create the issue. However, 1-2
> > of our 120 clients crash every day.
> >
> > Below is the gdb output:
> >
> > (gdb) where
> > #0 0x0000003267432885 in raise () from /lib64/libc.so.6
> > #1 0x0000003267434065 in abort () from /lib64/libc.so.6
> > #2 0x000000326746f7a7 in __libc_message () from /lib64/libc.so.6
> > #3 0x00000032674750c6 in malloc_printerr () from /lib64/libc.so.6
> > #4 0x00007fc4f2847684 in mem_put (ptr=0x7fc4b0a4c03c) at mem-pool.c:559
> > #5 0x00007fc4f281cc9b in dict_destroy (this=0x7fc4f12cc5cc) at dict.c:397
> > #6 0x00007fc4ede24c30 in afr_local_cleanup (local=0x7fc4ce68ac20, this=<value optimized out>) at afr-common.c:848
> > #7 0x00007fc4ede2c0f1 in afr_lookup_done (frame=0x18d5ae4, cookie=0x0, this=<value optimized out>, op_ret=<value optimized out>, op_errno=<value optimized out>, inode=0x18d5b20, buf=0x7fffcb83ec50, xattr=0x7fc4f12e1818, postparent=0x7fffcb83ebe0) at afr-common.c:1881
> > #8 afr_lookup_cbk (frame=0x18d5ae4, cookie=0x0, this=<value optimized out>, op_ret=<value optimized out>, op_errno=<value optimized out>, inode=0x18d5b20, buf=0x7fffcb83ec50, xattr=0x7fc4f12e1818, postparent=0x7fffcb83ebe0) at afr-common.c:2044
> > #9 0x00007fc4ee066550 in client3_1_lookup_cbk (req=<value optimized out>, iov=<value optimized out>, count=<value optimized out>, myframe=0x7fc4f16f390c) at client3_1-fops.c:2636
> > #10 0x00007fc4f25ff4e5 in rpc_clnt_handle_reply (clnt=0x3b5c600, pollin=0x6ba00f0) at rpc-clnt.c:786
> > #11 0x00007fc4f25ffce0 in rpc_clnt_notify (trans=<value optimized out>, mydata=0x3b5c630, event=<value optimized out>, data=<value optimized out>) at rpc-clnt.c:905
> > #12 0x00007fc4f25faeb8 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:489
> > #13 0x00007fc4eeeb0764 in socket_event_poll_in (this=0x3b6c060) at socket.c:1677
> > #14 0x00007fc4eeeb0847 in socket_event_handler (fd=<value optimized out>, idx=265, data=0x3b6c060, poll_in=1, poll_out=0, poll_err=<value optimized out>) at socket.c:1792
> > #15 0x00007fc4f2846464 in event_dispatch_epoll_handler (event_pool=0x177cdf0) at event.c:785
> > #16 event_dispatch_epoll (event_pool=0x177cdf0) at event.c:847
> > #17 0x000000000040736a in main (argc=<value optimized out>, argv=0x7fffcb83efc8) at glusterfsd.c:1689
> >
> >
> > -----Original Message-----
> > From: jowalker at redhat.com [mailto:jowalker at redhat.com] On Behalf Of
> > John Mark Walker
> > Sent: Thursday, September 05, 2013 1:06 PM
> > To: Pranith Kumar Karampuri
> > Cc: Song; gluster-devel at nongnu.org
> > Subject: Re: [Gluster-devel] GlusterFS 3.3.1 client crash (signal received:
> > 6)
> >
> > Posting to gluster-users.
> >
> >
> > ----- Pranith Kumar Karampuri <pkarampu at redhat.com> wrote:
> > > Song,
> > > It seems the issue is happening because of a double 'memput'.
> > > Could you let us know the steps to re-create the issue, or the
> > > load that may lead to this?
> > >
> > > Pranith
> > >
> > > ----- Original Message -----
> > > > From: "Song" <gluster at 163.com>
> > > > To: gluster-devel at nongnu.org
> > > > Sent: Thursday, September 5, 2013 8:05:57 AM
> > > > Subject: [Gluster-devel] GlusterFS 3.3.1 client crash (signal
> > > > received: 6)
> > > >
> > > >
> > > >
> > > > I installed GlusterFS 3.3.1 on my 24 servers, created a DHT+AFR
> > > > volume and mounted it with the native client.
> > > >
> > > > Recently, some glusterfs clients have crashed; the log is below.
> > > >
> > > >
> > > >
> > > > The OS is 64bit CentOS6.2, kernel version:
> > > > 2.6.32-220.23.1.el6.x86_64 #1 SMP Fri Jun 28 00:56:49 CST 2013
> > > > x86_64 x86_64 x86_64 GNU/Linux
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > pending frames:
> > > > frame : type(1) op(LOOKUP)
> > > > frame : type(1) op(LOOKUP)
> > > > frame : type(1) op(LOOKUP)
> > > >
> > > > patchset: git://git.gluster.com/glusterfs.git
> > > > signal received: 6
> > > > time of crash: 2013-09-05 00:37:40
> > > > configuration details:
> > > > argp 1
> > > > backtrace 1
> > > > dlfcn 1
> > > > fdatasync 1
> > > > libpthread 1
> > > > llistxattr 1
> > > > setfsid 1
> > > > spinlock 1
> > > > epoll.h 1
> > > > xattr.h 1
> > > > st_atim.tv_nsec 1
> > > > package-string: glusterfs 3.3.1
> > > > /lib64/libc.so.6[0x3ac0232900]
> > > > /lib64/libc.so.6(gsignal+0x35)[0x3ac0232885]
> > > > /lib64/libc.so.6(abort+0x175)[0x3ac0234065]
> > > > /lib64/libc.so.6[0x3ac026f7a7]
> > > > /lib64/libc.so.6[0x3ac02750c6]
> > > > /usr/lib/libglusterfs.so.0(mem_put+0x64)[0x7f3f99c2c684]
> > > > /usr/lib/glusterfs/3.3.1/xlator/cluster/replicate.so(afr_local_cleanup+0x60)[0x7f3f95209c30]
> > > > /usr/lib/glusterfs/3.3.1/xlator/cluster/replicate.so(afr_lookup_cbk+0x5a1)[0x7f3f952110f1]
> > > > /usr/lib/glusterfs/3.3.1/xlator/protocol/client.so(client3_1_lookup_cbk+0x6b0)[0x7f3f9544b550]
> > > > /usr/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x7f3f999e44e5]
> > > > /usr/lib/libgfrpc.so.0(rpc_clnt_notify+0x120)[0x7f3f999e4ce0]
> > > > /usr/lib/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f3f999dfeb8]
> > > > /usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_event_poll_in+0x34)[0x7f3f96295764]
> > > > /usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_event_handler+0xc7)[0x7f3f96295847]
> > > > /usr/lib/libglusterfs.so.0(+0x3e464)[0x7f3f99c2b464]
> > > > /usr/sbin/glusterfs(main+0x58a)[0x40736a]
> > > > /lib64/libc.so.6(__libc_start_main+0xfd)[0x3ac021ecdd]
> > > > /usr/sbin/glusterfs[0x4042d9]
> > > > ---------
> > > >
> > > >
> > > >
> > > > Best regards.
> > > >
> > > > Willard Song
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > Gluster-devel mailing list
> > > > Gluster-devel at nongnu.org
> > > > https://lists.nongnu.org/mailman/listinfo/gluster-devel
> > > >
> > >
> >
> >
> > _______________________________________________
> > Gluster-users mailing list
> > Gluster-users at gluster.org
> > http://supercolony.gluster.org/mailman/listinfo/gluster-users
> >
> >
> >
>
>
>