[Gluster-devel] v3.4.0a3+ NFS crashing out

Michael Brown michael at netdirect.ca
Wed May 1 17:27:59 UTC 2013


My gluster NFS daemon is crashing with the following:

pending frames:
<<<25592 copies of>>>
frame : type(0) op(0)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2013-05-01 17:02:36configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.4git
/usr/local/glusterfs/sbin/glusterfs(glusterfsd_print_trace+0x1f)[0x407bd5]
/lib64/libc.so.6[0x3c48c32920]
/lib64/libc.so.6[0x3c48c7870a]
/usr/local/glusterfs/lib/libglusterfs.so.0(__gf_free+0x61)[0x7f80421665a9]
/usr/local/glusterfs/lib/libglusterfs.so.0(mem_put+0x212)[0x7f8042166fd8]
/usr/local/glusterfs/lib/glusterfs/3.4git/xlator/cluster/replicate.so(afr_writev_done+0xca)[0x7f803d8cf9ec]
/usr/local/glusterfs/lib/glusterfs/3.4git/xlator/cluster/replicate.so(+0x58d7f)[0x7f803d900d7f]
/usr/local/glusterfs/lib/glusterfs/3.4git/xlator/cluster/replicate.so(+0x58f09)[0x7f803d900f09]
/usr/local/glusterfs/lib/glusterfs/3.4git/xlator/cluster/replicate.so(+0x59214)[0x7f803d901214]
/usr/local/glusterfs/lib/glusterfs/3.4git/xlator/cluster/replicate.so(afr_unlock+0x57)[0x7f803d905aeb]
/usr/local/glusterfs/lib/glusterfs/3.4git/xlator/cluster/replicate.so(afr_changelog_post_op_cbk+0x10a)[0x7f803d8dd01f]
/usr/local/glusterfs/lib/glusterfs/3.4git/xlator/cluster/replicate.so(afr_changelog_post_op_now+0x8c7)[0x7f803d8ddebf]
/usr/local/glusterfs/lib/glusterfs/3.4git/xlator/cluster/replicate.so(afr_delayed_changelog_post_op+0x16e)[0x7f803d8e1f36]
/usr/local/glusterfs/lib/glusterfs/3.4git/xlator/cluster/replicate.so(afr_changelog_post_op+0x59)[0x7f803d8e1f99]
/usr/local/glusterfs/lib/glusterfs/3.4git/xlator/cluster/replicate.so(afr_transaction_resume+0x87)[0x7f803d8e205e]
/usr/local/glusterfs/lib/glusterfs/3.4git/xlator/cluster/replicate.so(afr_writev_wind_cbk+0x348)[0x7f803d8cf468]
/usr/local/glusterfs/lib/glusterfs/3.4git/xlator/protocol/client.so(client3_3_writev_cbk+0x490)[0x7f803db53397]
/usr/local/glusterfs/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0x1b5)[0x7f8041f14759]
/usr/local/glusterfs/lib/libgfrpc.so.0(rpc_clnt_notify+0x2d3)[0x7f8041f14af0]
/usr/local/glusterfs/lib/libgfrpc.so.0(rpc_transport_notify+0x110)[0x7f8041f1118c]
/usr/local/glusterfs/lib/glusterfs/3.4git/rpc-transport/socket.so(socket_event_poll_in+0x54)[0x7f803e9a40a9]
/usr/local/glusterfs/lib/glusterfs/3.4git/rpc-transport/socket.so(socket_event_handler+0x1c4)[0x7f803e9a4558]
/usr/local/glusterfs/lib/libglusterfs.so.0(+0x72441)[0x7f8042190441]
/usr/local/glusterfs/lib/libglusterfs.so.0(+0x72630)[0x7f8042190630]
/usr/local/glusterfs/lib/libglusterfs.so.0(event_dispatch+0x6c)[0x7f8042165af3]
/usr/local/glusterfs/sbin/glusterfs(main+0x2c7)[0x408503]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3c48c1ecdd]
/usr/local/glusterfs/sbin/glusterfs[0x404649]
---------

It rather looks like the NFS code isn't freeing NULL frames from the
frame stack (if those words are right :D) once it's done replying to them.

Yes, Oracle does send quite a few of them. Up until then, it was behaving
REALLY well :)
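
For anyone who wants to tally the pending frames in their own log rather
than count by hand, here is a minimal sketch. It assumes the dump uses the
"frame : type(N) op(N)" line format shown above; the function names are
made up for illustration and are not part of glusterfs.

```python
import re
from collections import Counter

# Matches lines like "frame : type(0) op(0)" as they appear in the crash dump.
# The format is taken from the log above; other versions may print it differently.
FRAME_RE = re.compile(r"^frame : type\((\d+)\) op\((\d+)\)$")


def count_pending_frames(lines):
    """Return a Counter keyed by (type, op) for every pending-frame line."""
    counts = Counter()
    for line in lines:
        match = FRAME_RE.match(line.strip())
        if match:
            key = (int(match.group(1)), int(match.group(2)))
            counts[key] += 1
    return counts


def summarize(counts):
    """Format the counts as one line per (type, op) pair, most frequent first."""
    return [
        f"{n} copies of frame : type({t}) op({o})"
        for (t, o), n in counts.most_common()
    ]
```

To use it, pass any iterable of lines, for example an open file handle on
the NFS log, to count_pending_frames() and print the result of summarize().
A large count for type(0) op(0) would be consistent with the guess above,
though it does not by itself show where the frames are leaked.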

M.

-- 
Michael Brown               | `One of the main causes of the fall of
Systems Consultant          | the Roman Empire was that, lacking zero,
Net Direct Inc.             | they had no way to indicate successful
☎: +1 519 883 1172 x5106    | termination of their C programs.' - Firth




