[Gluster-users] Kernel oopses with gluster fuse on squeeze

Philip Poten philip.poten at gmail.com
Sun Dec 4 20:59:00 UTC 2011


Hi,

We've been experiencing repeated oopses (roughly every other day) on hosts
with heavy glusterfs access load. Co-occurring with them (not immediately,
but there is some sort of connection) are hanging nginx processes (the ones
doing the accessing), which cannot be stopped or killed and which also block
the shutdown of the respective OpenVZ instance. I think I remember at least
one instance where this occurred without a preceding oops. The only way to
get things working again is a reboot.

A bit of googling led me to believe that the problem here might be with
fuse, not glusterfs. However, the kernel bug that covers this specific
problem is not available, since the kernel.org bugzilla is down. Also, there
are no indications in gluster.log as to what may have caused this.

We're running Gluster 3.2.1 from the provided Debian packages. The things
I've tried so far are using an older kernel (lenny backports) than the
squeeze one, and disabling swap (since the process that oopses is kswapd).
The solution currently being tried is using NFS for the high-load hosts; I
hope this doesn't crash the gluster server :).

This mail is more of a JFYI than a bug report - perhaps somebody else has
seen this problem too and can provide more insight.

Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.037978] CPU 3
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.038004] Modules linked in:
vzethdev vznetdev simfs vzrst vzcpt vzdquota vzmon vzdev xt_tcpudp
xt_length xt_hl xt_tcpmss xt_TCPMSS iptable_mangle iptable_filter
xt_multiport xt_limit xt_dscp ipt_REJECT ip_tables x_tables ipmi_devintf
ipmi_si ipmi_msghandler nfs lockd fscache nfs_acl auth_rpcgss sunrpc 8021q
garp bridge stp fuse loop snd_pcm snd_timer snd soundcore snd_page_alloc
psmouse dcdbas pcspkr serio_raw joydev evdev power_meter button processor
ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot dm_mod raid1
md_mod sd_mod crc_t10dif sg sr_mod cdrom usbhid hid ata_generic ata_piix
uhci_hcd ehci_hcd mptsas mptscsih mptbase scsi_transport_sas libata usbcore
nls_base scsi_mod bnx2 thermal fan thermal_sys
[last unloaded: scsi_wait_scan]
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.038545] Pid: 48, comm:
kswapd1 Not tainted 2.6.32-bpo.5-openvz-amd64 #1 feoktistov PowerEdge R410
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.038602] RIP:
0010:[<ffffffff81102ade>]  [<ffffffff81102ade>] clear_inode+0x1b/0xd0
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.038665] RSP:
0018:ffff88083c919c30  EFLAGS: 00010202
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.038697] RAX:
0000000000000000 RBX: ffff8805d8f9b000 RCX: ffff88083c919c90
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.038734] RDX:
0000000000000000 RSI: ffffea0016f46000 RDI: ffff8805d8f9b000
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.038770] RBP:
0000000000000000 R08: ffffffffffffffc0 R09: 0000000000000000
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.038806] R10:
0000000000000040 R11: 0000000000000002 R12: dead000000100100
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.038842] R13:
ffff88083c919c90 R14: ffff8804314d3cf0 R15: ffff88083c919d24
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.038879] FS:
 0000000000000000(0000) GS:ffff880011a20000(0000) knlGS:0000000000000000
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.038933] CS:  0010 DS: 0018
ES: 0018 CR0: 000000008005003b
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.038966] CR2:
00007fe22466054c CR3: 0000000001001000 CR4: 00000000000006e0
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039003] DR0:
0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039039] DR3:
0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039076] Process kswapd1
(pid: 48, veid=0, threadinfo ffff88083c918000, task ffff88043d082000)
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039155]  ffff8805d8f9b000
ffffffff81103568 ffff8805d0213be0 ffff8805d0213bd8
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039199] <0>
ffff8805d0213be0 ffffffff810ff58b 0000000000000100 ffff8805d0213bd8
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039263] <0>
ffff8804314d3c00 ffffffff810ff820 0000000000000000 0000000000000008
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039375]
 [<ffffffff81103568>] ? generic_delete_inode+0xec/0x160
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039411]
 [<ffffffff810ff58b>] ? d_kill+0x40/0x61
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039443]
 [<ffffffff810ff820>] ? __shrink_dcache_sb+0x274/0x30b
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039479]
 [<ffffffff810ff9bc>] ? shrink_dcache_memory+0x105/0x216
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039516]
 [<ffffffff810c21c8>] ? shrink_slab+0x10e/0x189
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039550]
 [<ffffffff810c2a5d>] ? kswapd+0x4c3/0x659
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039583]
 [<ffffffff810bffa3>] ? isolate_pages_global+0x0/0x1ff
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039620]
 [<ffffffff8106680a>] ? autoremove_wake_function+0x0/0x2e
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039655]
 [<ffffffff810c259a>] ? kswapd+0x0/0x659
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039687]
 [<ffffffff8106653e>] ? kthread+0xc0/0xca
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039722]
 [<ffffffff81011c6a>] ? child_rip+0xa/0x20
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039754]
 [<ffffffff8106647e>] ? kthread+0x0/0xca
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.039786]
 [<ffffffff81011c60>] ? child_rip+0x0/0x20
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.040100]  RSP
<ffff88083c919c30>
Dec  2 19:46:50 hn-r410-openvz01 kernel: [ 9903.040496] ---[ end trace
ddd95abf24096674 ]---
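
For anyone who wants to compare this trace with their own logs, here is a
minimal Python sketch that pulls the symbol+offset/size frames out of an
oops like the one above. The names FRAME_RE and extract_frames are made up
for this illustration and are not part of any gluster or kernel tooling, and
the pattern only assumes the "[<address>] ? symbol+0xOFF/0xSIZE" layout seen
in this particular log.

```python
import re

# Matches one frame such as
#   [<ffffffff81103568>] ? generic_delete_inode+0xec/0x160
# The "?" marker is optional, and \s+ also spans the line break that the
# mail client inserted between the syslog prefix and the frame itself.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?:\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def extract_frames(log_text):
    """Return (symbol, offset, size) for every frame found in log_text."""
    return [
        (m.group("symbol"), int(m.group("offset"), 16), int(m.group("size"), 16))
        for m in FRAME_RE.finditer(log_text)
    ]


# A few lines copied from the oops above, syslog prefixes omitted.
sample = """\
RIP: 0010:[<ffffffff81102ade>]  [<ffffffff81102ade>] clear_inode+0x1b/0xd0
 [<ffffffff81103568>] ? generic_delete_inode+0xec/0x160
 [<ffffffff810ff58b>] ? d_kill+0x40/0x61
 [<ffffffff810ff820>] ? __shrink_dcache_sb+0x274/0x30b
"""

for symbol, offset, size in extract_frames(sample):
    print(f"{symbol} (+{offset:#x} of {size:#x})")
```

Running the extracted symbols through sort and uniq across several oopses
makes it easier to see whether they all pass through the same dcache/inode
shrinking path.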