[Gluster-devel] glusterfsd crash due to page allocation failure
David Robinson
david.robinson at corvidtec.com
Tue Dec 22 17:19:07 UTC 2015
Sure... I'll set up a watch command to run this at some interval and send
the files.
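
Something along these lines is what I had in mind -- the volume name,
destination directory, and 900-second interval below are only
placeholders, not the values we actually use:

    #!/bin/bash
    # Rough sketch: trigger a brick statedump at a fixed interval and
    # copy the dump files somewhere safe before the next crash.
    # VOLNAME, DEST, and the sleep interval are placeholders.
    VOLNAME="<volname>"
    DEST=/root/statedumps
    mkdir -p "$DEST"
    while true; do
        gluster volume statedump "$VOLNAME"
        # statedump files are written to /var/run/gluster/ on each brick host
        cp -p /var/run/gluster/*.dump.* "$DEST"/ 2>/dev/null
        sleep 900
    done

A plain 'watch -n 900 gluster volume statedump <volname>' or a cron entry
would work just as well; the loop only makes it easier to archive the dump
files as they appear.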
David
------ Original Message ------
From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
To: "David Robinson" <drobinson at corvidtec.com>; "Glomski, Patrick"
<patrick.glomski at corvidtec.com>; gluster-devel at gluster.org;
gluster-users at gluster.org
Cc: "David Robinson" <david.robinson at corvidtec.com>
Sent: 12/22/2015 12:11:35 PM
Subject: Re: [Gluster-devel] glusterfsd crash due to page allocation
failure
>
>
>On 12/22/2015 09:10 PM, David Robinson wrote:
>>Pranith,
>>
>>This issue continues to happen. If you could provide instructions for
>>collecting the statedump you need, I would be happy to send that
>>information. I am not sure how to get a statedump just before the
>>crash, since the crash is intermittent.
>Command: gluster volume statedump <volname>
>
>This generates statedump files in the /var/run/gluster/ directory. Do you
>think you can execute this command at some interval 'X' until the crash
>is hit? Post these files, and hopefully that will be enough to fix the
>problem.
>
>Pranith
>>
>>David
>>
>>
>>------ Original Message ------
>>From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>>To: "Glomski, Patrick" <patrick.glomski at corvidtec.com>;
>>gluster-devel at gluster.org; gluster-users at gluster.org
>>Cc: "David Robinson" <david.robinson at corvidtec.com>
>>Sent: 12/21/2015 11:59:33 PM
>>Subject: Re: [Gluster-devel] glusterfsd crash due to page allocation
>>failure
>>
>>>hi Glomski,
>>> This is the second time I have heard about memory allocation
>>>problems in 3.7.6, but this time on the brick side. Are you able to
>>>recreate this issue? Will it be possible to get statedumps of the
>>>brick processes just before they crash?
>>>
>>>Pranith
>>>
>>>On 12/22/2015 02:25 AM, Glomski, Patrick wrote:
>>>>Hello,
>>>>
>>>>We've recently upgraded from gluster 3.6.6 to 3.7.6 and have started
>>>>encountering dmesg page allocation errors (stack trace is appended).
>>>>
>>>>It appears that glusterfsd now sometimes fills up the cache
>>>>completely and crashes with a page allocation failure. I *believe*
>>>>it mainly happens when copying lots of new data to the system,
>>>>running a 'find', or similar. Hosts are all Scientific Linux 6.6 and
>>>>these errors occur consistently on two separate gluster pools.
>>>>
>>>>Has anyone else seen this issue and are there any known fixes for it
>>>>via sysctl kernel parameters or other means?
>>>>
>>>>Please let me know of any other diagnostic information that would
>>>>help.
>>>>
>>>>Thanks,
>>>>Patrick
>>>>
>>>>
>>>>>[1458118.134697] glusterfsd: page allocation failure. order:5, mode:0x20
>>>>>[1458118.134701] Pid: 6010, comm: glusterfsd Not tainted 2.6.32-573.3.1.el6.x86_64 #1
>>>>>[1458118.134702] Call Trace:
>>>>>[1458118.134714] [<ffffffff8113770c>] ? __alloc_pages_nodemask+0x7dc/0x950
>>>>>[1458118.134728] [<ffffffffa0321800>] ? mlx4_ib_post_send+0x680/0x1f90 [mlx4_ib]
>>>>>[1458118.134733] [<ffffffff81176e92>] ? kmem_getpages+0x62/0x170
>>>>>[1458118.134735] [<ffffffff81177aaa>] ? fallback_alloc+0x1ba/0x270
>>>>>[1458118.134736] [<ffffffff811774ff>] ? cache_grow+0x2cf/0x320
>>>>>[1458118.134738] [<ffffffff81177829>] ? ____cache_alloc_node+0x99/0x160
>>>>>[1458118.134743] [<ffffffff8145f732>] ? pskb_expand_head+0x62/0x280
>>>>>[1458118.134744] [<ffffffff81178479>] ? __kmalloc+0x199/0x230
>>>>>[1458118.134746] [<ffffffff8145f732>] ? pskb_expand_head+0x62/0x280
>>>>>[1458118.134748] [<ffffffff8146001a>] ? __pskb_pull_tail+0x2aa/0x360
>>>>>[1458118.134751] [<ffffffff8146f389>] ? harmonize_features+0x29/0x70
>>>>>[1458118.134753] [<ffffffff8146f9f4>] ? dev_hard_start_xmit+0x1c4/0x490
>>>>>[1458118.134758] [<ffffffff8148cf8a>] ? sch_direct_xmit+0x15a/0x1c0
>>>>>[1458118.134759] [<ffffffff8146ff68>] ? dev_queue_xmit+0x228/0x320
>>>>>[1458118.134762] [<ffffffff8147665d>] ? neigh_connected_output+0xbd/0x100
>>>>>[1458118.134766] [<ffffffff814abc67>] ? ip_finish_output+0x287/0x360
>>>>>[1458118.134767] [<ffffffff814abdf8>] ? ip_output+0xb8/0xc0
>>>>>[1458118.134769] [<ffffffff814ab04f>] ? __ip_local_out+0x9f/0xb0
>>>>>[1458118.134770] [<ffffffff814ab085>] ? ip_local_out+0x25/0x30
>>>>>[1458118.134772] [<ffffffff814ab580>] ? ip_queue_xmit+0x190/0x420
>>>>>[1458118.134773] [<ffffffff81137059>] ? __alloc_pages_nodemask+0x129/0x950
>>>>>[1458118.134776] [<ffffffff814c0c54>] ? tcp_transmit_skb+0x4b4/0x8b0
>>>>>[1458118.134778] [<ffffffff814c319a>] ? tcp_write_xmit+0x1da/0xa90
>>>>>[1458118.134779] [<ffffffff81178cbd>] ? __kmalloc_node+0x4d/0x60
>>>>>[1458118.134780] [<ffffffff814c3a80>] ? tcp_push_one+0x30/0x40
>>>>>[1458118.134782] [<ffffffff814b410c>] ? tcp_sendmsg+0x9cc/0xa20
>>>>>[1458118.134786] [<ffffffff8145836b>] ? sock_aio_write+0x19b/0x1c0
>>>>>[1458118.134788] [<ffffffff814581d0>] ? sock_aio_write+0x0/0x1c0
>>>>>[1458118.134791] [<ffffffff8119169b>] ? do_sync_readv_writev+0xfb/0x140
>>>>>[1458118.134797] [<ffffffff810a14b0>] ? autoremove_wake_function+0x0/0x40
>>>>>[1458118.134801] [<ffffffff8123e92f>] ? selinux_file_permission+0xbf/0x150
>>>>>[1458118.134804] [<ffffffff812316d6>] ? security_file_permission+0x16/0x20
>>>>>[1458118.134806] [<ffffffff81192746>] ? do_readv_writev+0xd6/0x1f0
>>>>>[1458118.134807] [<ffffffff811928a6>] ? vfs_writev+0x46/0x60
>>>>>[1458118.134809] [<ffffffff811929d1>] ? sys_writev+0x51/0xd0
>>>>>[1458118.134812] [<ffffffff810e88ae>] ? __audit_syscall_exit+0x25e/0x290
>>>>>[1458118.134816] [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
>>>>
>>>>
>>>>
>>>
>