[Bugs] [Bug 1401021] New: OOM kill of nfs-ganesha on one node while fs-sanity test suite is executed.
bugzilla at redhat.com
Fri Dec 2 15:06:15 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1401021
Bug ID: 1401021
Summary: OOM kill of nfs-ganesha on one node while fs-sanity
test suite is executed.
Product: GlusterFS
Version: 3.9
Component: distribute
Keywords: Triaged
Severity: urgent
Assignee: bugs at gluster.org
Reporter: jthottan at redhat.com
CC: aloganat at redhat.com, bugs at gluster.org,
jthottan at redhat.com, kkeithle at redhat.com,
mzywusko at redhat.com, ndevos at redhat.com,
rhs-bugs at redhat.com, sbhaloth at redhat.com,
skoduri at redhat.com, sraj at redhat.com,
storage-qa-internal at redhat.com
Depends On: 1381452, 1397052
+++ This bug was initially created as a clone of Bug #1397052 +++
+++ This bug was initially created as a clone of Bug #1381452 +++
Description of problem:
OOM kill of nfs-ganesha on one node while the posix_compliance test suite is
executed.
Version-Release number of selected component (if applicable):
[root at dhcp42-59 ~]# rpm -qa|grep ganesha
nfs-ganesha-2.4.0-2.el6rhs.x86_64
nfs-ganesha-gluster-2.4.0-2.el6rhs.x86_64
glusterfs-ganesha-3.8.4-2.el6rhs.x86_64
How reproducible:
Once
Steps to Reproduce:
1. Create a ganesha cluster, create a volume, and enable ganesha on it.
2. Mount the volume with vers=4 on the client and start executing the
posix_compliance test suite (a setup sketch is included under "Additional info"
below).
3. Observe that once the posix_compliance test suite finishes, ganesha gets
OOM-killed on the node where the volume is mounted, with the following messages
in dmesg:
pcs invoked oom-killer: gfp_mask=0x280da, order=0, oom_adj=0, oom_score_adj=0
pcs cpuset=/ mems_allowed=0
Pid: 3248, comm: pcs Not tainted 2.6.32-642.4.2.el6.x86_64 #1
Call Trace:
[<ffffffff81131420>] ? dump_header+0x90/0x1b0
[<ffffffff8123bfec>] ? security_real_capable_noaudit+0x3c/0x70
[<ffffffff811318a2>] ? oom_kill_process+0x82/0x2a0
[<ffffffff811317e1>] ? select_bad_process+0xe1/0x120
[<ffffffff81131ce0>] ? out_of_memory+0x220/0x3c0
[<ffffffff8113e6bc>] ? __alloc_pages_nodemask+0x93c/0x950
[<ffffffff81177a0a>] ? alloc_pages_vma+0x9a/0x150
[<ffffffff81159d8d>] ? handle_pte_fault+0x73d/0xb20
[<ffffffff810567c7>] ? pte_alloc_one+0x37/0x50
[<ffffffff81193f79>] ? do_huge_pmd_anonymous_page+0xb9/0x3b0
[<ffffffff8115a409>] ? handle_mm_fault+0x299/0x3d0
[<ffffffff81052156>] ? __do_page_fault+0x146/0x500
[<ffffffff811609d5>] ? do_mmap_pgoff+0x335/0x380
[<ffffffff8154f03e>] ? do_page_fault+0x3e/0xa0
[<ffffffff8154c345>] ? page_fault+0x25/0x30
Mem-Info:
Node 0 DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
CPU 2: hi: 0, btch: 1 usd: 0
CPU 3: hi: 0, btch: 1 usd: 0
Node 0 DMA32 per-cpu:
CPU 0: hi: 186, btch: 31 usd: 0
CPU 1: hi: 186, btch: 31 usd: 0
CPU 2: hi: 186, btch: 31 usd: 0
CPU 3: hi: 186, btch: 31 usd: 0
Node 0 Normal per-cpu:
CPU 0: hi: 186, btch: 31 usd: 0
CPU 1: hi: 186, btch: 31 usd: 0
CPU 2: hi: 186, btch: 31 usd: 30
CPU 3: hi: 186, btch: 31 usd: 92
active_anon:1618270 inactive_anon:292236 isolated_anon:0
active_file:0 inactive_file:12 isolated_file:0
unevictable:22804 dirty:13 writeback:0 unstable:0
free:25293 slab_reclaimable:4863 slab_unreclaimable:20291
mapped:12015 shmem:14677 pagetables:7894 bounce:0
Node 0 DMA free:15716kB min:124kB low:152kB high:184kB active_anon:0kB
inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB
isolated(anon):0kB isolated(file):0kB present:15320kB mlocked:0kB dirty:0kB
writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB
kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB
pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 3512 8057 8057
Node 0 DMA32 free:47568kB min:29404kB low:36752kB high:44104kB
active_anon:2723100kB inactive_anon:544028kB active_file:4kB inactive_file:28kB
unevictable:24kB isolated(anon):0kB isolated(file):0kB present:3596500kB
mlocked:24kB dirty:4kB writeback:0kB mapped:328kB shmem:4kB
slab_reclaimable:48kB slab_unreclaimable:196kB kernel_stack:0kB
pagetables:8600kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:64
all_unreclaimable? yes
lowmem_reserve[]: 0 0 4545 4545
Node 0 Normal free:37888kB min:38052kB low:47564kB high:57076kB
active_anon:3749980kB inactive_anon:624916kB active_file:0kB inactive_file:20kB
unevictable:91192kB isolated(anon):0kB isolated(file):0kB present:4654080kB
mlocked:62576kB dirty:48kB writeback:0kB mapped:47732kB shmem:58704kB
slab_reclaimable:19404kB slab_unreclaimable:80968kB kernel_stack:11680kB
pagetables:22976kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:313
all_unreclaimable? yes
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 3*4kB 1*8kB 1*16kB 2*32kB 2*64kB 1*128kB 0*256kB 0*512kB 1*1024kB
1*2048kB 3*4096kB = 15716kB
Node 0 DMA32: 376*4kB 137*8kB 37*16kB 31*32kB 33*64kB 5*128kB 4*256kB 4*512kB
1*1024kB 2*2048kB 8*4096kB = 47896kB
Node 0 Normal: 8430*4kB 0*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB
1*1024kB 1*2048kB 0*4096kB = 37800kB
39246 total pagecache pages
22008 pages in swap cache
Swap cache stats: add 1306301, delete 1284293, find 109183/161893
Free swap = 0kB
Total swap = 3145724kB
2097151 pages RAM
82361 pages reserved
28737 pages shared
1970987 pages non-shared
[ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
[ 641] 0 641 2776 105 1 -17 -1000 udevd
[ 1362] 0 1362 2281 83 2 0 0 dhclient
[ 1426] 0 1426 6900 158 2 -17 -1000 auditd
[ 1494] 0 1494 4578 139 2 0 0 irqbalance
[ 1512] 32 1512 4746 148 2 0 0 rpcbind
[ 1568] 81 1568 24364 226 2 0 0 dbus-daemon
[ 1590] 0 1590 47244 238 0 0 0 cupsd
[ 1622] 0 1622 1021 133 0 0 0 acpid
[ 1634] 68 1634 9500 305 2 0 0 hald
[ 1635] 0 1635 5101 132 0 0 0 hald-runner
[ 1667] 0 1667 5631 118 3 0 0 hald-addon-inpu
[ 1674] 68 1674 4503 161 3 0 0 hald-addon-acpi
[ 1701] 0 1701 96537 251 0 0 0 automount
[ 6072] 0 6072 1698 35 1 0 0 mcelog
[ 6089] 0 6089 16560 89 2 -17 -1000 sshd
[ 6168] 0 6168 20226 287 2 0 0 master
[ 6172] 89 6172 20289 273 2 0 0 qmgr
[ 6197] 0 6197 45773 220 2 0 0 abrtd
[ 6224] 0 6224 5278 71 2 0 0 atd
[ 6238] 0 6238 25235 113 1 0 0 rhnsd
[ 6250] 0 6250 27088 99 2 0 0 rhsmcertd
[ 6267] 0 6267 16092 105 2 0 0 certmonger
[ 6312] 0 6312 1017 115 2 0 0 mingetty
[ 6314] 0 6314 1017 115 2 0 0 mingetty
[ 6316] 0 6316 1017 115 0 0 0 mingetty
[ 6318] 0 6318 1017 115 2 0 0 mingetty
[ 6320] 0 6320 1017 115 0 0 0 mingetty
[ 6322] 0 6322 1017 115 2 0 0 mingetty
[ 8731] 0 8731 29221 197 2 0 0 crond
[ 8756] 0 8756 62806 244 0 0 0 rsyslogd
[ 9346] 0 9346 167548 405 0 0 0 glusterd
[10370] 0 10370 373001 555 1 0 0 glusterfsd
[10473] 0 10473 211916 436 3 0 0 glusterfs
[11607] 0 11607 15925 197 3 0 0 check_gluster_s
[25769] 0 25769 175887 7686 0 -17 -1000 dmeventd
[27899] 0 27899 44284 284 2 0 0 tuned
[ 8055] 0 8055 5852 186 1 0 0 rpc.statd
[26157] 0 26157 25553 340 2 0 0 sshd
[26258] 0 26258 27089 204 2 0 0 bash
[29145] 0 29145 320209 3119 0 0 0 glusterfsd
[29165] 0 29165 320210 3635 0 0 0 glusterfsd
[29185] 0 29185 319695 4118 3 0 0 glusterfsd
[29206] 0 29206 263494 945 1 0 0 glusterfs
[31559] 0 31559 2973 100 0 -17 -1000 udevd
[31904] 0 31904 3793222 1859244 1 0 0 ganesha.nfsd
[32496] 0 32496 2972 97 2 -17 -1000 udevd
[ 444] 0 444 152050 19556 2 0 0 corosync
[ 510] 0 510 49291 143 2 -16 -941 fenced
[ 525] 0 525 52509 138 1 -16 -941 dlm_controld
[ 589] 0 589 32284 133 2 -16 -941 gfs_controld
[ 751] 0 751 20100 313 2 0 0 pacemakerd
[ 758] 189 758 23995 1230 2 0 0 cib
[ 759] 0 759 23889 346 2 0 0 stonithd
[ 760] 0 760 15627 365 2 0 0 lrmd
[ 761] 189 761 21431 392 2 0 0 attrd
[ 762] 189 762 29577 740 0 0 0 pengine
[ 763] 0 763 34234 737 2 0 0 crmd
[ 790] 0 790 35543 10087 3 0 0 ruby
[ 3248] 0 3248 45176 1674 2 0 0 pcs
Out of memory: Kill process 31904 (ganesha.nfsd) score 884 or sacrifice child
Killed process 31904, UID 0, (ganesha.nfsd) total-vm:15172888kB,
anon-rss:7435736kB, file-rss:1240kB
Actual results:
OOM kill of nfs-ganesha on one node while the posix_compliance test suite is
executed.
Expected results:
There should not be any OOM kills.
Additional info:
sosreports will be attached.
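As a rough setup sketch for the steps above (the volume name "testvol", brick
paths, and server/client hostnames are assumptions, not taken from this report):
# On the storage nodes: create and start a 6x2 distributed-replicate volume,
# then enable nfs-ganesha and export the volume (assumes the ganesha-ha.conf
# HA cluster configuration is already in place)
gluster volume create testvol replica 2 \
    server{1,2}:/bricks/b1/testvol server{1,2}:/bricks/b2/testvol \
    server{1,2}:/bricks/b3/testvol server{1,2}:/bricks/b4/testvol \
    server{1,2}:/bricks/b5/testvol server{1,2}:/bricks/b6/testvol
gluster volume start testvol
gluster nfs-ganesha enable
gluster volume set testvol ganesha.enable on
# On the client: mount over NFSv4 and run the posix_compliance suite
mount -t nfs -o vers=4 server1:/testvol /mnt/testvol
cd /mnt/testvol && prove -r /opt/qa/tools/posix-testsuite/tests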
--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-10-04
03:01:08 EDT ---
This bug is automatically being proposed for the current release of Red Hat
Gluster Storage 3 under active development, by setting the release flag
'rhgs-3.2.0' to '?'.
If this bug should be proposed for a different release, please manually change
the proposed release flag.
--- Additional comment from Shashank Raj on 2016-10-04 06:05:30 EDT ---
sosreports and logs can be accessed at
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1381452
--- Additional comment from Soumya Koduri on 2016-10-05 01:57:50 EDT ---
Shashank,
Please turn off features.cache-invalidation for that volume and re-run the
tests. If the oom_score of the ganesha process is still high after the tests
complete, please tune the number of nfs-ganesha worker threads to 16 using the
config option below and re-try the tests.
NFS_Core_Param
{
    Nb_Worker = 16;
}
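As a concrete sketch of applying the two knobs above (the volume name "testvol"
is an assumption):
# Disable cache invalidation on the volume under test
gluster volume set testvol features.cache-invalidation off
# Add the NFS_Core_Param block above to the ganesha configuration
# (typically /etc/ganesha/ganesha.conf) and restart the service
service nfs-ganesha restart   # or: systemctl restart nfs-ganesha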
--- Additional comment from Shashank Raj on 2016-10-05 04:37:24 EDT ---
Tried running posix_compliance again with features.cache-invalidation both
on and off, and I am not able to reproduce this issue again.
So it seems some other test in fs-sanity is the culprit for this issue.
Will keep trying it and update the bug accordingly. For now, changing the bug
title as appropriate.
--- Additional comment from John Skeoch on 2016-11-07 22:54:17 EST ---
User sraj at redhat.com's account has been closed
--- Additional comment from John Skeoch on 2016-11-07 22:57:23 EST ---
User sraj at redhat.com's account has been closed
--- Additional comment from Soumya Koduri on 2016-11-08 04:16:26 EST ---
(In reply to Shashank Raj from comment #4)
> Tried running posix_compliance again with both features.cache-invalidation
> on/off and i am not able to reproduce this issue again.
>
> So it seems some other test in fs-sanity is the culprit for this issue.
>
> Will keep trying it and update bug accordingly. For now changing the bug
> title as appropriate.
Surabhi,
Could you please check the same and update the bug with the details of the test
which may be causing this issue.
--- Additional comment from Arthy Loganathan on 2016-11-17 09:59:22 EST ---
For a 6x2 volume, while executing posix_compliance tests, ganesha always gets
OOM-killed on the mounted node once the oom_score reaches ~870.
sosreports are at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1381452/
--- Additional comment from Soumya Koduri on 2016-11-17 10:01:58 EST ---
Does that mean the issue is not seen with a smaller number of replica bricks
(for example, a 2x2 volume configuration)?
--- Additional comment from Arthy Loganathan on 2016-11-17 22:37:04 EST ---
I have tried volumes with fewer bricks, such as a plain distribute volume with 2
bricks and a 1x2 volume, and the issue is not seen.
As Jiffin suggested, I executed the following test with the 6x2 volume:
prove -vf /opt/qa/tools/posix-testsuite/tests/rename/00.t
and the oom_score increases drastically while this test is running.
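One way to watch that growth while 00.t runs (a hedged aside, not a command
from this report):
watch -n 5 'cat /proc/$(pgrep -x ganesha.nfsd)/oom_score'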
dmesg:
[248560.640500] Call Trace:
[248560.640511] [<ffffffff81685eac>] dump_stack+0x19/0x1b
[248560.640516] [<ffffffff81680e57>] dump_header+0x8e/0x225
[248560.640523] [<ffffffff812ae71b>] ? cred_has_capability+0x6b/0x120
[248560.640530] [<ffffffff8113cb03>] ? delayacct_end+0x33/0xb0
[248560.640537] [<ffffffff8118460e>] oom_kill_process+0x24e/0x3c0
[248560.640542] [<ffffffff810936ce>] ? has_capability_noaudit+0x1e/0x30
[248560.640545] [<ffffffff81184e46>] out_of_memory+0x4b6/0x4f0
[248560.640548] [<ffffffff81681960>] __alloc_pages_slowpath+0x5d7/0x725
[248560.640552] [<ffffffff8118af55>] __alloc_pages_nodemask+0x405/0x420
[248560.640556] [<ffffffff811cf10a>] alloc_pages_current+0xaa/0x170
[248560.640563] [<ffffffff8106a587>] pte_alloc_one+0x17/0x40
[248560.640568] [<ffffffff811adb23>] __pte_alloc+0x23/0x170
[248560.640571] [<ffffffff811b1535>] handle_mm_fault+0xe25/0xfe0
[248560.640574] [<ffffffff811b76d5>] ? do_mmap_pgoff+0x305/0x3c0
[248560.640579] [<ffffffff81691994>] __do_page_fault+0x154/0x450
[248560.640581] [<ffffffff81691cc5>] do_page_fault+0x35/0x90
[248560.640584] [<ffffffff8168df88>] page_fault+0x28/0x30
[248560.640586] Mem-Info:
[248560.640591] active_anon:1620957 inactive_anon:292997 isolated_anon:0
active_file:0 inactive_file:974 isolated_file:0
unevictable:6562 dirty:0 writeback:0 unstable:0
slab_reclaimable:7116 slab_unreclaimable:13556
mapped:5683 shmem:8641 pagetables:7532 bounce:0
free:25150 free_pcp:474 free_cma:0
[248560.640595] Node 0 DMA free:15852kB min:132kB low:164kB high:196kB
active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15936kB
managed:15852kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB
slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB
unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[248560.640602] lowmem_reserve[]: 0 3327 7805 7805
[248560.640605] Node 0 DMA32 free:46200kB min:28752kB low:35940kB high:43128kB
active_anon:2727832kB inactive_anon:545796kB active_file:0kB
inactive_file:2536kB unevictable:16040kB isolated(anon):0kB isolated(file):0kB
present:3653620kB managed:3408880kB mlocked:16040kB dirty:0kB writeback:0kB
mapped:16596kB shmem:12260kB slab_reclaimable:9708kB slab_unreclaimable:24740kB
kernel_stack:4944kB pagetables:11448kB unstable:0kB bounce:0kB free_pcp:792kB
local_pcp:120kB free_cma:0kB writeback_tmp:0kB pages_scanned:285
all_unreclaimable? yes
[248560.640611] lowmem_reserve[]: 0 0 4478 4478
[248560.640613] Node 0 Normal free:38548kB min:38696kB low:48368kB high:58044kB
active_anon:3755996kB inactive_anon:626192kB active_file:0kB
inactive_file:1360kB unevictable:10208kB isolated(anon):0kB isolated(file):0kB
present:4718592kB managed:4585756kB mlocked:10208kB dirty:0kB writeback:0kB
mapped:6136kB shmem:22304kB slab_reclaimable:18756kB slab_unreclaimable:29484kB
kernel_stack:7872kB pagetables:18680kB unstable:0kB bounce:0kB free_pcp:1104kB
local_pcp:160kB free_cma:0kB writeback_tmp:0kB pages_scanned:1049
all_unreclaimable? yes
[248560.640618] lowmem_reserve[]: 0 0 0 0
[248560.640620] Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 1*32kB (U) 1*64kB (U)
1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) =
15852kB
[248560.640630] Node 0 DMA32: 1564*4kB (UE) 956*8kB (UE) 723*16kB (UEM)
388*32kB (UEM) 120*64kB (UEM) 5*128kB (EM) 0*256kB 0*512kB 0*1024kB 0*2048kB
0*4096kB = 46208kB
[248560.640639] Node 0 Normal: 1174*4kB (UEM) 1024*8kB (UEM) 691*16kB (UEM)
305*32kB (UEM) 53*64kB (UEM) 9*128kB (M) 1*256kB (M) 0*512kB 0*1024kB 0*2048kB
0*4096kB = 38504kB
[248560.640649] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0
hugepages_size=2048kB
[248560.640651] 18106 total pagecache pages
[248560.640653] 6146 pages in swap cache
[248560.640654] Swap cache stats: add 1107386, delete 1101240, find
294696/305552
[248560.640655] Free swap = 0kB
[248560.640656] Total swap = 2097148kB
[248560.640657] 2097037 pages RAM
[248560.640658] 0 pages HighMem/MovableOnly
[248560.640659] 94415 pages reserved
[248560.640660] [ pid ] uid tgid total_vm rss nr_ptes swapents
oom_score_adj name
[248560.640666] [ 685] 0 685 17664 2179 39 49
0 systemd-journal
[248560.640668] [ 716] 0 716 220817 676 46 1476
0 lvmetad
[248560.640671] [ 722] 0 722 11679 635 22 546
-1000 systemd-udevd
[248560.640675] [ 881] 0 881 179084 6113 49 0
-1000 dmeventd
[248560.640685] [ 1273] 0 1273 13854 234 26 89
-1000 auditd
[248560.640688] [ 1292] 0 1292 4826 217 14 37
0 irqbalance
[248560.640690] [ 1293] 81 1293 8197 262 17 71
-900 dbus-daemon
[248560.640692] [ 1296] 0 1296 6156 261 15 138
0 systemd-logind
[248560.640695] [ 1299] 998 1299 132067 351 54 1894
0 polkitd
[248560.640697] [ 1310] 997 1310 28962 310 26 42
0 chronyd
[248560.640699] [ 1311] 32 1311 16237 175 34 104
0 rpcbind
[248560.640701] [ 1322] 0 1322 50303 142 40 114
0 gssproxy
[248560.640704] [ 1334] 0 1334 82865 469 84 5904
0 firewalld
[248560.640706] [ 1691] 0 1691 28206 115 52 3081
0 dhclient
[248560.640708] [ 1785] 0 1785 28335 98 12 37
0 rhsmcertd
[248560.640710] [ 1787] 0 1787 138291 385 87 2567
0 tuned
[248560.640712] [ 1798] 0 1798 20617 91 42 190
-1000 sshd
[248560.640715] [ 1916] 0 1916 22244 222 42 238
0 master
[248560.640717] [ 1918] 89 1918 22287 245 44 236
0 qmgr
[248560.640719] [ 2334] 0 2334 31556 209 17 133
0 crond
[248560.640721] [ 2385] 0 2385 26978 101 8 37
0 rhnsd
[248560.640723] [ 2388] 0 2388 27509 164 10 33
0 agetty
[248560.640726] [17375] 29 17375 10605 230 24 177
0 rpc.statd
[248560.640728] [16763] 0 16763 72838 1270 59 105
0 rsyslogd
[248560.640730] [16951] 0 16951 151619 470 86 12040
0 glusterd
[248560.640733] [27747] 0 27747 428530 2595 125 10071
0 glusterfsd
[248560.640735] [27962] 0 27962 226969 5025 89 6433
0 glusterfs
[248560.640737] [29536] 0 29536 49589 2611 63 2017
0 corosync
[248560.640739] [29552] 0 29552 33157 377 64 1026
0 pacemakerd
[248560.640741] [29554] 189 29554 35595 2224 72 1416
0 cib
[248560.640744] [29555] 0 29555 34361 885 69 479
0 stonithd
[248560.640746] [29556] 0 29556 26273 371 52 228
0 lrmd
[248560.640748] [29557] 189 29557 31731 940 64 345
0 attrd
[248560.640750] [29558] 189 29558 38963 2038 71 241
0 pengine
[248560.640752] [29559] 189 29559 47014 2147 79 880
0 crmd
[248560.640754] [29577] 0 29577 244360 8064 98 2064
0 pcsd
[248560.640757] [ 6278] 0 6278 3262386 1857506 4856 406101
0 ganesha.nfsd
[248560.640759] [22343] 0 22343 35726 306 72 290
0 sshd
[248560.640761] [22358] 0 22358 28879 278 14 48
0 bash
[248560.640764] [27763] 0 27763 330732 1777 113 8853
0 glusterfsd
[248560.640767] [27785] 0 27785 330733 2281 115 9062
0 glusterfsd
[248560.640769] [27804] 0 27804 314090 3314 110 8708
0 glusterfsd
[248560.640771] [27836] 0 27836 255249 6445 106 14561
0 glusterfs
[248560.640773] [22453] 89 22453 22270 479 42 0
0 pickup
[248560.640776] [ 5088] 0 5088 35726 635 71 0
0 sshd
[248560.640778] [ 5111] 0 5111 28879 319 15 0
0 bash
[248560.640780] [11672] 0 11672 26984 136 10 0
0 tail
[248560.640782] [14710] 0 14710 35726 581 72 0
0 sshd
[248560.640784] [14745] 0 14745 28879 311 14 0
0 bash
[248560.640787] [15813] 0 15813 28910 333 14 0
0 ganesha_grace
[248560.640789] [15819] 0 15819 28910 180 10 0
0 ganesha_grace
[248560.640791] [15820] 0 15820 30197 552 62 0
0 crm_attribute
[248560.640793] [15821] 0 15821 28877 274 14 0
0 portblock
[248560.640795] [15824] 0 15824 28811 185 14 0
0 ganesha_mon
[248560.640797] [15825] 0 15825 28877 125 10 0
0 portblock
[248560.640800] [15826] 0 15826 28811 98 11 0
0 ganesha_mon
[248560.640801] [15827] 0 15827 26974 127 10 0
0 basename
[248560.640803] Out of memory: Kill process 6278 (ganesha.nfsd) score 870 or
sacrifice child
[248560.640886] Killed process 6278 (ganesha.nfsd) total-vm:13049544kB,
anon-rss:7430024kB, file-rss:0kB, shmem-rss:0kB
--- Additional comment from Worker Ant on 2016-11-21 09:07:18 EST ---
REVIEW: http://review.gluster.org/15894 (dht/rename : Incase of failure remove
linkto file properly) posted (#1) for review on master by jiffin tony Thottan
(jthottan at redhat.com)
--- Additional comment from Worker Ant on 2016-12-01 00:31:55 EST ---
REVIEW: http://review.gluster.org/15894 (dht/rename : Incase of failure remove
linkto file properly) posted (#2) for review on master by jiffin tony Thottan
(jthottan at redhat.com)
--- Additional comment from Jiffin on 2016-12-01 00:35:05 EST ---
The above issue happens when the rename/00.t test is executed on nfs-ganesha
clients. Steps executed in that script:
* create a file using root
* rename the file using a non-root user; it fails with EACCES
* delete the file
* create a directory using root
* rename the directory using a non-root user; the test hangs and slowly leads
to an OOM kill of ganesha
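A rough manual equivalent of that sequence, assuming a mount at /mnt/testvol
and an unprivileged user "testuser" (a sketch, not the actual 00.t script):
cd /mnt/testvol
touch abc                    # as root
su testuser -c 'mv abc xyz'  # fails with EACCES, as expected
rm -f abc
mkdir abc                    # as root
su testuser -c 'mv abc xyz'  # hangs; ganesha memory keeps growing until the OOM kill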
RCA put forward by Du for the OOM kill of ganesha:
Note that when we hit this bug, we have a scenario of a dentry being present as:
* a linkto file on one subvol
* a directory on the rest of the subvols
When a lookup happens on the dentry in such a scenario, the control flow goes
into an infinite loop of:
dht_lookup_everywhere
dht_lookup_everywhere_cbk
dht_lookup_unlink_cbk
dht_lookup_everywhere_done
dht_lookup_directory (as local->dir_count > 0)
dht_lookup_dir_cbk (sets local->need_selfheal = 1, as the entry is a linkto
file on one of the subvols)
dht_lookup_everywhere (as need_selfheal = 1).
This infinite loop can cause increased consumption of memory because:
1) dht_lookup_directory assigns a new layout to local->layout unconditionally
2) Most of the functions in this loop do a stack_wind of various fops.
This results in a growing call stack (note that the call stack is destroyed only
after the lookup response is received by FUSE, which never happens in this case).
--- Additional comment from Worker Ant on 2016-12-01 01:10:08 EST ---
REVIEW: http://review.gluster.org/15894 (dht/rename : Incase of failure remove
linkto file properly) posted (#3) for review on master by jiffin tony Thottan
(jthottan at redhat.com)
--- Additional comment from Worker Ant on 2016-12-01 10:47:38 EST ---
COMMIT: http://review.gluster.org/15894 committed in master by Vijay Bellur
(vbellur at redhat.com)
------
commit 57d59f4be205ae0c7888758366dc0049bdcfe449
Author: Jiffin Tony Thottan <jthottan at redhat.com>
Date: Mon Nov 21 18:08:14 2016 +0530
dht/rename : Incase of failure remove linkto file properly
Generally the linkto file is created as the root user. Consider the following
case: a user is trying to rename a file which they are not permitted to.
The rename fails with EACCES, and when rename tries to clean up the
linkto file, it fails.
The above issue happens when the rename/00.t test is executed on nfs-ganesha
clients:
Steps executed in the script
* create a file "abc" using root
* rename the file "abc" to "xyz" using a non-root user; it fails with
EACCES
* delete "abc"
* create directory "abc" using root
* again try to rename "abc" to "xyz" using a non-root user; the test hangs
here, which slowly leads to an OOM kill of the ganesha process
RCA put forward by Du for the OOM kill of ganesha
Note that when we hit this bug, we have a scenario of a dentry being
present as:
* a linkto file on one subvol
* a directory on the rest of the subvols
When a lookup happens on the dentry in such a scenario, the control flow
goes into an infinite loop of:
dht_lookup_everywhere
dht_lookup_everywhere_cbk
dht_lookup_unlink_cbk
dht_lookup_everywhere_done
dht_lookup_directory (as local->dir_count > 0)
dht_lookup_dir_cbk (sets local->need_selfheal = 1 as the entry is a
linkto file on one of the subvols)
dht_lookup_everywhere (as need_selfheal = 1).
This infinite loop can cause increased consumption of memory due to:
1) dht_lookup_directory assigns a new layout to local->layout
unconditionally
2) Most of the functions in this loop do a stack_wind of various fops.
This results in a growing call stack (note that the call stack is destroyed
only after the lookup response is received by FUSE, which never happens in
this case)
Thanks Du for root causing the oom kill and Sushant for suggesting the fix
Change-Id: I1e16bc14aa685542afbd21188426ecb61fd2689d
BUG: 1397052
Signed-off-by: Jiffin Tony Thottan <jthottan at redhat.com>
Reviewed-on: http://review.gluster.org/15894
NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
Smoke: Gluster Build System <jenkins at build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp at redhat.com>
--- Additional comment from Jiffin on 2016-12-02 09:37:08 EST ---
The patch got merged in upstream master.
Downstream patch link:
https://code.engineering.redhat.com/gerrit/#/c/92002/1
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1381452
[Bug 1381452] OOM kill of nfs-ganesha on one node while fs-sanity test
suite is executed.
https://bugzilla.redhat.com/show_bug.cgi?id=1397052
[Bug 1397052] OOM kill of nfs-ganesha on one node while fs-sanity test
suite is executed.
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.