[Bugs] [Bug 1229282] New: Disperse volume: Huge memory leak of glusterfsd process
bugzilla at redhat.com
Mon Jun 8 11:12:17 UTC 2015
https://bugzilla.redhat.com/show_bug.cgi?id=1229282
Bug ID: 1229282
Summary: Disperse volume: Huge memory leak of glusterfsd
process
Product: GlusterFS
Version: 3.7.0
Component: quota
Keywords: Triaged
Severity: urgent
Priority: high
Assignee: bugs at gluster.org
Reporter: vmallika at redhat.com
CC: bugs at gluster.org, byarlaga at redhat.com,
gluster-bugs at redhat.com, nsathyan at redhat.com,
pkarampu at redhat.com, rkavunga at redhat.com,
vmallika at redhat.com, xhernandez at datalab.es
Depends On: 1207735
Blocks: 1186580 (qe_tracker_everglades), 1224177
+++ This bug was initially created as a clone of Bug #1207735 +++
Description of problem:
=======================
There is a huge memory leak in the glusterfsd process with a disperse volume. Created
a plain disperse volume and converted it to distributed-disperse. There is no I/O
from the clients, yet the resident memory of the glusterfsd process grows up to
20 GB (as seen in top) and the system becomes unresponsive once all of the memory
is consumed.
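A rough reproduction sketch along the lines described above (volume name, hostnames
and brick paths are placeholders, not values taken from this report):

    # Create a plain disperse volume (2 data + 1 redundancy bricks).
    gluster volume create dispvol disperse 3 redundancy 1 \
        server1:/bricks/b1 server2:/bricks/b1 server3:/bricks/b1
    gluster volume start dispvol

    # Convert to distributed-disperse by adding a second disperse set,
    # then rebalance.
    gluster volume add-brick dispvol \
        server1:/bricks/b2 server2:/bricks/b2 server3:/bricks/b2
    gluster volume rebalance dispvol start

    # Watch resident memory (RES) of the brick processes.
    top -b -n 1 | grep glusterfsd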
Version-Release number of selected component (if applicable):
=============================================================
[root at vertigo geo-master]# gluster --version
glusterfs 3.7dev built on Mar 31 2015 01:05:54
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General
Public License.
Additional info:
================
Top output of node1:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9902 root 20 0 4321m 1.4g 2920 D 20.0 4.5 1:28.68 glusterfsd
10758 root 20 0 4321m 1.4g 2920 D 18.4 4.5 1:26.33 glusterfsd
10053 root 20 0 4961m 1.6g 2920 D 18.1 5.2 1:28.64 glusterfsd
10729 root 20 0 3681m 1.0g 2920 D 17.1 3.3 1:26.60 glusterfsd
10759 root 20 0 4321m 1.4g 2920 S 17.1 4.5 1:25.68 glusterfsd
10756 root 20 0 3745m 1.4g 2920 S 16.4 4.6 1:30.05 glusterfsd
9939 root 20 0 4321m 1.4g 2920 S 16.4 4.5 1:27.61 glusterfsd
10775 root 20 0 4961m 1.6g 2920 D 15.8 5.2 1:26.52 glusterfsd
10723 root 20 0 3745m 1.4g 2920 S 15.8 4.6 1:32.41 glusterfsd
10728 root 20 0 34.0g 19g 2920 S 15.8 63.3 1:31.89 glusterfsd
10054 root 20 0 3681m 1.0g 2920 D 15.8 3.3 1:28.10 glusterfsd
10090 root 20 0 3681m 1.0g 2920 S 15.8 3.3 1:33.02 glusterfsd
10789 root 20 0 3681m 1.0g 2920 D 15.8 3.3 1:26.16 glusterfsd
10739 root 20 0 4961m 1.6g 2920 D 15.4 5.2 1:31.29 glusterfsd
10763 root 20 0 4961m 1.6g 2920 S 15.4 5.2 1:27.03 glusterfsd
10727 root 20 0 34.0g 19g 2920 S 15.4 63.3 1:31.35 glusterfsd
10782 root 20 0 34.0g 19g 2920 S 15.4 63.3 1:31.86 glusterfsd
10062 root 20 0 3425m 1.1g 2920 S 15.4 3.5 1:44.85 glusterfsd
10783 root 20 0 3681m 1.0g 2920 D 15.4 3.3 1:26.73 glusterfsd
9940 root 20 0 4321m 1.4g 2920 S 15.4 4.5 1:28.84 glusterfsd
10724 root 20 0 4321m 1.4g 2920 D 15.4 4.5 1:25.27 glusterfsd
10753 root 20 0 4321m 1.4g 2920 S 15.4 4.5 1:26.44 glusterfsd
10733 root 20 0 3745m 1.4g 2920 R 15.1 4.6 1:28.42 glusterfsd
10755 root 20 0 3745m 1.4g 2920 S 15.1 4.6 1:31.19 glusterfsd
10091 root 20 0 34.0g 19g 2920 S 15.1 63.3 1:33.56 glusterfsd
10778 root 20 0 34.0g 19g 2920 S 15.1 63.3 1:31.88 glusterfsd
9894 root 20 0 3681m 1.0g 2920 D 15.1 3.3 1:32.51 glusterfsd
10736 root 20 0 3681m 1.0g 2920 S 15.1 3.3 1:27.33 glusterfsd
10746 root 20 0 4321m 1.4g 2920 D 15.1 4.5 1:25.14 glusterfsd
10744 root 20 0 4961m 1.6g 2920 S 14.8 5.2 1:29.22 glusterfsd
10743 root 20 0 3745m 1.4g 2920 S 14.8 4.6 1:29.96 glusterfsd
10784 root 20 0 34.0g 19g 2920 S 14.8 63.3 1:31.92 glusterfsd
9735 root 20 0 4961m 1.6g 2920 S 14.4 5.2 1:28.84 glusterfsd
9903 root 20 0 4961m 1.6g 2920 S 14.4 5.2 1:28.63 glusterfsd
Attaching the statedumps of the volumes.
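For reference, statedumps like the attached ones are normally triggered through the
gluster CLI (volume name is a placeholder) and are written, by default, under
/var/run/gluster on each brick node:

    # Ask all brick processes of the volume to write a statedump.
    gluster volume statedump dispvol

    # Dumps land on the brick nodes, by default under /var/run/gluster.
    ls -lt /var/run/gluster/*.dump.*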
--- Additional comment from Bhaskarakiran on 2015-03-31 11:18:11 EDT ---
--- Additional comment from Bhaskarakiran on 2015-05-05 01:17:36 EDT ---
On recent builds, bricks and NFS servers are crashing with OOM messages. The
sequence of events is:
1. Client mount hangs.
2. Brick crashes.
3. The volume's export is no longer shown by rpcinfo.
4. NFS server crashes with OOM.
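Step 3 can be checked from any client or from the server itself; a minimal sketch,
with the server hostname as a placeholder:

    # Is the Gluster NFS server still registered with rpcbind?
    rpcinfo -p server1 | grep nfs

    # Is the volume still exported?
    showmount -e server1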
--- Additional comment from Xavier Hernandez on 2015-05-06 12:03:14 EDT ---
I've tried to reproduce this issue with current master and have been unable to.
Do you do anything else besides the add-brick and rebalance?
--- Additional comment from Bhaskarakiran on 2015-05-13 06:14:49 EDT ---
Even with a plain disperse volume and an NFS mount, the issue persists on the 3.7
beta2 build. NFS-mounted the volume, ran iozone -a a couple of times, and the leak
shows up. The process is taking almost 40 GB (a sketch of this kind of run follows
the output below).
14314 root 20 0 17.1g 8.0g 2528 S 20.0 12.7 41:15.49 glusterfsd
14396 root 20 0 17.1g 8.0g 2528 S 19.4 12.7 42:16.27 glusterfsd
14397 root 20 0 17.1g 8.0g 2528 S 19.4 12.7 43:34.59 glusterfsd
14721 root 20 0 17.1g 8.0g 2528 S 19.4 12.7 43:08.11 glusterfsd
14697 root 20 0 17.1g 8.0g 2528 S 19.0 12.7 41:04.22 glusterfsd
14702 root 20 0 17.1g 8.0g 2528 S 19.0 12.7 41:13.08 glusterfsd
14722 root 20 0 17.1g 8.0g 2528 S 19.0 12.7 40:32.11 glusterfsd
14713 root 20 0 65.3g 40g 2528 S 18.7 64.5 40:38.43 glusterfsd
>>>>>>>>>>>>>>>>>>>>>>>
14735 root 20 0 65.3g 40g 2528 S 18.7 64.5 41:52.18 glusterfsd
>>>>>>>>>>>>>>>>>>>>>>>
14392 root 20 0 17.1g 8.0g 2528 S 18.7 12.7 43:33.64 glusterfsd
14704 root 20 0 17.1g 8.0g 2528 S 18.7 12.7 41:59.24 glusterfsd
14714 root 20 0 65.3g 40g 2528 S 18.4 64.5 39:08.16 glusterfsd
>>>>>>>>>>>>>>>>>>>>>>>
14737 root 20 0 65.3g 40g 2528 S 18.4 64.5 41:03.79 glusterfsd
14701 root 20 0 17.1g 8.0g 2528 S 18.4 12.7 41:18.25 glusterfsd
14684 root 20 0 10.3g 4.4g 2532 S 18.4 7.0 38:15.19 glusterfsd
14388 root 20 0 65.3g 40g 2528 S 18.1 64.5 40:20.30 glusterfsd
>>>>>>>>>>>>>>>>>>>>>>>
14716 root 20 0 65.3g 40g 2528 R 18.1 64.5 40:24.51 glusterfsd
>>>>>>>>>>>>>>>>>>>>>>>
14736 root 20 0 65.3g 40g 2528 R 18.1 64.5 38:40.43 glusterfsd
>>>>>>>>>>>>>>>>>>>>>>>
14703 root 20 0 17.1g 8.0g 2528 S 18.1 12.7 41:06.25 glusterfsd
14331 root 20 0 10.3g 4.4g 2532 S 18.1 7.0 38:29.85 glusterfsd
14294 root 20 0 65.3g 40g 2528 R 17.7 64.5 38:03.70 glusterfsd
>>>>>>>>>>>>>>>>>>>>>>>
14395 root 20 0 65.3g 40g 2528 R 17.7 64.5 38:51.38 glusterfsd
>>>>>>>>>>>>>>>>>>>>>>>
14705 root 20 0 17.1g 8.0g 2528 S 17.7 12.7 43:05.49 glusterfsd
14723 root 20 0 17.1g 8.0g 2528 R 17.7 12.7 42:20.05 glusterfsd
14740 root 20 0 17.1g 8.0g 2528 S 17.7 12.7 39:55.02 glusterfsd
14389 root 20 0 10.3g 4.4g 2532 S 17.7 7.0 39:52.06 glusterfsd
14675 root 20 0 10.3g 4.4g 2532 S 17.7 7.0 38:26.46 glusterfsd
14678 root 20 0 65.3g 40g 2528 S 17.4 64.5 40:18.39 glusterfsd
>>>>>>>>>>>>>>>>>>>>>>>
14734 root 20 0 65.3g 40g 2528 S 17.4 64.5 39:07.99 glusterfsd
>>>>>>>>>>>>>>>>>>>>>>>
14328 root 20 0 10.3g 4.4g 2532 S 17.4 7.0 38:01.29 glusterfsd
14393 root 20 0 10.3g 4.4g 2532 S 17.4 7.0 39:14.94 glusterfsd
14683 root 20 0 10.3g 4.4g 2532 S 17.4 7.0 38:10.70 glusterfsd
14696 root 20 0 65.3g 40g 2528 S 17.1 64.5 39:26.60 glusterfsd
>>>>>>>>>>>>>>>>>>>>>>>
14390 root 20 0 17.1g 8.0g 2528 S 17.1 12.7 41:03.34 glusterfsd
14724 root 20 0 17.1g 8.0g 2528 S 17.1 12.7 41:06.26 glusterfsd
14329 root 20 0 10.3g 4.4g 2532 S 17.1 7.0 38:46.04 glusterfsd
14712 root 20 0 10.3g 4.4g 2532 S 17.1 7.0 38:18.10 glusterfsd
14297 root 20 0 65.3g 40g 2528 S 16.7 64.5 40:29.80 glusterfsd
>>>>>>>>>>>>>>>>>>>>>>>
14670 root 20 0 65.3g 40g 2528 S 16.7 64.5 39:24.16 glusterfsd
>>>>>>>>>>>>>>>>>>>>>>>
14700 root 20 0 65.3g 40g 2528 R 16.7 64.5 40:00.28 glusterfsd
>>>>>>>>>>>>>>>>>>>>>>>
14715 root 20 0 65.3g 40g 2528 S 16.7 64.5 40:53.39 glusterfsd
>>>>>>>>>>>>>>>>>>>>>>>
14311 root 20 0 17.1g 8.0g 2528 S 16.7 12.7 39:05.23 glusterfsd
14706 root 20 0 10.3g 4.4g 2532 S 16.7 7.0 37:28.30 glusterfsd
14707 root 20 0 10.3g 4.4g 2532 S 16.7 7.0 37:52.83 glusterfsd
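For completeness, a sketch of the kind of NFS + iozone run described in the previous
comment (server, volume and mount point are placeholders), plus a simple way to watch
brick memory between runs:

    # Mount the volume over Gluster NFS (NFSv3).
    mount -t nfs -o vers=3 server1:/dispvol /mnt/nfs

    # Run the full automatic iozone test a couple of times.
    cd /mnt/nfs
    iozone -a
    iozone -a

    # Check resident memory (RES) of the brick processes after each run.
    top -b -n 1 | grep glusterfsd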
--- Additional comment from Xavier Hernandez on 2015-05-14 03:18:35 EDT ---
Thanks, I'll try again with NFS and iozone.
--- Additional comment from Anuradha on 2015-05-22 07:37:03 EDT ---
Bhaskarakiran,
Do you have sos-reports corresponding to the attached statedumps? I need to go
through the logs to understand the state of the system.
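As a general note (not a command taken from this report), sos-reports are typically
gathered by running sosreport on each node and attaching the resulting archives:

    # Collect system and GlusterFS logs into a single archive (run on each node).
    sosreport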
--- Additional comment from Anand Avati on 2015-06-02 07:24:57 EDT ---
REVIEW: http://review.gluster.org/11044 (fd: Do fd_bind on successful open)
posted (#1) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)
--- Additional comment from Pranith Kumar K on 2015-06-02 07:25:46 EDT ---
This patch only fixes the wrong fd_count shown in the statedump, which was caused by
fd_binds not happening. Still looking into further fd leaks.
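One way to cross-check the fd situation from outside (the PID is taken from the top
output above; the volume name is a placeholder):

    # Kernel view: open descriptors held by one brick process.
    ls /proc/14713/fd | wc -l

    # Statedump view: request an fd-only dump and look at the newest file
    # under /var/run/gluster on that brick node.
    gluster volume statedump dispvol fd
    ls -t /var/run/gluster/*.dump.* | head -1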
--- Additional comment from Anand Avati on 2015-06-02 07:50:04 EDT ---
REVIEW: http://review.gluster.org/11044 (fd: Do fd_bind on successful open)
posted (#2) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)
--- Additional comment from Anand Avati on 2015-06-02 08:29:01 EDT ---
REVIEW: http://review.gluster.org/11045 (features/quota: Fix ref-leak) posted
(#1) for review on master by Pranith Kumar Karampuri (pkarampu at redhat.com)
--- Additional comment from Anand Avati on 2015-06-04 00:42:30 EDT ---
COMMIT: http://review.gluster.org/11045 committed in master by Raghavendra G
(rgowdapp at redhat.com)
------
commit 2b7ae84a5feb636f0e41d0ab36c04b7f3fbce520
Author: Pranith Kumar K <pkarampu at redhat.com>
Date: Tue Jun 2 17:58:00 2015 +0530
features/quota: Fix ref-leak
Change-Id: I0b44b70f07be441e044d9dfc5c2b64bd5b4cac18
BUG: 1207735
Signed-off-by: Pranith Kumar K <pkarampu at redhat.com>
Reviewed-on: http://review.gluster.org/11045
Tested-by: Gluster Build System <jenkins at build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp at redhat.com>
Tested-by: Raghavendra G <rgowdapp at redhat.com>
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1186580
[Bug 1186580] QE tracker bug for Everglades
https://bugzilla.redhat.com/show_bug.cgi?id=1207735
[Bug 1207735] Disperse volume: Huge memory leak of glusterfsd process
https://bugzilla.redhat.com/show_bug.cgi?id=1224177
[Bug 1224177] Disperse volume: Huge memory leak of glusterfsd process