[Bugs] [Bug 1229282] New: Disperse volume: Huge memory leak of glusterfsd process

bugzilla at redhat.com
Mon Jun 8 11:12:17 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1229282

            Bug ID: 1229282
           Summary: Disperse volume: Huge memory leak of glusterfsd
                    process
           Product: GlusterFS
           Version: 3.7.0
         Component: quota
          Keywords: Triaged
          Severity: urgent
          Priority: high
          Assignee: bugs at gluster.org
          Reporter: vmallika at redhat.com
                CC: bugs at gluster.org, byarlaga at redhat.com,
                    gluster-bugs at redhat.com, nsathyan at redhat.com,
                    pkarampu at redhat.com, rkavunga at redhat.com,
                    vmallika at redhat.com, xhernandez at datalab.es
        Depends On: 1207735
            Blocks: 1186580 (qe_tracker_everglades), 1224177



+++ This bug was initially created as a clone of Bug #1207735 +++

Description of problem:
=======================
There is a huge memory leak in the glusterfsd process with a disperse volume. I
created a plain disperse volume and converted it to distributed-disperse. There
is no IO from the clients, yet the resident memory of the glusterfsd processes
reaches up to 20 GB (as seen in top) and the system becomes unresponsive once
the whole memory is consumed.
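
For reference, the reproduction amounts to roughly the following sketch; the
volume name, hostnames and brick paths are placeholders, not the exact ones
used here:

# create a plain disperse volume and start it
gluster volume create dispvol disperse 3 redundancy 1 \
    server{1..3}:/bricks/brick1 force
gluster volume start dispvol

# convert it to distributed-disperse by adding a second disperse subvolume
gluster volume add-brick dispvol server{1..3}:/bricks/brick2 force
gluster volume rebalance dispvol start

# no client IO; just watch the resident memory of the brick processes grow
top -b -n 1 | grep glusterfsd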

Version-Release number of selected component (if applicable):
=============================================================
[root@vertigo geo-master]# gluster --version
glusterfs 3.7dev built on Mar 31 2015 01:05:54
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General
Public License.

Additional info:
================
Top output of node1:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
 9902 root      20   0 4321m 1.4g 2920 D 20.0  4.5   1:28.68 glusterfsd         
10758 root      20   0 4321m 1.4g 2920 D 18.4  4.5   1:26.33 glusterfsd         
10053 root      20   0 4961m 1.6g 2920 D 18.1  5.2   1:28.64 glusterfsd         
10729 root      20   0 3681m 1.0g 2920 D 17.1  3.3   1:26.60 glusterfsd         
10759 root      20   0 4321m 1.4g 2920 S 17.1  4.5   1:25.68 glusterfsd         
10756 root      20   0 3745m 1.4g 2920 S 16.4  4.6   1:30.05 glusterfsd         
 9939 root      20   0 4321m 1.4g 2920 S 16.4  4.5   1:27.61 glusterfsd         
10775 root      20   0 4961m 1.6g 2920 D 15.8  5.2   1:26.52 glusterfsd         
10723 root      20   0 3745m 1.4g 2920 S 15.8  4.6   1:32.41 glusterfsd         
10728 root      20   0 34.0g  19g 2920 S 15.8 63.3   1:31.89 glusterfsd         
10054 root      20   0 3681m 1.0g 2920 D 15.8  3.3   1:28.10 glusterfsd         
10090 root      20   0 3681m 1.0g 2920 S 15.8  3.3   1:33.02 glusterfsd         
10789 root      20   0 3681m 1.0g 2920 D 15.8  3.3   1:26.16 glusterfsd         
10739 root      20   0 4961m 1.6g 2920 D 15.4  5.2   1:31.29 glusterfsd         
10763 root      20   0 4961m 1.6g 2920 S 15.4  5.2   1:27.03 glusterfsd         
10727 root      20   0 34.0g  19g 2920 S 15.4 63.3   1:31.35 glusterfsd         
10782 root      20   0 34.0g  19g 2920 S 15.4 63.3   1:31.86 glusterfsd         
10062 root      20   0 3425m 1.1g 2920 S 15.4  3.5   1:44.85 glusterfsd         
10783 root      20   0 3681m 1.0g 2920 D 15.4  3.3   1:26.73 glusterfsd         
 9940 root      20   0 4321m 1.4g 2920 S 15.4  4.5   1:28.84 glusterfsd         
10724 root      20   0 4321m 1.4g 2920 D 15.4  4.5   1:25.27 glusterfsd         
10753 root      20   0 4321m 1.4g 2920 S 15.4  4.5   1:26.44 glusterfsd         
10733 root      20   0 3745m 1.4g 2920 R 15.1  4.6   1:28.42 glusterfsd         
10755 root      20   0 3745m 1.4g 2920 S 15.1  4.6   1:31.19 glusterfsd         
10091 root      20   0 34.0g  19g 2920 S 15.1 63.3   1:33.56 glusterfsd         
10778 root      20   0 34.0g  19g 2920 S 15.1 63.3   1:31.88 glusterfsd         
 9894 root      20   0 3681m 1.0g 2920 D 15.1  3.3   1:32.51 glusterfsd         
10736 root      20   0 3681m 1.0g 2920 S 15.1  3.3   1:27.33 glusterfsd         
10746 root      20   0 4321m 1.4g 2920 D 15.1  4.5   1:25.14 glusterfsd         
10744 root      20   0 4961m 1.6g 2920 S 14.8  5.2   1:29.22 glusterfsd         
10743 root      20   0 3745m 1.4g 2920 S 14.8  4.6   1:29.96 glusterfsd         
10784 root      20   0 34.0g  19g 2920 S 14.8 63.3   1:31.92 glusterfsd         
 9735 root      20   0 4961m 1.6g 2920 S 14.4  5.2   1:28.84 glusterfsd         
 9903 root      20   0 4961m 1.6g 2920 S 14.4  5.2   1:28.63 glusterfsd    

Attaching the statedumps of the volumes.
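
The statedumps can be regenerated with the standard statedump command (the
volume name below is a placeholder; the dump files land in /var/run/gluster by
default):

gluster volume statedump dispvol
ls -lh /var/run/gluster/*.dump.*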

--- Additional comment from Bhaskarakiran on 2015-03-31 11:18:11 EDT ---



--- Additional comment from Bhaskarakiran on 2015-05-05 01:17:36 EDT ---

On recent builds, bricks and NFS servers are crashing with OOM messages. The
sequence of events is as follows (each step can be checked roughly as sketched
after the list):

1. The client mount hangs.
2. A brick crashes.
3. The export of the volume is no longer shown by rpcinfo.
4. The NFS server crashes with OOM.
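
Rough checks for each step (server, volume and mount point names are
placeholders):

stat /mnt/dispvol                   # 1. hangs if the client mount is wedged
gluster volume status dispvol       # 2. shows which brick processes went offline
rpcinfo -p server1 | grep nfs       # 3. NFS program no longer registered
showmount -e server1                # 3. volume no longer listed in the exports
dmesg | grep -i 'out of memory'     # 4. OOM-killer messages on the server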

--- Additional comment from Xavier Hernandez on 2015-05-06 12:03:14 EDT ---

I've tried to reproduce this issue with current master, but I've been unable to.

Do you do anything else besides the add-brick and rebalance?

--- Additional comment from Bhaskarakiran on 2015-05-13 06:14:49 EDT ---

Even with a plain disperse volume and an NFS mount, the issue persists on the
3.7 beta2 build. I NFS-mounted the volume, ran iozone -a a couple of times, and
am seeing the leak. The process is taking almost 40 GB.

14314 root      20   0 17.1g 8.0g 2528 S 20.0 12.7  41:15.49 glusterfsd         
14396 root      20   0 17.1g 8.0g 2528 S 19.4 12.7  42:16.27 glusterfsd         
14397 root      20   0 17.1g 8.0g 2528 S 19.4 12.7  43:34.59 glusterfsd         
14721 root      20   0 17.1g 8.0g 2528 S 19.4 12.7  43:08.11 glusterfsd         
14697 root      20   0 17.1g 8.0g 2528 S 19.0 12.7  41:04.22 glusterfsd         
14702 root      20   0 17.1g 8.0g 2528 S 19.0 12.7  41:13.08 glusterfsd         
14722 root      20   0 17.1g 8.0g 2528 S 19.0 12.7  40:32.11 glusterfsd         
14713 root      20   0 65.3g  40g 2528 S 18.7 64.5  40:38.43 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14735 root      20   0 65.3g  40g 2528 S 18.7 64.5  41:52.18 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14392 root      20   0 17.1g 8.0g 2528 S 18.7 12.7  43:33.64 glusterfsd         
14704 root      20   0 17.1g 8.0g 2528 S 18.7 12.7  41:59.24 glusterfsd         
14714 root      20   0 65.3g  40g 2528 S 18.4 64.5  39:08.16 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14737 root      20   0 65.3g  40g 2528 S 18.4 64.5  41:03.79 glusterfsd         
14701 root      20   0 17.1g 8.0g 2528 S 18.4 12.7  41:18.25 glusterfsd         
14684 root      20   0 10.3g 4.4g 2532 S 18.4  7.0  38:15.19 glusterfsd         
14388 root      20   0 65.3g  40g 2528 S 18.1 64.5  40:20.30 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14716 root      20   0 65.3g  40g 2528 R 18.1 64.5  40:24.51 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14736 root      20   0 65.3g  40g 2528 R 18.1 64.5  38:40.43 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14703 root      20   0 17.1g 8.0g 2528 S 18.1 12.7  41:06.25 glusterfsd         
14331 root      20   0 10.3g 4.4g 2532 S 18.1  7.0  38:29.85 glusterfsd         
14294 root      20   0 65.3g  40g 2528 R 17.7 64.5  38:03.70 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14395 root      20   0 65.3g  40g 2528 R 17.7 64.5  38:51.38 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14705 root      20   0 17.1g 8.0g 2528 S 17.7 12.7  43:05.49 glusterfsd         
14723 root      20   0 17.1g 8.0g 2528 R 17.7 12.7  42:20.05 glusterfsd         
14740 root      20   0 17.1g 8.0g 2528 S 17.7 12.7  39:55.02 glusterfsd         
14389 root      20   0 10.3g 4.4g 2532 S 17.7  7.0  39:52.06 glusterfsd         
14675 root      20   0 10.3g 4.4g 2532 S 17.7  7.0  38:26.46 glusterfsd         
14678 root      20   0 65.3g  40g 2528 S 17.4 64.5  40:18.39 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14734 root      20   0 65.3g  40g 2528 S 17.4 64.5  39:07.99 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14328 root      20   0 10.3g 4.4g 2532 S 17.4  7.0  38:01.29 glusterfsd         
14393 root      20   0 10.3g 4.4g 2532 S 17.4  7.0  39:14.94 glusterfsd         
14683 root      20   0 10.3g 4.4g 2532 S 17.4  7.0  38:10.70 glusterfsd         
14696 root      20   0 65.3g  40g 2528 S 17.1 64.5  39:26.60 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14390 root      20   0 17.1g 8.0g 2528 S 17.1 12.7  41:03.34 glusterfsd         
14724 root      20   0 17.1g 8.0g 2528 S 17.1 12.7  41:06.26 glusterfsd         
14329 root      20   0 10.3g 4.4g 2532 S 17.1  7.0  38:46.04 glusterfsd         
14712 root      20   0 10.3g 4.4g 2532 S 17.1  7.0  38:18.10 glusterfsd         
14297 root      20   0 65.3g  40g 2528 S 16.7 64.5  40:29.80 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14670 root      20   0 65.3g  40g 2528 S 16.7 64.5  39:24.16 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14700 root      20   0 65.3g  40g 2528 R 16.7 64.5  40:00.28 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14715 root      20   0 65.3g  40g 2528 S 16.7 64.5  40:53.39 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14311 root      20   0 17.1g 8.0g 2528 S 16.7 12.7  39:05.23 glusterfsd         
14706 root      20   0 10.3g 4.4g 2532 S 16.7  7.0  37:28.30 glusterfsd         
14707 root      20   0 10.3g 4.4g 2532 S 16.7  7.0  37:52.83 glusterfsd
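
For reference, the NFS reproduction described above amounts to roughly the
following (server, volume and mount point are placeholders; gluster's built-in
NFS server speaks NFSv3):

mount -t nfs -o vers=3,nolock server1:/dispvol /mnt/dispvol
cd /mnt/dispvol
iozone -a                               # full automatic mode, repeated a few times
watch 'top -b -n 1 | grep glusterfsd'   # resident memory keeps climbing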

--- Additional comment from Xavier Hernandez on 2015-05-14 03:18:35 EDT ---

Thanks, I'll try again with NFS and iozone.

--- Additional comment from Anuradha on 2015-05-22 07:37:03 EDT ---

Bhaskarakiran,

Do you have sos-reports corresponding to the attached statedump? I need to go
through the logs to understand the state of the system.

--- Additional comment from Anand Avati on 2015-06-02 07:24:57 EDT ---

REVIEW: http://review.gluster.org/11044 (fd: Do fd_bind on successful open)
posted (#1) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)

--- Additional comment from Pranith Kumar K on 2015-06-02 07:25:46 EDT ---

This patch only fixes the wrong fd_count shown in the statedump, which happened
because fd_binds were not being done. Still looking into more fd leaks.
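
A quick way to cross-check the open fds outside the statedump (run as root on a
brick node):

for pid in $(pgrep glusterfsd); do
    echo "$pid: $(ls /proc/$pid/fd | wc -l) open fds"
done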

--- Additional comment from Anand Avati on 2015-06-02 07:50:04 EDT ---

REVIEW: http://review.gluster.org/11044 (fd: Do fd_bind on successful open)
posted (#2) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)

--- Additional comment from Anand Avati on 2015-06-02 08:29:01 EDT ---

REVIEW: http://review.gluster.org/11045 (features/quota: Fix ref-leak) posted
(#1) for review on master by Pranith Kumar Karampuri (pkarampu at redhat.com)

--- Additional comment from Anand Avati on 2015-06-04 00:42:30 EDT ---

COMMIT: http://review.gluster.org/11045 committed in master by Raghavendra G
(rgowdapp at redhat.com) 
------
commit 2b7ae84a5feb636f0e41d0ab36c04b7f3fbce520
Author: Pranith Kumar K <pkarampu at redhat.com>
Date:   Tue Jun 2 17:58:00 2015 +0530

    features/quota: Fix ref-leak

    Change-Id: I0b44b70f07be441e044d9dfc5c2b64bd5b4cac18
    BUG: 1207735
    Signed-off-by: Pranith Kumar K <pkarampu at redhat.com>
    Reviewed-on: http://review.gluster.org/11045
    Tested-by: Gluster Build System <jenkins at build.gluster.com>
    Reviewed-by: Raghavendra G <rgowdapp at redhat.com>
    Tested-by: Raghavendra G <rgowdapp at redhat.com>


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1186580
[Bug 1186580] QE tracker bug for Everglades
https://bugzilla.redhat.com/show_bug.cgi?id=1207735
[Bug 1207735] Disperse volume: Huge memory leak of glusterfsd process
https://bugzilla.redhat.com/show_bug.cgi?id=1224177
[Bug 1224177] Disperse volume: Huge memory leak of glusterfsd process
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.