[Bugs] [Bug 1224177] New: Disperse volume: Huge memory leak of glusterfsd process

bugzilla at redhat.com bugzilla at redhat.com
Fri May 22 10:04:30 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1224177

            Bug ID: 1224177
           Summary: Disperse volume: Huge memory leak of glusterfsd
                    process
           Product: Red Hat Gluster Storage
           Version: 3.1
         Component: glusterfs
     Sub Component: disperse
          Keywords: Triaged
          Severity: urgent
          Priority: high
          Assignee: rhs-bugs at redhat.com
          Reporter: byarlaga at redhat.com
        QA Contact: byarlaga at redhat.com
                CC: atalur at redhat.com, bugs at gluster.org,
                    byarlaga at redhat.com, gluster-bugs at redhat.com,
                    nsathyan at redhat.com, pkarampu at redhat.com,
                    rkavunga at redhat.com, xhernandez at datalab.es
        Depends On: 1207735
            Blocks: 1186580 (qe_tracker_everglades)
             Group: redhat



+++ This bug was initially created as a clone of Bug #1207735 +++

Description of problem:
=======================
There is a huge memory leak in the glusterfsd process on a disperse volume. Created
a plain disperse volume and converted it to distributed-disperse. There is no I/O
from the clients, yet the resident memory of the glusterfsd processes reaches up to
20 GB as seen in top, and the system becomes unresponsive once all of the memory is
consumed.
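
For reference, a minimal reproduction sketch using the standard gluster CLI; the
volume name, brick hosts and paths are hypothetical placeholders, and the 4+2
disperse layout is only an example:

    # create a plain disperse volume (4 data + 2 redundancy bricks)
    gluster volume create dispvol disperse 6 redundancy 2 \
        server{1..6}:/bricks/brick1/dispvol
    gluster volume start dispvol

    # convert it to distributed-disperse by adding a second disperse set
    gluster volume add-brick dispvol server{1..6}:/bricks/brick2/dispvol
    gluster volume rebalance dispvol start

    # watch resident memory of the brick processes
    top -b -n 1 | grep glusterfsd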

Version-Release number of selected component (if applicable):
=============================================================
[root at vertigo geo-master]# gluster --version
glusterfs 3.7dev built on Mar 31 2015 01:05:54
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General
Public License.

Additional info:
================
Top output of node1:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
 9902 root      20   0 4321m 1.4g 2920 D 20.0  4.5   1:28.68 glusterfsd         
10758 root      20   0 4321m 1.4g 2920 D 18.4  4.5   1:26.33 glusterfsd         
10053 root      20   0 4961m 1.6g 2920 D 18.1  5.2   1:28.64 glusterfsd         
10729 root      20   0 3681m 1.0g 2920 D 17.1  3.3   1:26.60 glusterfsd         
10759 root      20   0 4321m 1.4g 2920 S 17.1  4.5   1:25.68 glusterfsd         
10756 root      20   0 3745m 1.4g 2920 S 16.4  4.6   1:30.05 glusterfsd         
 9939 root      20   0 4321m 1.4g 2920 S 16.4  4.5   1:27.61 glusterfsd         
10775 root      20   0 4961m 1.6g 2920 D 15.8  5.2   1:26.52 glusterfsd         
10723 root      20   0 3745m 1.4g 2920 S 15.8  4.6   1:32.41 glusterfsd         
10728 root      20   0 34.0g  19g 2920 S 15.8 63.3   1:31.89 glusterfsd         
10054 root      20   0 3681m 1.0g 2920 D 15.8  3.3   1:28.10 glusterfsd         
10090 root      20   0 3681m 1.0g 2920 S 15.8  3.3   1:33.02 glusterfsd         
10789 root      20   0 3681m 1.0g 2920 D 15.8  3.3   1:26.16 glusterfsd         
10739 root      20   0 4961m 1.6g 2920 D 15.4  5.2   1:31.29 glusterfsd         
10763 root      20   0 4961m 1.6g 2920 S 15.4  5.2   1:27.03 glusterfsd         
10727 root      20   0 34.0g  19g 2920 S 15.4 63.3   1:31.35 glusterfsd         
10782 root      20   0 34.0g  19g 2920 S 15.4 63.3   1:31.86 glusterfsd         
10062 root      20   0 3425m 1.1g 2920 S 15.4  3.5   1:44.85 glusterfsd         
10783 root      20   0 3681m 1.0g 2920 D 15.4  3.3   1:26.73 glusterfsd         
 9940 root      20   0 4321m 1.4g 2920 S 15.4  4.5   1:28.84 glusterfsd         
10724 root      20   0 4321m 1.4g 2920 D 15.4  4.5   1:25.27 glusterfsd         
10753 root      20   0 4321m 1.4g 2920 S 15.4  4.5   1:26.44 glusterfsd         
10733 root      20   0 3745m 1.4g 2920 R 15.1  4.6   1:28.42 glusterfsd         
10755 root      20   0 3745m 1.4g 2920 S 15.1  4.6   1:31.19 glusterfsd         
10091 root      20   0 34.0g  19g 2920 S 15.1 63.3   1:33.56 glusterfsd         
10778 root      20   0 34.0g  19g 2920 S 15.1 63.3   1:31.88 glusterfsd         
 9894 root      20   0 3681m 1.0g 2920 D 15.1  3.3   1:32.51 glusterfsd         
10736 root      20   0 3681m 1.0g 2920 S 15.1  3.3   1:27.33 glusterfsd         
10746 root      20   0 4321m 1.4g 2920 D 15.1  4.5   1:25.14 glusterfsd         
10744 root      20   0 4961m 1.6g 2920 S 14.8  5.2   1:29.22 glusterfsd         
10743 root      20   0 3745m 1.4g 2920 S 14.8  4.6   1:29.96 glusterfsd         
10784 root      20   0 34.0g  19g 2920 S 14.8 63.3   1:31.92 glusterfsd         
 9735 root      20   0 4961m 1.6g 2920 S 14.4  5.2   1:28.84 glusterfsd         
 9903 root      20   0 4961m 1.6g 2920 S 14.4  5.2   1:28.63 glusterfsd    

Attaching the statedumps of the volumes.
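
For anyone comparing the dumps, a sketch of how they can be generated and scanned
for growing allocations; the default dump directory and the volume name below are
assumptions:

    # ask every brick of the volume to dump its state
    gluster volume statedump dispvol

    # dumps usually land under /var/run/gluster on each brick host;
    # allocation counters that keep growing between dumps point at the leak
    grep -E 'num_allocs|hot-count' /var/run/gluster/*.dump.* | head -40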

--- Additional comment from Bhaskarakiran on 2015-03-31 11:18:11 EDT ---



--- Additional comment from Bhaskarakiran on 2015-05-05 01:17:36 EDT ---

On recent builds, bricks and NFS servers are crashing with OOM messages. The
sequence of events is:

1. The client mount hangs.
2. A brick crashes.
3. The export of the volume is no longer shown by rpcinfo.
4. The NFS server crashes with OOM (a quick check is sketched below).
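
A quick way to confirm steps 3 and 4 on a brick/NFS host, using only generic
commands (nothing gluster-specific is assumed here):

    # did the kernel OOM-kill the brick or NFS process?
    dmesg | grep -iE 'out of memory|oom|killed process'

    # is the volume still exported by the gluster NFS server?
    rpcinfo -p localhost | grep -E 'nfs|mountd'
    showmount -e localhost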

--- Additional comment from Xavier Hernandez on 2015-05-06 12:03:14 EDT ---

I've tried to reproduce this issue with current master and I've been unable to.

Do you do anything else besides the add-brick and rebalance?

--- Additional comment from Bhaskarakiran on 2015-05-13 06:14:49 EDT ---

Even with a plain disperse volume and an NFS mount, the issue persists on the 3.7
beta2 build. NFS mounted the volume, ran iozone -a a couple of times, and the leak
shows up again. The process is taking almost 40 GB.
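
A sketch of that reproduction path; the server name, volume name and mount point
are placeholders, and iozone -a is the full automatic mode referred to above:

    # mount the disperse volume over gluster NFS (NFSv3)
    mount -t nfs -o vers=3,nolock server1:/dispvol /mnt/dispvol

    # run a couple of full automatic iozone passes on the mount
    cd /mnt/dispvol && iozone -a && iozone -a

    # check resident memory of the brick processes after each pass
    ps -o pid,rss,cmd -C glusterfsd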

14314 root      20   0 17.1g 8.0g 2528 S 20.0 12.7  41:15.49 glusterfsd         
14396 root      20   0 17.1g 8.0g 2528 S 19.4 12.7  42:16.27 glusterfsd         
14397 root      20   0 17.1g 8.0g 2528 S 19.4 12.7  43:34.59 glusterfsd         
14721 root      20   0 17.1g 8.0g 2528 S 19.4 12.7  43:08.11 glusterfsd         
14697 root      20   0 17.1g 8.0g 2528 S 19.0 12.7  41:04.22 glusterfsd         
14702 root      20   0 17.1g 8.0g 2528 S 19.0 12.7  41:13.08 glusterfsd         
14722 root      20   0 17.1g 8.0g 2528 S 19.0 12.7  40:32.11 glusterfsd         
14713 root      20   0 65.3g  40g 2528 S 18.7 64.5  40:38.43 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14735 root      20   0 65.3g  40g 2528 S 18.7 64.5  41:52.18 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14392 root      20   0 17.1g 8.0g 2528 S 18.7 12.7  43:33.64 glusterfsd         
14704 root      20   0 17.1g 8.0g 2528 S 18.7 12.7  41:59.24 glusterfsd         
14714 root      20   0 65.3g  40g 2528 S 18.4 64.5  39:08.16 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14737 root      20   0 65.3g  40g 2528 S 18.4 64.5  41:03.79 glusterfsd         
14701 root      20   0 17.1g 8.0g 2528 S 18.4 12.7  41:18.25 glusterfsd         
14684 root      20   0 10.3g 4.4g 2532 S 18.4  7.0  38:15.19 glusterfsd         
14388 root      20   0 65.3g  40g 2528 S 18.1 64.5  40:20.30 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14716 root      20   0 65.3g  40g 2528 R 18.1 64.5  40:24.51 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14736 root      20   0 65.3g  40g 2528 R 18.1 64.5  38:40.43 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14703 root      20   0 17.1g 8.0g 2528 S 18.1 12.7  41:06.25 glusterfsd         
14331 root      20   0 10.3g 4.4g 2532 S 18.1  7.0  38:29.85 glusterfsd         
14294 root      20   0 65.3g  40g 2528 R 17.7 64.5  38:03.70 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14395 root      20   0 65.3g  40g 2528 R 17.7 64.5  38:51.38 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14705 root      20   0 17.1g 8.0g 2528 S 17.7 12.7  43:05.49 glusterfsd         
14723 root      20   0 17.1g 8.0g 2528 R 17.7 12.7  42:20.05 glusterfsd         
14740 root      20   0 17.1g 8.0g 2528 S 17.7 12.7  39:55.02 glusterfsd         
14389 root      20   0 10.3g 4.4g 2532 S 17.7  7.0  39:52.06 glusterfsd         
14675 root      20   0 10.3g 4.4g 2532 S 17.7  7.0  38:26.46 glusterfsd         
14678 root      20   0 65.3g  40g 2528 S 17.4 64.5  40:18.39 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14734 root      20   0 65.3g  40g 2528 S 17.4 64.5  39:07.99 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14328 root      20   0 10.3g 4.4g 2532 S 17.4  7.0  38:01.29 glusterfsd         
14393 root      20   0 10.3g 4.4g 2532 S 17.4  7.0  39:14.94 glusterfsd         
14683 root      20   0 10.3g 4.4g 2532 S 17.4  7.0  38:10.70 glusterfsd         
14696 root      20   0 65.3g  40g 2528 S 17.1 64.5  39:26.60 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14390 root      20   0 17.1g 8.0g 2528 S 17.1 12.7  41:03.34 glusterfsd         
14724 root      20   0 17.1g 8.0g 2528 S 17.1 12.7  41:06.26 glusterfsd         
14329 root      20   0 10.3g 4.4g 2532 S 17.1  7.0  38:46.04 glusterfsd         
14712 root      20   0 10.3g 4.4g 2532 S 17.1  7.0  38:18.10 glusterfsd         
14297 root      20   0 65.3g  40g 2528 S 16.7 64.5  40:29.80 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14670 root      20   0 65.3g  40g 2528 S 16.7 64.5  39:24.16 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14700 root      20   0 65.3g  40g 2528 R 16.7 64.5  40:00.28 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14715 root      20   0 65.3g  40g 2528 S 16.7 64.5  40:53.39 glusterfsd  >>>>>>>>>>>>>>>>>>>>>>>
14311 root      20   0 17.1g 8.0g 2528 S 16.7 12.7  39:05.23 glusterfsd         
14706 root      20   0 10.3g 4.4g 2532 S 16.7  7.0  37:28.30 glusterfsd         
14707 root      20   0 10.3g 4.4g 2532 S 16.7  7.0  37:52.83 glusterfsd

--- Additional comment from Xavier Hernandez on 2015-05-14 03:18:35 EDT ---

Thanks, I'll try again with NFS and iozone.


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1186580
[Bug 1186580] QE tracker bug for Everglades
https://bugzilla.redhat.com/show_bug.cgi?id=1207735
[Bug 1207735] Disperse volume: Huge memory leak of glusterfsd process