[Bugs] [Bug 1144792] New: Very high memory usage during rebalance

bugzilla at redhat.com bugzilla at redhat.com
Sun Sep 21 02:05:54 UTC 2014


https://bugzilla.redhat.com/show_bug.cgi?id=1144792

            Bug ID: 1144792
           Summary: Very high memory usage during rebalance
           Product: GlusterFS
           Version: 3.4.5
         Component: distribute
          Severity: medium
          Assignee: gluster-bugs at redhat.com
          Reporter: kdhananj at redhat.com
                CC: bugs at gluster.org, kdhananj at redhat.com,
                    rgowdapp at redhat.com, ryan.clough at dsic.com
        Depends On: 1142052
            Blocks: 1144640



+++ This bug was initially created as a clone of Bug #1142052 +++

Description of problem:
Rebalance has been running for about 7 days on a 2-node, 2-brick, 52TB
distributed volume. Memory usage has grown slowly over time until it
consumed all available physical memory. The OOM killer stopped the last
rebalance, and after restarting all Gluster processes I am attempting it
again. Running "sync; echo 3 > /proc/sys/vm/drop_caches" does not lower
memory consumption.
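
For reference, growth of the rebalance process' resident memory can be
tracked over time with a small loop like the one below. This is only a
sketch: the process-selection pattern (matching the "rebalance-cmd"
xlator option visible on the process command line) and the one-minute
interval are illustrative.

# Log a timestamp and the RSS (kB) of the rebalance glusterfs process
while true; do
    pid=$(pgrep -f 'glusterfs.*rebalance-cmd' | head -n 1)
    rss=$(awk '/VmRSS/ {print $2}' /proc/"$pid"/status 2>/dev/null)
    echo "$(date +%s) ${rss:-gone}"
    sleep 60
done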

Version-Release number of selected component (if applicable):
[root at hgluster01 ~]# glusterfs --version
glusterfs 3.5.2 built on Jul 31 2014 18:47:52

How reproducible:
So far, I have been unable to complete a rebalance.

Steps to Reproduce:
1. Start rebalance
gluster volume rebalance export_volume start
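
Progress (and the eventual failure) can be checked with the standard
status command:

gluster volume rebalance export_volume status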

Actual results:
The glusterfs rebalance process consumes all available memory and is
eventually killed by the OOM killer.

Expected results:
Rebalance, including fix-layout, completes without consuming all
available memory.

Additional info:
[root at hgluster01 ~]# gluster volume status export_volume detail
Status of volume: export_volume
------------------------------------------------------------------------------
Brick                : Brick hgluster01:/gluster_data
Port                 : 49152               
Online               : Y                   
Pid                  : 2438                
File System          : xfs                 
Device               : /dev/mapper/vg_data-lv_data
Mount Options        :
rw,noatime,nodiratime,logbufs=8,logbsize=256k,inode64,nobarrier
Inode Size           : 512                 
Disk Space Free      : 12.3TB              
Total Disk Space     : 27.3TB              
Inode Count          : 2929685696          
Free Inodes          : 2839872616          
------------------------------------------------------------------------------
Brick                : Brick hgluster02:/gluster_data
Port                 : 49152               
Online               : Y                   
Pid                  : 2467                
File System          : xfs                 
Device               : /dev/mapper/vg_data-lv_data
Mount Options        :
rw,noatime,nodiratime,logbufs=8,logbsize=256k,inode64,nobarrier
Inode Size           : 512                 
Disk Space Free      : 12.4TB              
Total Disk Space     : 27.3TB              
Inode Count          : 2929685696          
Free Inodes          : 2839847441

[root at hgluster01 ~]# gluster volume info

Volume Name: export_volume
Type: Distribute
Volume ID: c74cc970-31e2-4924-a244-4c70d958dadb
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: hgluster01:/gluster_data
Brick2: hgluster02:/gluster_data
Options Reconfigured:
performance.stat-prefetch: on
performance.write-behind: on
performance.flush-behind: on
features.quota-deem-statfs: on
performance.quick-read: on
performance.client-io-threads: on
performance.read-ahead: on
performance.io-thread-count: 24
features.quota: on
cluster.eager-lock: on
nfs.disable: on
auth.allow: 192.168.10.*,10.0.10.*,10.8.0.*
server.allow-insecure: on
performance.write-behind-window-size: 4MB
network.ping-timeout: 60
features.quota-timeout: 10
performance.io-cache: off

[root at hgluster01 ~]# cat /proc/meminfo 
MemTotal:       32844100 kB
MemFree:         2148772 kB
Buffers:           14184 kB
Cached:            35600 kB
SwapCached:       204288 kB
Active:         24682388 kB
Inactive:        3315448 kB
Active(anon):   24660896 kB
Inactive(anon):  3289292 kB
Active(file):      21492 kB
Inactive(file):    26156 kB
Unevictable:       12728 kB
Mlocked:            4552 kB
SwapTotal:      16490488 kB
SwapFree:       15077012 kB
Dirty:                32 kB
Writeback:             0 kB
AnonPages:      27761596 kB
Mapped:             9168 kB
Shmem:                 4 kB
Slab:             544552 kB
SReclaimable:     273636 kB
SUnreclaim:       270916 kB
KernelStack:        4800 kB
PageTables:        60592 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    32912536 kB
Committed_AS:   29529576 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      345236 kB
VmallocChunk:   34340927412 kB
HardwareCorrupted:     0 kB
AnonHugePages:  17307648 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        4728 kB
DirectMap2M:     2058240 kB
DirectMap1G:    31457280 kB

[root at hgluster01 ~]# pmap -x 2627
2627:   /usr/sbin/glusterfs -s localhost --volfile-id export_volume
--xlator-option *dht.use-readdirp=yes --xlator-option *dht.lookup-unhashed=yes
--xlator-option *dht.assert-no-child-down=yes --xlator-option
*replicate*.data-self-heal=off --xlator-option
*replicate*.metadata-self-heal=off --xlator-option
*replicate*.entry-self-heal=off --xlator-option
*replicate*.readdir-failover=off --xlator-option *dht.readdir-optimize=on
--xlator-option *dht.rebalance-cmd=1 --xlator-option
*dht.node-uuid=875dbae1-82bd-485f-98e
Address           Kbytes     RSS   Dirty Mode   Mapping
0000000000400000      64      20       0 r-x--  glusterfsd
0000000000610000       8       8       8 rw---  glusterfsd
0000000001a15000     132      40      40 rw---    [ anon ]
0000000001a36000 28914064 27616220 27611240 rw---    [ anon ]
00007f0290000000     132      16      16 rw---    [ anon ]
00007f0290021000   65404       0       0 -----    [ anon ]
00007f0297a1e000    6024    1184    1184 rw---    [ anon ]
00007f0298000000     132       0       0 rw---    [ anon ]
00007f0298021000   65404       0       0 -----    [ anon ]
00007f029c000000     132      28      28 rw---    [ anon ]
00007f029c021000   65404       0       0 -----    [ anon ]
00007f02a0000000     132       8       8 rw---    [ anon ]
00007f02a0021000   65404       0       0 -----    [ anon ]
00007f02a4092000    4964      16      16 rw---    [ anon ]
00007f02a456b000       4       0       0 -----    [ anon ]
00007f02a456c000   11844    2056    2056 rw---    [ anon ]
00007f02a50fd000       4       0       0 -----    [ anon ]
00007f02a50fe000    1024       8       8 rw---    [ anon ]
00007f02a51fe000      96      12       0 r-x--  io-stats.so
00007f02a5216000    2048       0       0 -----  io-stats.so
00007f02a5416000       8       0       0 rw---  io-stats.so
00007f02a5418000      96      16       0 r-x--  io-threads.so
00007f02a5430000    2044       0       0 -----  io-threads.so
00007f02a562f000      12       4       4 rw---  io-threads.so
00007f02a5632000      52       0       0 r-x--  md-cache.so
00007f02a563f000    2044       0       0 -----  md-cache.so
00007f02a583e000       8       0       0 rw---  md-cache.so
00007f02a5840000      28       0       0 r-x--  open-behind.so
00007f02a5847000    2048       0       0 -----  open-behind.so
00007f02a5a47000       4       0       0 rw---  open-behind.so
00007f02a5a48000      28       0       0 r-x--  quick-read.so
00007f02a5a4f000    2044       0       0 -----  quick-read.so
00007f02a5c4e000       8       0       0 rw---  quick-read.so
00007f02a5c50000      44       0       0 r-x--  read-ahead.so
00007f02a5c5b000    2044       0       0 -----  read-ahead.so
00007f02a5e5a000       8       0       0 rw---  read-ahead.so
00007f02a5e5c000      48       0       0 r-x--  write-behind.so
00007f02a5e68000    2048       0       0 -----  write-behind.so
00007f02a6068000       8       0       0 rw---  write-behind.so
00007f02a606a000     300     136       0 r-x--  dht.so
00007f02a60b5000    2048       0       0 -----  dht.so
00007f02a62b5000      16       8       8 rw---  dht.so
00007f02a62b9000     240     108       0 r-x--  client.so
00007f02a62f5000    2048       0       0 -----  client.so
00007f02a64f5000      16       8       8 rw---  client.so
00007f02a64f9000       4       0       0 -----    [ anon ]
00007f02a64fa000   10240       8       8 rw---    [ anon ]
00007f02a6efa000      48      12       0 r-x--  libnss_files-2.12.so
00007f02a6f06000    2048       0       0 -----  libnss_files-2.12.so
00007f02a7106000       4       0       0 r----  libnss_files-2.12.so
00007f02a7107000       4       0       0 rw---  libnss_files-2.12.so
00007f02a7108000     116       0       0 r-x--  libselinux.so.1
00007f02a7125000    2044       0       0 -----  libselinux.so.1
00007f02a7324000       4       0       0 r----  libselinux.so.1
00007f02a7325000       4       0       0 rw---  libselinux.so.1
00007f02a7326000       4       0       0 rw---    [ anon ]
00007f02a7327000      88       0       0 r-x--  libresolv-2.12.so
00007f02a733d000    2048       0       0 -----  libresolv-2.12.so
00007f02a753d000       4       0       0 r----  libresolv-2.12.so
00007f02a753e000       4       0       0 rw---  libresolv-2.12.so
00007f02a753f000       8       0       0 rw---    [ anon ]
00007f02a7541000       8       0       0 r-x--  libkeyutils.so.1.3
00007f02a7543000    2044       0       0 -----  libkeyutils.so.1.3
00007f02a7742000       4       0       0 r----  libkeyutils.so.1.3
00007f02a7743000       4       0       0 rw---  libkeyutils.so.1.3
00007f02a7744000      40       0       0 r-x--  libkrb5support.so.0.1
00007f02a774e000    2044       0       0 -----  libkrb5support.so.0.1
00007f02a794d000       4       0       0 r----  libkrb5support.so.0.1
00007f02a794e000       4       0       0 rw---  libkrb5support.so.0.1
00007f02a794f000     164       0       0 r-x--  libk5crypto.so.3.1
00007f02a7978000    2048       0       0 -----  libk5crypto.so.3.1
00007f02a7b78000       4       0       0 r----  libk5crypto.so.3.1
00007f02a7b79000       4       0       0 rw---  libk5crypto.so.3.1
00007f02a7b7a000       4       0       0 rw---    [ anon ]
00007f02a7b7b000      12       0       0 r-x--  libcom_err.so.2.1
00007f02a7b7e000    2044       0       0 -----  libcom_err.so.2.1
00007f02a7d7d000       4       0       0 r----  libcom_err.so.2.1
00007f02a7d7e000       4       0       0 rw---  libcom_err.so.2.1
00007f02a7d7f000     876       0       0 r-x--  libkrb5.so.3.3
00007f02a7e5a000    2044       0       0 -----  libkrb5.so.3.3
00007f02a8059000      40       4       4 r----  libkrb5.so.3.3
00007f02a8063000       8       0       0 rw---  libkrb5.so.3.3
00007f02a8065000     260       0       0 r-x--  libgssapi_krb5.so.2.2
00007f02a80a6000    2048       0       0 -----  libgssapi_krb5.so.2.2
00007f02a82a6000       4       0       0 r----  libgssapi_krb5.so.2.2
00007f02a82a7000       8       4       4 rw---  libgssapi_krb5.so.2.2
00007f02a82a9000     388       4       0 r-x--  libssl.so.1.0.1e
00007f02a830a000    2048       0       0 -----  libssl.so.1.0.1e
00007f02a850a000      16       0       0 r----  libssl.so.1.0.1e
00007f02a850e000      28       0       0 rw---  libssl.so.1.0.1e
00007f02a8515000      60      44       0 r-x--  socket.so
00007f02a8524000    2048       0       0 -----  socket.so
00007f02a8724000      16       8       8 rw---  socket.so
00007f02a8728000       4       0       0 -----    [ anon ]
00007f02a8729000   10240       8       8 rw---    [ anon ]
00007f02a9129000       4       0       0 -----    [ anon ]
00007f02a912a000   10240       8       8 rw---    [ anon ]
00007f02a9b2a000       4       0       0 -----    [ anon ]
00007f02a9b2b000   10240       0       0 rw---    [ anon ]
00007f02aa52b000   20052    6296    6296 rw---    [ anon ]
00007f02ab8c0000      84       0       0 r-x--  libz.so.1.2.3
00007f02ab8d5000    2044       0       0 -----  libz.so.1.2.3
00007f02abad4000       4       0       0 r----  libz.so.1.2.3
00007f02abad5000       4       0       0 rw---  libz.so.1.2.3
00007f02abad6000    1576     756       0 r-x--  libc-2.12.so
00007f02abc60000    2048       0       0 -----  libc-2.12.so
00007f02abe60000      16      16      16 r----  libc-2.12.so
00007f02abe64000       4       4       4 rw---  libc-2.12.so
00007f02abe65000      20      16      16 rw---    [ anon ]
00007f02abe6a000    1748       0       0 r-x--  libcrypto.so.1.0.1e
00007f02ac01f000    2048       0       0 -----  libcrypto.so.1.0.1e
00007f02ac21f000     108       0       0 r----  libcrypto.so.1.0.1e
00007f02ac23a000      48       0       0 rw---  libcrypto.so.1.0.1e
00007f02ac246000      16       0       0 rw---    [ anon ]
00007f02ac24a000      92      72       0 r-x--  libpthread-2.12.so
00007f02ac261000    2048       0       0 -----  libpthread-2.12.so
00007f02ac461000       4       4       4 r----  libpthread-2.12.so
00007f02ac462000       4       4       4 rw---  libpthread-2.12.so
00007f02ac463000      16       4       4 rw---    [ anon ]
00007f02ac467000      28       4       0 r-x--  librt-2.12.so
00007f02ac46e000    2044       0       0 -----  librt-2.12.so
00007f02ac66d000       4       4       4 r----  librt-2.12.so
00007f02ac66e000       4       0       0 rw---  librt-2.12.so
00007f02ac66f000    1396       0       0 r-x--  libpython2.6.so.1.0
00007f02ac7cc000    2044       0       0 -----  libpython2.6.so.1.0
00007f02ac9cb000     240       0       0 rw---  libpython2.6.so.1.0
00007f02aca07000      56       0       0 rw---    [ anon ]
00007f02aca15000     524       0       0 r-x--  libm-2.12.so
00007f02aca98000    2044       0       0 -----  libm-2.12.so
00007f02acc97000       4       0       0 r----  libm-2.12.so
00007f02acc98000       4       0       0 rw---  libm-2.12.so
00007f02acc99000       8       0       0 r-x--  libutil-2.12.so
00007f02acc9b000    2044       0       0 -----  libutil-2.12.so
00007f02ace9a000       4       0       0 r----  libutil-2.12.so
00007f02ace9b000       4       0       0 rw---  libutil-2.12.so
00007f02ace9c000       8       0       0 r-x--  libdl-2.12.so
00007f02ace9e000    2048       0       0 -----  libdl-2.12.so
00007f02ad09e000       4       0       0 r----  libdl-2.12.so
00007f02ad09f000       4       0       0 rw---  libdl-2.12.so
00007f02ad0a0000      88      24       0 r-x--  libgfxdr.so.0.0.0
00007f02ad0b6000    2044       0       0 -----  libgfxdr.so.0.0.0
00007f02ad2b5000       4       4       4 rw---  libgfxdr.so.0.0.0
00007f02ad2b6000      96      64       0 r-x--  libgfrpc.so.0.0.0
00007f02ad2ce000    2048       0       0 -----  libgfrpc.so.0.0.0
00007f02ad4ce000       4       4       4 rw---  libgfrpc.so.0.0.0
00007f02ad4cf000     532     176       0 r-x--  libglusterfs.so.0.0.0
00007f02ad554000    2048       0       0 -----  libglusterfs.so.0.0.0
00007f02ad754000       8       8       8 rw---  libglusterfs.so.0.0.0
00007f02ad756000      16      12      12 rw---    [ anon ]
00007f02ad75a000     128      96       0 r-x--  ld-2.12.so
00007f02ad7ac000    1824      24      24 rw---    [ anon ]
00007f02ad977000       4       0       0 rw---    [ anon ]
00007f02ad978000       4       0       0 rw---    [ anon ]
00007f02ad979000       4       4       4 r----  ld-2.12.so
00007f02ad97a000       4       0       0 rw---  ld-2.12.so
00007f02ad97b000       4       0       0 rw---    [ anon ]
00007fff4c597000     124      96      96 rw---    [ stack ]
00007fff4c5ff000       4       4       0 r-x--    [ anon ]
ffffffffff600000       4       0       0 r-x--    [ anon ]
----------------  ------  ------  ------
total kB        29338940 27627692 27621164

--- Additional comment from Raghavendra G on 2014-09-18 02:53:33 EDT ---

Hi Ryan,

If rebalance is still running, can you please capture a statedump of the
rebalance process on all the nodes that are part of the volume?

The following steps need to be repeated on every node that is part of the
volume (a combined snippet is shown after the steps).

1. Get the pid of the rebalance process:

[root at unused glusterfs]# ps ax | grep -i rebalance | grep glusterfs | cut -d" " -f 1
16537

2. Get the statedump of rebalance process:
[root at unused glusterfs]# kill -SIGUSR1 16537

3. statedump can be found in /var/run/gluster/

[root at unused glusterfs]# ls /var/run/gluster/*16537*
/var/run/gluster/glusterdump.16537.dump.1411022946
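
For convenience, the three steps above can be combined into a short
snippet. This is a sketch: it assumes exactly one rebalance process per
node and that the dump file follows the glusterdump.<pid>.dump.<timestamp>
naming shown above.

pid=$(pgrep -f 'glusterfs.*rebalance' | head -n 1)  # step 1: find the pid
kill -SIGUSR1 "$pid"                                # step 2: trigger the statedump
sleep 2                                             # give the process a moment to write it
ls /var/run/gluster/glusterdump."$pid".dump.*       # step 3: locate the dump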


regards,
Raghavendra.

--- Additional comment from Anand Avati on 2014-09-18 05:40:52 EDT ---

REVIEW: http://review.gluster.org/8763 (cluster/dht: Fix dict_t leaks in
rebalance process' execution path) posted (#1) for review on master by Krutika
Dhananjay (kdhananj at redhat.com)

--- Additional comment from Krutika Dhananjay on 2014-09-18 06:39:40 EDT ---

Hi Ryan,

Thanks for the bug report. We have identified a few leaks in the rebalance
process and are in the process of fixing them.

--- Additional comment from Ryan Clough on 2014-09-18 13:05:11 EDT ---

Unfortunately, the processes had been OOM'd before I got to the office this
morning. The rebalance failed.

--- Additional comment from Anand Avati on 2014-09-19 03:26:41 EDT ---

REVIEW: http://review.gluster.org/8776 (cluster/dht: Fix dict_t leaks in
rebalance process' execution path) posted (#1) for review on release-3.5 by
Krutika Dhananjay (kdhananj at redhat.com)

--- Additional comment from Anand Avati on 2014-09-19 04:13:24 EDT ---

REVIEW: http://review.gluster.org/8763 (cluster/dht: Fix dict_t leaks in
rebalance process' execution path) posted (#2) for review on master by Krutika
Dhananjay (kdhananj at redhat.com)

--- Additional comment from Anand Avati on 2014-09-19 10:10:33 EDT ---

COMMIT: http://review.gluster.org/8763 committed in master by Vijay Bellur
(vbellur at redhat.com) 
------
commit 258e61adb5505124925c71d2a0d0375d086e32d4
Author: Krutika Dhananjay <kdhananj at redhat.com>
Date:   Thu Sep 18 14:36:38 2014 +0530

    cluster/dht: Fix dict_t leaks in rebalance process' execution path

    Two dict_t objects are leaked for every file migrated in success codepath.
    It is the caller's responsibility to unref dict that it gets from calls to
    syncop_getxattr(); and rebalance performs two syncop_getxattr()s per file
    without freeing them.

    Also, syncop_getxattr() on GF_XATTR_LINKINFO_KEY doesn't seem to be using
    the response dict. Hence, NULL is now passed as opposed to @dict to
    syncop_getxattr().

    Change-Id: I5a4b5ab834df3633dea994f239bbdbc34cbe9259
    BUG: 1142052
    Signed-off-by: Krutika Dhananjay <kdhananj at redhat.com>
    Reviewed-on: http://review.gluster.org/8763
    Tested-by: Gluster Build System <jenkins at build.gluster.com>
    Reviewed-by: Shyamsundar Ranganathan <srangana at redhat.com>
    Reviewed-by: Vijay Bellur <vbellur at redhat.com>
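
The leak described in the commit message above comes down to a missing
dict_unref() on the caller's side. The fragment below is a simplified
sketch of the before/after pattern, not the actual GlusterFS source; the
syncop_getxattr() arguments are abbreviated:

    dict_t *dict = NULL;
    int     ret  = -1;

    /* Leaky pattern: the response dict returned via &dict carries a
     * reference owned by the caller, which rebalance never dropped. */
    ret = syncop_getxattr (subvol, &loc, &dict, GF_XATTR_LINKINFO_KEY);

    /* Fix 1: release the reference once the response is consumed. */
    if (dict) {
            dict_unref (dict);
            dict = NULL;
    }

    /* Fix 2: when the response is not used at all (the case for
     * GF_XATTR_LINKINFO_KEY here), pass NULL instead of &dict so no
     * reference is handed out in the first place. */
    ret = syncop_getxattr (subvol, &loc, NULL, GF_XATTR_LINKINFO_KEY);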

--- Additional comment from Anand Avati on 2014-09-19 23:44:45 EDT ---

REVIEW: http://review.gluster.org/8784 (cluster/dht: Fix dict_t leaks in
rebalance process' execution path) posted (#1) for review on release-3.5 by
Krutika Dhananjay (kdhananj at redhat.com)


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1142052
[Bug 1142052] Very high memory usage during rebalance
https://bugzilla.redhat.com/show_bug.cgi?id=1144640
[Bug 1144640] Very high memory usage during rebalance