[Bugs] [Bug 1144640] New: Very high memory usage during rebalance
bugzilla at redhat.com
Sat Sep 20 03:50:37 UTC 2014
https://bugzilla.redhat.com/show_bug.cgi?id=1144640
Bug ID: 1144640
Summary: Very high memory usage during rebalance
Product: GlusterFS
Version: 3.6.0
Component: distribute
Severity: medium
Assignee: gluster-bugs at redhat.com
Reporter: kdhananj at redhat.com
CC: bugs at gluster.org, kdhananj at redhat.com,
rgowdapp at redhat.com, ryan.clough at dsic.com
Depends On: 1142052
+++ This bug was initially created as a clone of Bug #1142052 +++
Description of problem:
Rebalance has been running for about 7 days on a 2-node, 2-brick, 52TB
distributed volume. Memory usage has grown slowly over time until it consumed
all available physical memory. The OOM killer stopped the last rebalance; after
restarting all Gluster processes I am attempting it again.
"sync; echo 3 > /proc/sys/vm/drop_caches" has no effect on lowering memory
consumption.
Version-Release number of selected component (if applicable):
[root at hgluster01 ~]# glusterfs --version
glusterfs 3.5.2 built on Jul 31 2014 18:47:52
How reproducible:
So far, I cannot complete a rebalance.
Steps to Reproduce:
1. Start rebalance
gluster volume rebalance export_volume start
Actual results:
High memory consumption by the glusterfs rebalance process, which eventually
gets OOM-killed.
Expected results:
Rebalance does not consume all available memory, and both the data migration
and the fix-layout complete.
Additional info:
[root at hgluster01 ~]# gluster volume status export_volume detail
Status of volume: export_volume
------------------------------------------------------------------------------
Brick : Brick hgluster01:/gluster_data
Port : 49152
Online : Y
Pid : 2438
File System : xfs
Device : /dev/mapper/vg_data-lv_data
Mount Options :
rw,noatime,nodiratime,logbufs=8,logbsize=256k,inode64,nobarrier
Inode Size : 512
Disk Space Free : 12.3TB
Total Disk Space : 27.3TB
Inode Count : 2929685696
Free Inodes : 2839872616
------------------------------------------------------------------------------
Brick : Brick hgluster02:/gluster_data
Port : 49152
Online : Y
Pid : 2467
File System : xfs
Device : /dev/mapper/vg_data-lv_data
Mount Options :
rw,noatime,nodiratime,logbufs=8,logbsize=256k,inode64,nobarrier
Inode Size : 512
Disk Space Free : 12.4TB
Total Disk Space : 27.3TB
Inode Count : 2929685696
Free Inodes : 2839847441
[root at hgluster01 ~]# gluster volume info
Volume Name: export_volume
Type: Distribute
Volume ID: c74cc970-31e2-4924-a244-4c70d958dadb
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: hgluster01:/gluster_data
Brick2: hgluster02:/gluster_data
Options Reconfigured:
performance.stat-prefetch: on
performance.write-behind: on
performance.flush-behind: on
features.quota-deem-statfs: on
performance.quick-read: on
performance.client-io-threads: on
performance.read-ahead: on
performance.io-thread-count: 24
features.quota: on
cluster.eager-lock: on
nfs.disable: on
auth.allow: 192.168.10.*,10.0.10.*,10.8.0.*
server.allow-insecure: on
performance.write-behind-window-size: 4MB
network.ping-timeout: 60
features.quota-timeout: 10
performance.io-cache: off
[root at hgluster01 ~]# cat /proc/meminfo
MemTotal: 32844100 kB
MemFree: 2148772 kB
Buffers: 14184 kB
Cached: 35600 kB
SwapCached: 204288 kB
Active: 24682388 kB
Inactive: 3315448 kB
Active(anon): 24660896 kB
Inactive(anon): 3289292 kB
Active(file): 21492 kB
Inactive(file): 26156 kB
Unevictable: 12728 kB
Mlocked: 4552 kB
SwapTotal: 16490488 kB
SwapFree: 15077012 kB
Dirty: 32 kB
Writeback: 0 kB
AnonPages: 27761596 kB
Mapped: 9168 kB
Shmem: 4 kB
Slab: 544552 kB
SReclaimable: 273636 kB
SUnreclaim: 270916 kB
KernelStack: 4800 kB
PageTables: 60592 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 32912536 kB
Committed_AS: 29529576 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 345236 kB
VmallocChunk: 34340927412 kB
HardwareCorrupted: 0 kB
AnonHugePages: 17307648 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 4728 kB
DirectMap2M: 2058240 kB
DirectMap1G: 31457280 kB
[root at hgluster01 ~]# pmap -x 2627
2627: /usr/sbin/glusterfs -s localhost --volfile-id export_volume
--xlator-option *dht.use-readdirp=yes --xlator-option *dht.lookup-unhashed=yes
--xlator-option *dht.assert-no-child-down=yes --xlator-option
*replicate*.data-self-heal=off --xlator-option
*replicate*.metadata-self-heal=off --xlator-option
*replicate*.entry-self-heal=off --xlator-option
*replicate*.readdir-failover=off --xlator-option *dht.readdir-optimize=on
--xlator-option *dht.rebalance-cmd=1 --xlator-option
*dht.node-uuid=875dbae1-82bd-485f-98e
Address Kbytes RSS Dirty Mode Mapping
0000000000400000 64 20 0 r-x-- glusterfsd
0000000000610000 8 8 8 rw--- glusterfsd
0000000001a15000 132 40 40 rw--- [ anon ]
0000000001a36000 28914064 27616220 27611240 rw--- [ anon ]
00007f0290000000 132 16 16 rw--- [ anon ]
00007f0290021000 65404 0 0 ----- [ anon ]
00007f0297a1e000 6024 1184 1184 rw--- [ anon ]
00007f0298000000 132 0 0 rw--- [ anon ]
00007f0298021000 65404 0 0 ----- [ anon ]
00007f029c000000 132 28 28 rw--- [ anon ]
00007f029c021000 65404 0 0 ----- [ anon ]
00007f02a0000000 132 8 8 rw--- [ anon ]
00007f02a0021000 65404 0 0 ----- [ anon ]
00007f02a4092000 4964 16 16 rw--- [ anon ]
00007f02a456b000 4 0 0 ----- [ anon ]
00007f02a456c000 11844 2056 2056 rw--- [ anon ]
00007f02a50fd000 4 0 0 ----- [ anon ]
00007f02a50fe000 1024 8 8 rw--- [ anon ]
00007f02a51fe000 96 12 0 r-x-- io-stats.so
00007f02a5216000 2048 0 0 ----- io-stats.so
00007f02a5416000 8 0 0 rw--- io-stats.so
00007f02a5418000 96 16 0 r-x-- io-threads.so
00007f02a5430000 2044 0 0 ----- io-threads.so
00007f02a562f000 12 4 4 rw--- io-threads.so
00007f02a5632000 52 0 0 r-x-- md-cache.so
00007f02a563f000 2044 0 0 ----- md-cache.so
00007f02a583e000 8 0 0 rw--- md-cache.so
00007f02a5840000 28 0 0 r-x-- open-behind.so
00007f02a5847000 2048 0 0 ----- open-behind.so
00007f02a5a47000 4 0 0 rw--- open-behind.so
00007f02a5a48000 28 0 0 r-x-- quick-read.so
00007f02a5a4f000 2044 0 0 ----- quick-read.so
00007f02a5c4e000 8 0 0 rw--- quick-read.so
00007f02a5c50000 44 0 0 r-x-- read-ahead.so
00007f02a5c5b000 2044 0 0 ----- read-ahead.so
00007f02a5e5a000 8 0 0 rw--- read-ahead.so
00007f02a5e5c000 48 0 0 r-x-- write-behind.so
00007f02a5e68000 2048 0 0 ----- write-behind.so
00007f02a6068000 8 0 0 rw--- write-behind.so
00007f02a606a000 300 136 0 r-x-- dht.so
00007f02a60b5000 2048 0 0 ----- dht.so
00007f02a62b5000 16 8 8 rw--- dht.so
00007f02a62b9000 240 108 0 r-x-- client.so
00007f02a62f5000 2048 0 0 ----- client.so
00007f02a64f5000 16 8 8 rw--- client.so
00007f02a64f9000 4 0 0 ----- [ anon ]
00007f02a64fa000 10240 8 8 rw--- [ anon ]
00007f02a6efa000 48 12 0 r-x-- libnss_files-2.12.so
00007f02a6f06000 2048 0 0 ----- libnss_files-2.12.so
00007f02a7106000 4 0 0 r---- libnss_files-2.12.so
00007f02a7107000 4 0 0 rw--- libnss_files-2.12.so
00007f02a7108000 116 0 0 r-x-- libselinux.so.1
00007f02a7125000 2044 0 0 ----- libselinux.so.1
00007f02a7324000 4 0 0 r---- libselinux.so.1
00007f02a7325000 4 0 0 rw--- libselinux.so.1
00007f02a7326000 4 0 0 rw--- [ anon ]
00007f02a7327000 88 0 0 r-x-- libresolv-2.12.so
00007f02a733d000 2048 0 0 ----- libresolv-2.12.so
00007f02a753d000 4 0 0 r---- libresolv-2.12.so
00007f02a753e000 4 0 0 rw--- libresolv-2.12.so
00007f02a753f000 8 0 0 rw--- [ anon ]
00007f02a7541000 8 0 0 r-x-- libkeyutils.so.1.3
00007f02a7543000 2044 0 0 ----- libkeyutils.so.1.3
00007f02a7742000 4 0 0 r---- libkeyutils.so.1.3
00007f02a7743000 4 0 0 rw--- libkeyutils.so.1.3
00007f02a7744000 40 0 0 r-x-- libkrb5support.so.0.1
00007f02a774e000 2044 0 0 ----- libkrb5support.so.0.1
00007f02a794d000 4 0 0 r---- libkrb5support.so.0.1
00007f02a794e000 4 0 0 rw--- libkrb5support.so.0.1
00007f02a794f000 164 0 0 r-x-- libk5crypto.so.3.1
00007f02a7978000 2048 0 0 ----- libk5crypto.so.3.1
00007f02a7b78000 4 0 0 r---- libk5crypto.so.3.1
00007f02a7b79000 4 0 0 rw--- libk5crypto.so.3.1
00007f02a7b7a000 4 0 0 rw--- [ anon ]
00007f02a7b7b000 12 0 0 r-x-- libcom_err.so.2.1
00007f02a7b7e000 2044 0 0 ----- libcom_err.so.2.1
00007f02a7d7d000 4 0 0 r---- libcom_err.so.2.1
00007f02a7d7e000 4 0 0 rw--- libcom_err.so.2.1
00007f02a7d7f000 876 0 0 r-x-- libkrb5.so.3.3
00007f02a7e5a000 2044 0 0 ----- libkrb5.so.3.3
00007f02a8059000 40 4 4 r---- libkrb5.so.3.3
00007f02a8063000 8 0 0 rw--- libkrb5.so.3.3
00007f02a8065000 260 0 0 r-x-- libgssapi_krb5.so.2.2
00007f02a80a6000 2048 0 0 ----- libgssapi_krb5.so.2.2
00007f02a82a6000 4 0 0 r---- libgssapi_krb5.so.2.2
00007f02a82a7000 8 4 4 rw--- libgssapi_krb5.so.2.2
00007f02a82a9000 388 4 0 r-x-- libssl.so.1.0.1e
00007f02a830a000 2048 0 0 ----- libssl.so.1.0.1e
00007f02a850a000 16 0 0 r---- libssl.so.1.0.1e
00007f02a850e000 28 0 0 rw--- libssl.so.1.0.1e
00007f02a8515000 60 44 0 r-x-- socket.so
00007f02a8524000 2048 0 0 ----- socket.so
00007f02a8724000 16 8 8 rw--- socket.so
00007f02a8728000 4 0 0 ----- [ anon ]
00007f02a8729000 10240 8 8 rw--- [ anon ]
00007f02a9129000 4 0 0 ----- [ anon ]
00007f02a912a000 10240 8 8 rw--- [ anon ]
00007f02a9b2a000 4 0 0 ----- [ anon ]
00007f02a9b2b000 10240 0 0 rw--- [ anon ]
00007f02aa52b000 20052 6296 6296 rw--- [ anon ]
00007f02ab8c0000 84 0 0 r-x-- libz.so.1.2.3
00007f02ab8d5000 2044 0 0 ----- libz.so.1.2.3
00007f02abad4000 4 0 0 r---- libz.so.1.2.3
00007f02abad5000 4 0 0 rw--- libz.so.1.2.3
00007f02abad6000 1576 756 0 r-x-- libc-2.12.so
00007f02abc60000 2048 0 0 ----- libc-2.12.so
00007f02abe60000 16 16 16 r---- libc-2.12.so
00007f02abe64000 4 4 4 rw--- libc-2.12.so
00007f02abe65000 20 16 16 rw--- [ anon ]
00007f02abe6a000 1748 0 0 r-x-- libcrypto.so.1.0.1e
00007f02ac01f000 2048 0 0 ----- libcrypto.so.1.0.1e
00007f02ac21f000 108 0 0 r---- libcrypto.so.1.0.1e
00007f02ac23a000 48 0 0 rw--- libcrypto.so.1.0.1e
00007f02ac246000 16 0 0 rw--- [ anon ]
00007f02ac24a000 92 72 0 r-x-- libpthread-2.12.so
00007f02ac261000 2048 0 0 ----- libpthread-2.12.so
00007f02ac461000 4 4 4 r---- libpthread-2.12.so
00007f02ac462000 4 4 4 rw--- libpthread-2.12.so
00007f02ac463000 16 4 4 rw--- [ anon ]
00007f02ac467000 28 4 0 r-x-- librt-2.12.so
00007f02ac46e000 2044 0 0 ----- librt-2.12.so
00007f02ac66d000 4 4 4 r---- librt-2.12.so
00007f02ac66e000 4 0 0 rw--- librt-2.12.so
00007f02ac66f000 1396 0 0 r-x-- libpython2.6.so.1.0
00007f02ac7cc000 2044 0 0 ----- libpython2.6.so.1.0
00007f02ac9cb000 240 0 0 rw--- libpython2.6.so.1.0
00007f02aca07000 56 0 0 rw--- [ anon ]
00007f02aca15000 524 0 0 r-x-- libm-2.12.so
00007f02aca98000 2044 0 0 ----- libm-2.12.so
00007f02acc97000 4 0 0 r---- libm-2.12.so
00007f02acc98000 4 0 0 rw--- libm-2.12.so
00007f02acc99000 8 0 0 r-x-- libutil-2.12.so
00007f02acc9b000 2044 0 0 ----- libutil-2.12.so
00007f02ace9a000 4 0 0 r---- libutil-2.12.so
00007f02ace9b000 4 0 0 rw--- libutil-2.12.so
00007f02ace9c000 8 0 0 r-x-- libdl-2.12.so
00007f02ace9e000 2048 0 0 ----- libdl-2.12.so
00007f02ad09e000 4 0 0 r---- libdl-2.12.so
00007f02ad09f000 4 0 0 rw--- libdl-2.12.so
00007f02ad0a0000 88 24 0 r-x-- libgfxdr.so.0.0.0
00007f02ad0b6000 2044 0 0 ----- libgfxdr.so.0.0.0
00007f02ad2b5000 4 4 4 rw--- libgfxdr.so.0.0.0
00007f02ad2b6000 96 64 0 r-x-- libgfrpc.so.0.0.0
00007f02ad2ce000 2048 0 0 ----- libgfrpc.so.0.0.0
00007f02ad4ce000 4 4 4 rw--- libgfrpc.so.0.0.0
00007f02ad4cf000 532 176 0 r-x-- libglusterfs.so.0.0.0
00007f02ad554000 2048 0 0 ----- libglusterfs.so.0.0.0
00007f02ad754000 8 8 8 rw--- libglusterfs.so.0.0.0
00007f02ad756000 16 12 12 rw--- [ anon ]
00007f02ad75a000 128 96 0 r-x-- ld-2.12.so
00007f02ad7ac000 1824 24 24 rw--- [ anon ]
00007f02ad977000 4 0 0 rw--- [ anon ]
00007f02ad978000 4 0 0 rw--- [ anon ]
00007f02ad979000 4 4 4 r---- ld-2.12.so
00007f02ad97a000 4 0 0 rw--- ld-2.12.so
00007f02ad97b000 4 0 0 rw--- [ anon ]
00007fff4c597000 124 96 96 rw--- [ stack ]
00007fff4c5ff000 4 4 0 r-x-- [ anon ]
ffffffffff600000 4 0 0 r-x-- [ anon ]
---------------- ------ ------ ------
total kB 29338940 27627692 27621164
--- Additional comment from Raghavendra G on 2014-09-18 02:53:33 EDT ---
Hi Ryan,
If the rebalance is still running, can you please get a statedump of the
rebalance process on all the nodes that are part of the volume?
The following steps have to be repeated on each of those nodes.
1. Get the pid of the rebalance process:
[root at unused glusterfs]# ps ax | grep -i rebalance | grep glusterfs | cut -d" " -f 1
16537
2. Get the statedump of the rebalance process:
[root at unused glusterfs]# kill -SIGUSR1 16537
3. The statedump can be found in /var/run/gluster/:
[root at unused glusterfs]# ls /var/run/gluster/*16537*
/var/run/gluster/glusterdump.16537.dump.1411022946
regards,
Raghavendra.
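[Editorial note: for readers unfamiliar with the statedump mechanism referenced
in step 2 above, the sketch below shows, in generic C, how a SIGUSR1-triggered
dump is typically wired: the handler only sets a flag, and the main loop writes
a dump file when it sees the flag. This is NOT GlusterFS's actual
implementation (glusterfsd uses its own signal thread and dump routines); the
file-name pattern merely mirrors the glusterdump.<pid>.dump.<timestamp> example
shown above.]

#include <signal.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t dump_requested = 0;

static void
on_sigusr1 (int sig)
{
        (void) sig;
        dump_requested = 1;     /* only set a flag inside the handler */
}

int
main (void)
{
        signal (SIGUSR1, on_sigusr1);

        for (;;) {
                if (dump_requested) {
                        dump_requested = 0;
                        char path[128];
                        snprintf (path, sizeof (path),
                                  "/var/run/gluster/glusterdump.%d.dump.%ld",
                                  (int) getpid (), (long) time (NULL));
                        FILE *fp = fopen (path, "w");
                        if (fp) {
                                /* a real statedump would write memory and
                                 * allocation statistics here */
                                fprintf (fp, "statedump placeholder\n");
                                fclose (fp);
                        }
                }
                sleep (1);      /* stand-in for the real event loop */
        }
        return 0;
}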
--- Additional comment from Anand Avati on 2014-09-18 05:40:52 EDT ---
REVIEW: http://review.gluster.org/8763 (cluster/dht: Fix dict_t leaks in
rebalance process' execution path) posted (#1) for review on master by Krutika
Dhananjay (kdhananj at redhat.com)
--- Additional comment from Krutika Dhananjay on 2014-09-18 06:39:40 EDT ---
Hi Ryan,
Thanks for the bug report. We have identified a few leaks in rebalance and are
in the process of fixing them.
--- Additional comment from Ryan Clough on 2014-09-18 13:05:11 EDT ---
Unfortunately, the processes had been OOM-killed before I got to the office
this morning. The rebalance failed.
--- Additional comment from Anand Avati on 2014-09-19 03:26:41 EDT ---
REVIEW: http://review.gluster.org/8776 (cluster/dht: Fix dict_t leaks in
rebalance process' execution path) posted (#1) for review on release-3.5 by
Krutika Dhananjay (kdhananj at redhat.com)
--- Additional comment from Anand Avati on 2014-09-19 04:13:24 EDT ---
REVIEW: http://review.gluster.org/8763 (cluster/dht: Fix dict_t leaks in
rebalance process' execution path) posted (#2) for review on master by Krutika
Dhananjay (kdhananj at redhat.com)
--- Additional comment from Anand Avati on 2014-09-19 10:10:33 EDT ---
COMMIT: http://review.gluster.org/8763 committed in master by Vijay Bellur
(vbellur at redhat.com)
------
commit 258e61adb5505124925c71d2a0d0375d086e32d4
Author: Krutika Dhananjay <kdhananj at redhat.com>
Date: Thu Sep 18 14:36:38 2014 +0530
cluster/dht: Fix dict_t leaks in rebalance process' execution path
Two dict_t objects are leaked for every file migrated in success codepath.
It is the caller's responsibility to unref dict that it gets from calls to
syncop_getxattr(); and rebalance performs two syncop_getxattr()s per file
without freeing them.
Also, syncop_getxattr() on GF_XATTR_LINKINFO_KEY doesn't seem to be using
the response dict. Hence, NULL is now passed as opposed to @dict to
syncop_getxattr().
Change-Id: I5a4b5ab834df3633dea994f239bbdbc34cbe9259
BUG: 1142052
Signed-off-by: Krutika Dhananjay <kdhananj at redhat.com>
Reviewed-on: http://review.gluster.org/8763
Tested-by: Gluster Build System <jenkins at build.gluster.com>
Reviewed-by: Shyamsundar Ranganathan <srangana at redhat.com>
Reviewed-by: Vijay Bellur <vbellur at redhat.com>
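[Editorial note: the sketch below illustrates the leak pattern described in the
commit message above and the shape of the fix. It is illustrative only and not
the actual dht-rebalance.c code; it assumes the 3.x-era prototype
int syncop_getxattr (xlator_t *subvol, loc_t *loc, dict_t **dict, const char *key),
requires the GlusterFS internal headers (xlator.h, syncop.h, dict.h) to build,
and the second xattr key is a placeholder.]

static int
check_file_xattrs (xlator_t *subvol, loc_t *loc)
{
        dict_t *xattr = NULL;
        int     ret   = -1;

        /* Before the fix: &xattr was passed here even though the linkinfo
         * response is never used, and the dict was never unref'd -- one
         * dict_t leaked per migrated file. After the fix, NULL is passed
         * so no response dict needs to be released. */
        ret = syncop_getxattr (subvol, loc, NULL, GF_XATTR_LINKINFO_KEY);
        if (ret < 0)
                goto out;

        /* For a getxattr whose response *is* used, the caller owns a
         * reference on the returned dict and must drop it when done.
         * "some.other.xattr" is a placeholder, not the real key. */
        ret = syncop_getxattr (subvol, loc, &xattr, "some.other.xattr");
        if (ret < 0)
                goto out;

        /* ... inspect xattr here ... */

        ret = 0;
out:
        if (xattr)
                dict_unref (xattr);   /* this unref was missing pre-fix */
        return ret;
}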
--- Additional comment from Anand Avati on 2014-09-19 23:44:45 EDT ---
REVIEW: http://review.gluster.org/8784 (cluster/dht: Fix dict_t leaks in
rebalance process' execution path) posted (#1) for review on release-3.5 by
Krutika Dhananjay (kdhananj at redhat.com)
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1142052
[Bug 1142052] Very high memory usage during rebalance