[Bugs] [Bug 1329335] New: GlusterFS - Memory Leak - High Memory Utilization
bugzilla at redhat.com
Thu Apr 21 16:06:30 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1329335
Bug ID: 1329335
Summary: GlusterFS - Memory Leak - High Memory Utilization
Product: GlusterFS
Version: 3.7.11
Component: glusterd
Severity: urgent
Assignee: bugs at gluster.org
Reporter: uganit at gmail.com
CC: bugs at gluster.org
Created attachment 1149509
--> https://bugzilla.redhat.com/attachment.cgi?id=1149509&action=edit
GlusterFS dump file
We are using GlusterFS 3.7.11 (upgraded from 3.7.6 last week) on RHEL 7.x in
AWS EC2.
We keep seeing memory utilization go up, roughly once every 2 days. The memory
usage of the server daemon (glusterd) on the NFS server keeps increasing: in
about 30+ hours glusterd alone reaches 70% of the available memory. Since we
have alarms on this threshold we get notified, and so far the only way to stop
the growth is to restart glusterd.
This happens even when there is not much load on GlusterFS.
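For reference, a minimal sketch of how the growth can be tracked over time
(the 5-minute interval and the log path are arbitrary choices):

# Append a timestamped RSS sample (in KB) for glusterd every 5 minutes
while true; do
    echo "$(date '+%F %T') $(ps -o rss= -C glusterd)" >> /tmp/glusterd-rss.log
    sleep 300
done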
GlusterFS is configured on two server nodes with two mount locations:
$ df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/xvdf 125829120 120186 125708934 1% /nfs_app1
/dev/xvdg 125829120 142937 125686183 1% /nfs_app2
As part of debugging, we tried the following:
1. From the client side, on the mount point, we read and wrote around 1000
files (each 4 MB in size). There was no marked spike in memory utilization
during this test.
2. We were using GlusterFS 3.7.6 and moved to 3.7.11, but the problem
persists.
3. We created a statedump of the volume in question; the dump file is attached
(see the sketch of the statedump commands after this list). Some memory
allocation types, such as gf_common_mt_asprintf, have huge total_allocs,
specifically the three listed below.
[global.glusterfs - usage-type gf_common_mt_asprintf memusage]
size=260
num_allocs=12
max_size=2464
max_num_allocs=294
total_allocs=927964
[global.glusterfs - usage-type gf_common_mt_char memusage]
size=6388
num_allocs=164
max_size=30134
max_num_allocs=645
total_allocs=1424017
[protocol/server.xyz-server - usage-type gf_common_mt_strdup memusage]
size=26055
num_allocs=2795
max_size=27198
max_num_allocs=2828
total_allocs=135503
4. We also noticed that in the mempool section nr_files is a negative number.
We are not sure whether this is related to the problem.
[mempool]
[storage/posix.xyz-posix]
base_path=/nfs_xyz/abc
base_path_length=25
max_read=44215866
max_write=104925485
nr_files=-418
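As referenced in item 3 above, a rough sketch of the standard commands for
generating such statedumps (assuming the default dump directory
/var/run/gluster and the anonymized volume name xyz from the output above):

# Dump the brick/server processes of the volume; files are written
# under /var/run/gluster by default
$ gluster volume statedump xyz

# Dump the glusterd management daemon itself by sending it SIGUSR1
$ kill -USR1 $(pidof glusterd)

Since glusterd is the process whose memory keeps growing here, the SIGUSR1
dump of glusterd is the one most relevant to this report.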
This is happening in production and, as expected, it is causing a lot of
problems.
Has anybody seen this before? Any insights into what we can try would be
greatly appreciated.