[Gluster-devel] 3.5.0beta3 memory leak problem
Yuan Ding
qq327662250 at gmail.com
Fri Feb 21 04:01:41 UTC 2014
Hi Vijay,
I ran the following test:
Started the gluster volume, killed glusterfsd, and restarted glusterfsd with
the following command:
valgrind --log-file=/root/dingyuan/logs/valgrind.log /usr/sbin/glusterfsd
-s server241 --volfile-id vol1.server241.fsmnt-fs1 -p
/var/lib/glusterd/vols/vol1/run/server241-fsmnt-fs1.pid -S
/var/run/4f8241255dc7142a794af68d66dcedeb.socket --brick-name /fsmnt/fs1 -l
/var/log/glusterfs/bricks/fsmnt-fs1.log --xlator-option
*-posix.glusterd-uuid=41da2eae-c2c8-41a0-8873-5286699a8b95 --brick-port
49153 --xlator-option vol1-server.listen-port=49153 -N
The command-line options are the same as the defaults except for the region
marked in red (the color is lost in this plain-text archive).
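As an aside, a fuller memcheck invocation may make the eventual report more
useful (a sketch, not part of the original command; the extra flags are my
suggestion, and the glusterfsd arguments are elided):

```shell
# sketch: run glusterfsd under memcheck with full leak checking enabled.
# --track-origins attributes uninitialised-value errors to their source,
# at the cost of slowing execution further (suggested flags, not from the
# original mail; "..." stands for the glusterfsd arguments shown above)
valgrind --leak-check=full --show-reachable=yes --track-origins=yes \
    --log-file=/root/dingyuan/logs/valgrind.log \
    /usr/sbin/glusterfsd ... -N
```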
Then I mounted an NFS client and ran the LTP test.
After a few minutes, valgrind seems to run into an infinite loop. top shows
the following (glusterfsd runs inside the 'memcheck-amd64-' process):
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
21255 root 20 0 309m 106m 4328 R 100.1 1.4 1121:42
memcheck-amd64-
The process cannot be killed with SIGTERM. SIGKILL kills it, but then no
valgrind report is generated...
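One way to get a leak report without waiting for a clean exit (a sketch;
assumes valgrind 3.7 or newer with its embedded gdbserver, and the glusterfsd
arguments as above) is to query the still-running process with vgdb:

```shell
# start memcheck with the gdbserver enabled so it can be queried later
# ("..." stands for the glusterfsd arguments from the command above)
valgrind --leak-check=full --vgdb=yes \
    --log-file=/root/dingyuan/logs/valgrind.log \
    /usr/sbin/glusterfsd ... -N

# from another shell, while the test (or the apparent hang) is ongoing,
# ask memcheck for a full leak check; the report is appended to --log-file
vgdb --pid=<pid-of-memcheck-amd64-> leak_check full
```

This sidesteps the need for the process to exit cleanly before memcheck
writes its summary.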
Is there something wrong with my test procedure? Or is there another way to
collect more information?
Thanks!
On Wed, Feb 19, 2014 at 2:20 PM, Vijay Bellur <vbellur at redhat.com> wrote:
> On 02/18/2014 03:18 PM, Yuan Ding wrote:
>
>> I tested the gluster NFS server with one NFS client, running LTP's fs test
>> cases on that client. There seem to be two memory leak problems.
>> (My NFS server and two glusterfsd config files are attached.)
>> The two problems are described below:
>>
>> 1. The glusterfs process running as the NFS server exhausts system memory
>> (1GB) in several minutes. After disabling drc, this problem no longer occurs.
>>
>> 2. After disabling drc, the test ran for one day with no problem. But I
>> found glusterfsd using more than 50% of system memory (ps output shown
>> below). Stopping the test does not release the memory.
>>
>> [root at server155 ~]# ps aux | grep glusterfsd
>> root 7443 3.7 52.8 1731340 539108 ? Ssl Feb17 70:01
>> /usr/sbin/glusterfsd -s server155 --volfile-id vol1.server155.fsmnt-fs1
>> -p /var/lib/glusterd/vols/vol1/run/server155-fsmnt-fs1.pid -S
>> /var/run/5b7fe23f0aec78ffa0e6968dece0a8b0.socket --brick-name /fsmnt/fs1
>> -l /var/log/glusterfs/bricks/fsmnt-fs1.log --xlator-option
>> *-posix.glusterd-uuid=d4f3d342-dd41-4dc7-b0fc-d3ce9998d21f --brick-port
>> 49152 --xlator-option vol1-server.listen-port=49152
>>
>> I used kill -SIGUSR1 7443 to collect a statedump (attached as
>> fsmnt-fs1.7443.dump.1392711830).
>>
>> Any help is appreciated!
>>
>
> Thanks for the report; there seem to be a lot of dict_t allocations, as
> seen in the statedump. Would it be possible to run the tests after starting
> glusterfsd under valgrind and share the report here?
>
> -Vijay
>