[Gluster-users] Fuse memleaks, all versions → OOM-killer
Yannick Perret
yannick.perret at liris.cnrs.fr
Mon Aug 29 10:32:59 UTC 2016
Hello,
Back after holidays. I haven't seen any new replies since this last mail; I
hope I didn't miss any (too many mails to parse…).
BTW it seems that my problem is very similar to this open bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1369364
-> memory usage keeps increasing for (here) read ops until all memory/swap is
exhausted, when using the FUSE client.
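On my side the growth is easy to watch on the client with a simple loop (just
a sketch; the 60s interval and the log path are arbitrary):
  while sleep 60; do date; ps -C glusterfs -o vsz=,rss=; done >> /var/tmp/glusterfs-mem.log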
Regards,
--
Y.
On 02/08/2016 at 19:15, Yannick Perret wrote:
> In order to prevent too much swap usage I disabled swap on this machine
> (swapoff -a).
> Memory usage kept growing.
> After that I started another program that consumes memory (in order to
> accelerate things) and the OOM-killer triggered.
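>
> (Any program that allocates and holds memory will do as the "hog"; as an
> illustration only — not necessarily the tool actually used — something like
> 'stress --vm 1 --vm-bytes 2g --vm-keep' from the stress package works.)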
>
> Here is the syslog:
> [1246854.291996] Out of memory: Kill process 931 (glusterfs) score 742
> or sacrifice child
> [1246854.292102] Killed process 931 (glusterfs) total-vm:3527624kB,
> anon-rss:3100328kB, file-rss:0kB
>
> Last recorded VSZ/RSS were: 3527624 / 3097096 (kB)
>
>
> Here is the rest of the OOM-killer data:
> [1246854.291847] active_anon:600785 inactive_anon:377188 isolated_anon:0
> active_file:97 inactive_file:137 isolated_file:0
> unevictable:0 dirty:0 writeback:1 unstable:0
> free:21740 slab_reclaimable:3309 slab_unreclaimable:3728
> mapped:255 shmem:4267 pagetables:3286 bounce:0
> free_cma:0
> [1246854.291851] Node 0 DMA free:15876kB min:264kB low:328kB
> high:396kB active_anon:0kB inactive_anon:0kB active_file:0kB
> inactive_file:0kB unevictable:0kB isolated(anon):0kB
> isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB
> dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB
> slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB
> bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0
> all_unreclaimable? yes
> [1246854.291858] lowmem_reserve[]: 0 2980 3948 3948
> [1246854.291861] Node 0 DMA32 free:54616kB min:50828kB low:63532kB
> high:76240kB active_anon:1940432kB inactive_anon:1020924kB
> active_file:248kB inactive_file:260kB unevictable:0kB
> isolated(anon):0kB isolated(file):0kB present:3129280kB
> managed:3054836kB mlocked:0kB dirty:0kB writeback:0kB mapped:760kB
> shmem:14616kB slab_reclaimable:9660kB slab_unreclaimable:8244kB
> kernel_stack:1456kB pagetables:10056kB unstable:0kB bounce:0kB
> free_cma:0kB writeback_tmp:0kB pages_scanned:803 all_unreclaimable? yes
> [1246854.291865] lowmem_reserve[]: 0 0 967 967
> [1246854.291867] Node 0 Normal free:16468kB min:16488kB low:20608kB
> high:24732kB active_anon:462708kB inactive_anon:487828kB
> active_file:140kB inactive_file:288kB unevictable:0kB
> isolated(anon):0kB isolated(file):0kB present:1048576kB
> managed:990356kB mlocked:0kB dirty:0kB writeback:4kB mapped:260kB
> shmem:2452kB slab_reclaimable:3576kB slab_unreclaimable:6668kB
> kernel_stack:560kB pagetables:3088kB unstable:0kB bounce:0kB
> free_cma:0kB writeback_tmp:0kB pages_scanned:975 all_unreclaimable? yes
> [1246854.291872] lowmem_reserve[]: 0 0 0 0
> [1246854.291874] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 2*32kB (U) 3*64kB
> (U) 0*128kB 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB
> (EM) = 15876kB
> [1246854.291882] Node 0 DMA32: 1218*4kB (UEM) 848*8kB (UE) 621*16kB
> (UE) 314*32kB (UEM) 189*64kB (UEM) 49*128kB (UEM) 2*256kB (E) 0*512kB
> 0*1024kB 0*2048kB 1*4096kB (R) = 54616kB
> [1246854.291891] Node 0 Normal: 3117*4kB (UE) 0*8kB 0*16kB 3*32kB (R)
> 1*64kB (R) 2*128kB (R) 0*256kB 1*512kB (R) 1*1024kB (R) 1*2048kB (R)
> 0*4096kB = 16468kB
> [1246854.291900] Node 0 hugepages_total=0 hugepages_free=0
> hugepages_surp=0 hugepages_size=2048kB
> [1246854.291902] 4533 total pagecache pages
> [1246854.291903] 0 pages in swap cache
> [1246854.291905] Swap cache stats: add 343501, delete 343501, find
> 7730690/7732743
> [1246854.291906] Free swap = 0kB
> [1246854.291907] Total swap = 0kB
> [1246854.291908] 1048462 pages RAM
> [1246854.291909] 0 pages HighMem/MovableOnly
> [1246854.291909] 14555 pages reserved
> [1246854.291910] 0 pages hwpoisoned
>
> Regards,
> --
> Y.
>
>
>
>> On 02/08/2016 at 17:00, Yannick Perret wrote:
>> So here are the dumps, gzip'ed.
>>
>> What I did:
>> 1. mounting the volume, removing all its content, unmounting it
>> 2. mounting the volume
>> 3. performing a cp -Rp /usr/* /root/MNT
>> 4. performing a rm -rf /root/MNT/*
>> 5. taking a dump (glusterdump.p1.dump)
>> 6. re-doing 3, 4 and 5 (glusterdump.p2.dump)
>>
>> VSZ/RSS are respectively:
>> - 381896 / 35688 just after mount
>> - 644040 / 309240 after 1st cp -Rp
>> - 644040 / 310128 after 1st rm -rf
>> - 709576 / 310128 after 1st kill -USR1
>> - 840648 / 421964 after 2nd cp -Rp
>> - 840648 / 422224 after 2nd rm -rf
>>
>> I created a small script that performs these actions in an infinite loop:
>> while /bin/true
>> do
>>     cp -Rp /usr/* /root/MNT/
>>     ps -C glusterfs -o vsz=,rss=   # one way to get VSZ/RSS of the glusterfs client process
>>     rm -rf /root/MNT/*
>>     ps -C glusterfs -o vsz=,rss=   # again after the delete
>> done
>>
>> Here are the values so far (VSZ then RSS, in kB):
>> 971720 533988
>> 1037256 645500
>> 1037256 645840
>> 1168328 757348
>> 1168328 757620
>> 1299400 869128
>> 1299400 869328
>> 1364936 980712
>> 1364936 980944
>> 1496008 1092384
>> 1496008 1092404
>> 1627080 1203796
>> 1627080 1203996
>> 1692616 1315572
>> 1692616 1315504
>> 1823688 1426812
>> 1823688 1427340
>> 1954760 1538716
>> 1954760 1538772
>> 2085832 1647676
>> 2085832 1647708
>> 2151368 1750392
>> 2151368 1750708
>> 2282440 1853864
>> 2282440 1853764
>> 2413512 1952668
>> 2413512 1952704
>> 2479048 2056500
>> 2479048 2056712
>>
>> So at this point the glusterfs process is using close to 2 GB of resident
>> memory, while only performing exactly the same actions: 'cp -Rp /usr/*
>> /root/MNT' + 'rm -rf /root/MNT/*'.
>>
>> Swap usage is starting to increase a little, and I haven't seen any
>> memory drop so far.
>> I can understand that the kernel may not release the removed files (after
>> rm -rf) immediately, but the first 'rm' occurred at ~12:00 today and it
>> is ~17:00 here, so I can't understand why so much memory is still used.
>> I would expect the memory to grow during 'cp -Rp', then shrink after
>> 'rm', but it stays the same. And even if it stays the same, I would expect
>> it not to grow further while copying again.
>>
>> I'm leaving the cp/rm loop running to see what happens. Feel free to
>> ask for other data if it may help.
>>
>> Please note that I'll be on holiday from the end of this week for 3
>> weeks, so I will mostly be unable to run tests during this time
>> (the network connection is too poor where I'm going).
>>
>> Regards,
>> --
>> Y.
>>
>> On 02/08/2016 at 05:11, Pranith Kumar Karampuri wrote:
>>>
>>>
>>> On Mon, Aug 1, 2016 at 3:40 PM, Yannick Perret
>>> <yannick.perret at liris.cnrs.fr> wrote:
>>>
>>> On 29/07/2016 at 18:39, Pranith Kumar Karampuri wrote:
>>>>
>>>>
>>>> On Fri, Jul 29, 2016 at 2:26 PM, Yannick Perret
>>>> <yannick.perret at liris.cnrs.fr> wrote:
>>>>
>>>> Ok, last try:
>>>> after investigating more versions I found that the FUSE client
>>>> leaks memory on all of them.
>>>> I tested:
>>>> - 3.6.7 client on Debian 7 32bit and on Debian 8 64bit
>>>> (with 3.6.7 servers on Debian 8 64bit)
>>>> - 3.6.9 client on Debian 7 32bit and on Debian 8 64bit
>>>> (with 3.6.7 servers on Debian 8 64bit)
>>>> - 3.7.13 client on Debian 8 64bit (with 3.8.1 servers on
>>>> Debian 8 64bit)
>>>> - 3.8.1 client on Debian 8 64bit (with 3.8.1 servers on
>>>> Debian 8 64bit)
>>>> In all cases they were compiled from source, apart from 3.8.1
>>>> where .deb packages were used (due to a configure runtime error).
>>>> For 3.7 it was compiled with --disable-tiering. I also
>>>> tried compiling with --disable-fusermount (no change).
>>>>
>>>> In all of these cases the memory (resident & virtual) of the
>>>> glusterfs process on the client grows with each activity and
>>>> never reaches a maximum (and never shrinks).
>>>> "Activity" for these tests is cp -Rp and ls -lR.
>>>> The client I let grow the longest exceeded ~4 GB of RAM. On
>>>> smaller machines it ends with the OOM-killer killing the glusterfs
>>>> process, or with glusterfs dying due to an allocation error.
>>>>
>>>> In 3.6 memory seems to grow continuously, whereas in 3.8.1 it
>>>> grows in "steps" (430400 kB → 629144 (~1 min) → 762324
>>>> (~1 min) → 827860…).
>>>>
>>>> All tests were performed on a single test volume, used only by my
>>>> test client. The volume is a basic 2-way replica. The only
>>>> parameters I changed on this volume (without any effect)
>>>> are diagnostics.client-log-level set to ERROR and
>>>> network.inode-lru-limit set to 1024.
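>>>>
>>>> For reference, those two options were set with the usual CLI on one
>>>> of the servers (a sketch, assuming the same SHARE volume used in
>>>> the steps below):
>>>> gluster volume set SHARE diagnostics.client-log-level ERROR
>>>> gluster volume set SHARE network.inode-lru-limit 1024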
>>>>
>>>>
>>>> Could you attach statedumps of your runs?
>>>> The following link has the steps to capture
>>>> them (https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/
>>>> ). We basically need to see which memory types are
>>>> increasing. If you could help find the issue, we can send the
>>>> fixes for your workload. There is a 3.8.2 release in around 10
>>>> days, I think. We can probably target this issue for that?
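>>>> For a fuse mount, a minimal way to capture one is (a sketch, assuming
>>>> the default dump directory /var/run/gluster and a single glusterfs
>>>> process on the client):
>>>> kill -USR1 "$(pidof glusterfs)"
>>>> ls -lt /var/run/gluster/glusterdump.*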
>>> Here are statedumps.
>>> Steps:
>>> 1. mount -t glusterfs ldap1.my.domain:SHARE /root/MNT/ (here VSZ
>>> and RSS are 381896 35828)
>>> 2. take a dump with kill -USR1 <pid-of-glusterfs-process> (file
>>> glusterdump.n1.dump.1470042769)
>>> 3. perform a 'ls -lR /root/MNT | wc -l' (btw result of wc -l is
>>> 518396 :)) and a 'cp -Rp /usr/* /root/MNT/boo' (VSZ/RSS are
>>> 1301536/711992 at end of these operations)
>>> 4. take a dump with kill -USR1 <pid-of-glusterfs-process> (file
>>> glusterdump.n2.dump.1470043929)
>>> 5. do 'cp -Rp * /root/MNT/toto/', i.e. into another directory
>>> (VSZ/RSS are 1432608/909968 at the end of this operation)
>>> 6. take a dump with kill -USR1 <pid-of-glusterfs-process> (file
>>> glusterdump.n3.dump.)
>>>
>>>
>>> Hey,
>>> Thanks a lot for providing this information. Looking at these
>>> steps, I don't see any problem with the increase in memory. Both the
>>> ls -lR and cp -Rp commands you ran in step 3 add new inodes in
>>> memory, which increases memory usage. As long as the
>>> kernel thinks these inodes need to be in memory, gluster keeps them
>>> in memory. Once the kernel no longer considers an inode necessary, it
>>> sends 'inode-forgets'; at that point the memory starts to shrink. So
>>> it largely depends on the memory pressure the kernel is under. But you
>>> said it led to OOM-killers on smaller machines, which means there
>>> could be some leaks. Could you modify the steps as follows to
>>> confirm there are leaks? Please run this test on those smaller
>>> machines that hit the OOM-killer.
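>>>
>>> One quick way to separate kernel-held inodes from a real leak (a sketch,
>>> assuming root on the client; this only drops clean, unused cache entries)
>>> is to ask the kernel to drop its dentry/inode caches, which makes it send
>>> forgets to the fuse client:
>>> sync
>>> echo 2 > /proc/sys/vm/drop_caches
>>> If the RSS of the glusterfs process does not shrink noticeably after
>>> this, the memory is more likely an actual leak than cached inodes.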
>>>
>>> Steps (sketched as a single script after this list):
>>> 1. mount -t glusterfs ldap1.my.domain:SHARE /root/MNT/ (here VSZ and
>>> RSS are 381896 35828)
>>> 2. perform a 'ls -lR /root/MNT | wc -l' (btw result of wc -l is
>>> 518396 :)) and a 'cp -Rp /usr/* /root/MNT/boo' (VSZ/RSS are
>>> 1301536/711992 at the end of these operations)
>>> 3. do 'cp -Rp * /root/MNT/toto/', i.e. into another directory (VSZ/RSS
>>> are 1432608/909968 at the end of this operation)
>>> 4. Delete all the files and directories you created in steps 2 and 3 above
>>> 5. Take a statedump with kill -USR1 <pid-of-glusterfs-process>
>>> 6. Repeat steps 2-5
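>>>
>>> Roughly, the same sequence as one script (a sketch; it assumes the SHARE
>>> volume and mount point from step 1, /usr as the source for both copies,
>>> and a single glusterfs process on the client):
>>> mount -t glusterfs ldap1.my.domain:SHARE /root/MNT/
>>> for run in 1 2
>>> do
>>>     mkdir -p /root/MNT/boo /root/MNT/toto
>>>     ls -lR /root/MNT | wc -l
>>>     cp -Rp /usr/* /root/MNT/boo
>>>     cp -Rp /usr/* /root/MNT/toto
>>>     rm -rf /root/MNT/boo /root/MNT/toto
>>>     ps -C glusterfs -o vsz=,rss=        # VSZ/RSS after the delete
>>>     kill -USR1 "$(pidof glusterfs)"     # statedump (step 5)
>>> done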
>>>
>>> Attach these two statedumps. I think the statedumps will be even
>>> more effective if the mount does not have any data when you start
>>> the experiment.
>>>
>>> HTH
>>>
>>>
>>> Dump files are gzip'ed because they are very large.
>>> They are available here (too big to attach to an email):
>>> http://wikisend.com/download/623430/glusterdump.n1.dump.1470042769.gz
>>> http://wikisend.com/download/771220/glusterdump.n2.dump.1470043929.gz
>>> http://wikisend.com/download/428752/glusterdump.n3.dump.1470045181.gz
>>> (I'm keeping the files in case someone wants them in another format)
>>>
>>> Client and servers are installed from .deb files
>>> (glusterfs-client_3.8.1-1_amd64.deb and
>>> glusterfs-common_3.8.1-1_amd64.deb on the client side).
>>> They all run Debian 8 64bit. The servers are test machines that
>>> serve only one volume to this sole client. The volume is a simple
>>> 2-way replica. For testing I only changed the network.inode-lru-limit
>>> value to 1024. The mount point /root/MNT is used only for these tests.
>>>
>>> --
>>> Y.
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Pranith
>>
>>
>>
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users