[Gluster-users] Run away memory with gluster mount

Dan Ragle daniel at Biblestuph.com
Mon Jan 29 19:03:42 UTC 2018



On 1/26/2018 11:53 PM, Ravishankar N wrote:
> 
> 
> On 01/27/2018 02:29 AM, Dan Ragle wrote:
>>
>> On 1/25/2018 8:21 PM, Ravishankar N wrote:
>>>
>>>
>>> On 01/25/2018 11:04 PM, Dan Ragle wrote:
>>>> *sigh* trying again to correct formatting ... apologize for the 
>>>> earlier mess.
>>>>
>>>> Having a memory issue with Gluster 3.12.4 and not sure how to 
>>>> troubleshoot. I don't *think* this is expected behavior.
>>>>
>>>> This is on an updated CentOS 7 box. The setup is a simple two node 
>>>> replicated layout where the two nodes act as both server and
>>>> client.
>>>>
>>>> The volume in question:
>>>>
>>>> Volume Name: GlusterWWW
>>>> Type: Replicate
>>>> Volume ID: 8e9b0e79-f309-4d9b-a5bb-45d065faaaa3
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 1 x 2 = 2
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: vs1dlan.mydomain.com:/glusterfs_bricks/brick1/www
>>>> Brick2: vs2dlan.mydomain.com:/glusterfs_bricks/brick1/www
>>>> Options Reconfigured:
>>>> nfs.disable: on
>>>> cluster.favorite-child-policy: mtime
>>>> transport.address-family: inet
>>>>
>>>> I had some other performance options in there (increased 
>>>> cache-size, md invalidation, etc.), but stripped them out in an 
>>>> attempt to isolate the issue. Still got the problem without them.
>>>>
>>>> The volume currently contains over 1M files.
>>>>
>>>> When mounting the volume, I get (among other things) a process like this:
>>>>
>>>> /usr/sbin/glusterfs --volfile-server=localhost 
>>>> --volfile-id=/GlusterWWW /var/www
>>>>
>>>> This process begins with little memory, but then as files are 
>>>> accessed in the volume the memory increases. I set up a script that
>>>> simply reads the files in the volume one at a time (no writes). It's 
>>>> been running on and off for about 12 hours now, and the resident
>>>> memory of the above process is already at 7.5G and continues to grow 
>>>> slowly. If I stop the test script the memory stops growing,
>>>> but does not decrease. Restart the test script and the memory begins 
>>>> slowly growing again.
>>>>
>>>> This is obviously a contrived app environment. With my intended 
>>>> application load, it takes about a week or so for the memory to get
>>>> high enough to invoke the OOM killer.
>>>
>>> Can you try debugging with the statedump 
>>> (https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/#read-a-statedump) 
>>> of
>>> the fuse mount process and see what member is leaking? Take the 
>>> statedumps in succession, maybe once initially during the I/O and
>>> once more when the memory gets high enough to hit the OOM mark.
>>> Share the dumps here.
>>>
>>> Regards,
>>> Ravi
>>
>> Thanks for the reply. I noticed yesterday that an update (3.12.5) had 
>> been posted, so I went ahead and updated and repeated the test 
>> overnight. The memory usage does not appear to be growing as quickly 
>> as it was with 3.12.4, but does still appear to be growing.
>>
>> I should also mention that there is another process beyond my test app 
>> that is reading the files from the volume. Specifically, there is an 
>> rsync that runs from the second node 2-4 times an hour and reads from 
>> the GlusterWWW volume mounted on node 1. Since none of the files in 
>> that mount are changing, it doesn't actually rsync anything, but 
>> nonetheless it is running and reading the files in addition to my test 
>> script. (It's part of my intended production setup that I forgot was 
>> still running.)
>>
>> The mount process appears to be gaining memory at a rate of about 1GB 
>> every 4 hours or so. At that rate, it'll take several days before it 
>> runs the box out of memory. But I took your suggestion and made some 
>> statedumps today anyway, about 2 hours apart, 4 total so far. It looks 
>> like there may already be some actionable information. These are the 
>> only memory-usage sections where num_allocs has grown with each of the 
>> four samples:
>>
>> [mount/fuse.fuse - usage-type gf_fuse_mt_gids_t memusage]
>>  ---> num_allocs at Fri Jan 26 08:57:31 2018: 784
>>  ---> num_allocs at Fri Jan 26 10:55:50 2018: 831
>>  ---> num_allocs at Fri Jan 26 12:55:15 2018: 877
>>  ---> num_allocs at Fri Jan 26 14:58:27 2018: 908
>>
>> [mount/fuse.fuse - usage-type gf_common_mt_fd_lk_ctx_t memusage]
>>  ---> num_allocs at Fri Jan 26 08:57:31 2018: 5
>>  ---> num_allocs at Fri Jan 26 10:55:50 2018: 10
>>  ---> num_allocs at Fri Jan 26 12:55:15 2018: 15
>>  ---> num_allocs at Fri Jan 26 14:58:27 2018: 17
>>
>> [cluster/distribute.GlusterWWW-dht - usage-type gf_dht_mt_dht_layout_t 
>> memusage]
>>  ---> num_allocs at Fri Jan 26 08:57:31 2018: 24243596
>>  ---> num_allocs at Fri Jan 26 10:55:50 2018: 27902622
>>  ---> num_allocs at Fri Jan 26 12:55:15 2018: 30678066
>>  ---> num_allocs at Fri Jan 26 14:58:27 2018: 33801036
>>
>> Not sure of the best way to get you the full dumps. They're pretty 
>> big, over 1GB for all four. Also, I noticed some filepath information 
>> in there that I'd rather not share. What's the recommended next step?
> 
> I've CC'd the fuse/dht devs to see if these data types have potential 
> leaks. Could you raise a bug with the volume info and a (Dropbox?) link 
> from which we can download the dumps? You can remove/replace the 
> filepaths in them.
> 
> Regards.
> Ravi

Filed this bug with links to the tarballed statedumps:

https://bugs.centos.org/view.php?id=14428

Since my testing platform is CentOS, I wasn't sure whether it would be 
appropriate to report it in the Red Hat Bugzilla.

I found in further testing this weekend that I could reproduce the 
problem easily with this:

while true; do /usr/bin/ls -R /var/www > /dev/null 2>&1; sleep 10; done &
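
For anyone wanting to collect similar dumps while that loop runs: 
sending SIGUSR1 to the glusterfs mount process makes it write a 
statedump (into /var/run/gluster by default, if I read the docs 
right). A rough sketch of the idea, nothing more; the pgrep pattern 
just matches the mount command shown earlier and the two-hour 
interval is only an example:

# trigger a statedump of the GlusterWWW fuse mount every two hours
MNTPID=$(pgrep -f 'glusterfs --volfile-server=localhost --volfile-id=/GlusterWWW')
while true; do
    kill -USR1 "$MNTPID"    # dump is written to /var/run/gluster by default
    sleep 7200
done &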

If anyone wants the statedumps directly, they are here. The first 
(600MB+) has all 18 dumps, while the second (110MB+) has only three 
(start, 18 hours, and 36 hours). The file paths are obfuscated.

https://drive.google.com/file/d/1e9UfkHwWZMhHPC480GYLr0gERV985134/view?usp=sharing
https://drive.google.com/file/d/1HhJ-FTamd44XKZ5NDxaE4IjVtSOUFO83/view?usp=sharing

Reviewing the num_allocs, the gf_dht_mt_dht_layout_t entry still appears 
to be the biggest standout:

[cluster/distribute.GlusterWWW-dht - usage-type gf_dht_mt_dht_layout_t 
memusage]
  ---> num_allocs at Sat Jan 27 20:56:01 2018: 371847
  ---> num_allocs at Sat Jan 27 22:56:01 2018: 6480129
  ---> num_allocs at Sun Jan 28 00:56:01 2018: 13036666
  ---> num_allocs at Sun Jan 28 02:56:01 2018: 19623082
  ---> num_allocs at Sun Jan 28 04:56:01 2018: 26221450
  ---> num_allocs at Sun Jan 28 06:56:01 2018: 32760766
  ---> num_allocs at Sun Jan 28 08:56:01 2018: 39312051
  ---> num_allocs at Sun Jan 28 10:56:01 2018: 45881896
  ---> num_allocs at Sun Jan 28 12:56:01 2018: 52456557
  ---> num_allocs at Sun Jan 28 14:56:01 2018: 59017422
  ---> num_allocs at Sun Jan 28 16:56:01 2018: 65552767
  ---> num_allocs at Sun Jan 28 18:56:01 2018: 72061290
  ---> num_allocs at Sun Jan 28 20:56:01 2018: 78743296
  ---> num_allocs at Sun Jan 28 22:56:01 2018: 85297985
  ---> num_allocs at Mon Jan 29 00:56:01 2018: 91800496
  ---> num_allocs at Mon Jan 29 02:56:01 2018: 98306935
  ---> num_allocs at Mon Jan 29 04:56:01 2018: 104802985
  ---> num_allocs at Mon Jan 29 06:56:01 2018: 111385407
  ---> num_allocs at Mon Jan 29 08:56:01 2018: 117901593
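
If anyone wants to pull the same numbers out of the dumps themselves, 
something along these lines works (just a sketch; it assumes the dumps 
use the default glusterdump.<pid>.dump.<timestamp> naming and that you 
run it from the directory holding them):

for f in glusterdump.*.dump.*; do
    printf '%s: ' "$f"
    # pull num_allocs from the gf_dht_mt_dht_layout_t memusage section
    grep -A5 'usage-type gf_dht_mt_dht_layout_t memusage' "$f" \
        | awk -F= '/^num_allocs=/ {print $2; exit}'
done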

Let me know if I can provide anything further. The test is actually 
still running, with the mount process's RSS currently at 25GB+.
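
In case it's useful to anyone watching along, the RSS figure is easy 
to track over time with something like this (again just a sketch; the 
pgrep pattern is the same as above and the log path is arbitrary):

# log the mount process's resident memory (RSS in KB) once a minute
MNTPID=$(pgrep -f 'glusterfs --volfile-server=localhost --volfile-id=/GlusterWWW')
while true; do
    echo "$(date '+%F %T') $(ps -o rss= -p "$MNTPID")" >> /var/tmp/glusterwww-rss.log
    sleep 60
done &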

Cheers!

Dan

> 
>>
>> Cheers!
>>
>> Dan
>>
>>>>
>>>> Is there potentially something misconfigured here?
>>>>
>>>> I did see a reference to a memory leak in another thread in this 
>>>> list, but that had to do with the setting of quotas, I don't have
>>>> any quotas set on my system.
>>>>
>>>> Thanks,
>>>>
>>>> Dan Ragle
>>>> daniel at Biblestuph.com
>>>>
>>>
> 

