[Gluster-users] Run away memory with gluster mount

Dan Ragle daniel at Biblestuph.com
Mon Jan 29 15:19:32 UTC 2018



On 1/29/2018 12:19 AM, Nithya Balachandran wrote:
> Csaba,
> 
> Could this be the problem of the inodes not getting freed in the fuse
> process?
> 
> Daniel,
> as Ravi requested, please provide access to the statedumps. You can strip
> out the filepath information.

Working on filing a bug report and getting you the dumps now. Will 
update soon.
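
In case it's useful to anyone following along, a quick sed pass over the
dumps before uploading should be enough to strip the path info, something
like this (assuming the paths only show up on path= lines; adjust if they
appear elsewhere):

    # redact filesystem paths from the statedumps before sharing
    sed -i 's|^path=.*|path=REDACTED|' glusterdump.*.dump.*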

> Does your data set include a lot of directories?

The volume in question has 1M+ files and 77k+ directories.

Cheers!

Dan

> 
> 
> Thanks,
> Nithya
> 
> On 27 January 2018 at 10:23, Ravishankar N <ravishankar at redhat.com> wrote:
> 
>>
>>
>> On 01/27/2018 02:29 AM, Dan Ragle wrote:
>>
>>>
>>> On 1/25/2018 8:21 PM, Ravishankar N wrote:
>>>
>>>>
>>>>
>>>> On 01/25/2018 11:04 PM, Dan Ragle wrote:
>>>>
>>>>> *sigh* trying again to correct formatting ... apologize for the earlier
>>>>> mess.
>>>>>
>>>>> Having a memory issue with Gluster 3.12.4 and not sure how to
>>>>> troubleshoot. I don't *think* this is expected behavior.
>>>>>
>>>>> This is on an updated CentOS 7 box. The setup is a simple two node
>>>>> replicated layout where the two nodes act as both server and
>>>>> client.
>>>>>
>>>>> The volume in question:
>>>>>
>>>>> Volume Name: GlusterWWW
>>>>> Type: Replicate
>>>>> Volume ID: 8e9b0e79-f309-4d9b-a5bb-45d065faaaa3
>>>>> Status: Started
>>>>> Snapshot Count: 0
>>>>> Number of Bricks: 1 x 2 = 2
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: vs1dlan.mydomain.com:/glusterfs_bricks/brick1/www
>>>>> Brick2: vs2dlan.mydomain.com:/glusterfs_bricks/brick1/www
>>>>> Options Reconfigured:
>>>>> nfs.disable: on
>>>>> cluster.favorite-child-policy: mtime
>>>>> transport.address-family: inet
>>>>>
>>>>> I had some other performance options in there (increased cache-size,
>>>>> md invalidation, etc.) but stripped them out in an attempt to
>>>>> isolate the issue. Still got the problem without them.
>>>>>
>>>>> The volume currently contains over 1M files.
>>>>>
>>>>> When mounting the volume, I get (among other things) a process as such:
>>>>>
>>>>> /usr/sbin/glusterfs --volfile-server=localhost --volfile-id=/GlusterWWW
>>>>> /var/www
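>>>>>
>>>>> For reference, that's the client process the mount helper spawns for a
>>>>> mount roughly like this (typed from memory, not copied from the box):
>>>>>
>>>>>   mount -t glusterfs localhost:/GlusterWWW /var/www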
>>>>>
>>>>> This process begins with little memory, but then as files are accessed
>>>>> in the volume the memory increases. I set up a script that
>>>>> simply reads the files in the volume one at a time (no writes). It's
>>>>> been running on and off about 12 hours now and the resident
>>>>> memory of the above process is already at 7.5G and continues to grow
>>>>> slowly. If I stop the test script the memory stops growing,
>>>>> but does not reduce. Restart the test script and the memory begins
>>>>> slowly growing again.
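>>>>>
>>>>> The read loop is nothing fancy; it's roughly equivalent to this
>>>>> (paraphrased sketch, not the exact script):
>>>>>
>>>>>   # walk the mounted volume and read every file, discarding the data
>>>>>   find /var/www -type f -print0 | xargs -0 cat > /dev/null
>>>>>
>>>>>   # watch the fuse client's resident memory while the loop runs
>>>>>   # (assumes this is the only glusterfs client process for the volume)
>>>>>   grep VmRSS /proc/$(pgrep -f 'volfile-id=/GlusterWWW')/status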
>>>>>
>>>>> This is obviously a contrived app environment. With my intended
>>>>> application load it takes about a week or so for the memory to get
>>>>> high enough to invoke the oom killer.
>>>>>
>>>>
>>>> Can you try debugging with the statedump
>>>> (https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/#read-a-statedump)
>>>> of the fuse mount process and see which member is leaking? Take the
>>>> statedumps in succession, maybe once initially during the I/O and once
>>>> more when the memory gets high enough to approach the OOM mark.
>>>> Share the dumps here.
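>>>>
>>>> For the fuse mount the dump is triggered by sending SIGUSR1 to the
>>>> client process; something along these lines should do it (the dumps
>>>> usually land in /var/run/gluster, named glusterdump.<pid>.dump.<time>):
>>>>
>>>>   # ask the fuse client for this mount to write a statedump
>>>>   kill -USR1 $(pgrep -f 'volfile-id=/GlusterWWW')
>>>>   ls -ltr /var/run/gluster/glusterdump.*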
>>>>
>>>> Regards,
>>>> Ravi
>>>>
>>>
>>> Thanks for the reply. I noticed yesterday that an update (3.12.5) had
>>> been posted so I went ahead and updated and repeated the test overnight.
>>> The memory usage does not appear to be growing as quickly as it was with
>>> 3.12.4, but does still appear to be growing.
>>>
>>> I should also mention that there is another process beyond my test app
>>> that is reading the files from the volume. Specifically, there is an rsync
>>> that runs from the second node 2-4 times an hour that reads from the
>>> GlusterWWW volume mounted on node 1. Since none of the files in that mount
>>> are changing it doesn't actually rsync anything, but nonetheless it is
>>> running and reading the files in addition to my test script. (It's a part
>>> of my intended production setup that I forgot was still running.)
>>>
>>> The mount process appears to be gaining memory at a rate of about 1GB
>>> every 4 hours or so. At that rate it'll take several days before it runs
>>> the box out of memory. But I took your suggestion and made some statedumps
>>> today anyway, about 2 hours apart, 4 total so far. It looks like there may
>>> already be some actionable information. These are the only sections where
>>> num_allocs has grown with each of the four samples:
>>>
>>> [mount/fuse.fuse - usage-type gf_fuse_mt_gids_t memusage]
>>>   ---> num_allocs at Fri Jan 26 08:57:31 2018: 784
>>>   ---> num_allocs at Fri Jan 26 10:55:50 2018: 831
>>>   ---> num_allocs at Fri Jan 26 12:55:15 2018: 877
>>>   ---> num_allocs at Fri Jan 26 14:58:27 2018: 908
>>>
>>> [mount/fuse.fuse - usage-type gf_common_mt_fd_lk_ctx_t memusage]
>>>   ---> num_allocs at Fri Jan 26 08:57:31 2018: 5
>>>   ---> num_allocs at Fri Jan 26 10:55:50 2018: 10
>>>   ---> num_allocs at Fri Jan 26 12:55:15 2018: 15
>>>   ---> num_allocs at Fri Jan 26 14:58:27 2018: 17
>>>
>>> [cluster/distribute.GlusterWWW-dht - usage-type gf_dht_mt_dht_layout_t
>>> memusage]
>>>   ---> num_allocs at Fri Jan 26 08:57:31 2018: 24243596
>>>   ---> num_allocs at Fri Jan 26 10:55:50 2018: 27902622
>>>   ---> num_allocs at Fri Jan 26 12:55:15 2018: 30678066
>>>   ---> num_allocs at Fri Jan 26 14:58:27 2018: 33801036
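>>>
>>> (For anyone who wants to repeat the check, a grep along these lines pulls
>>> the same counters out of successive dumps, assuming they're in the default
>>> /var/run/gluster location:
>>>
>>>   grep -A2 'usage-type gf_dht_mt_dht_layout_t memusage' \
>>>       /var/run/gluster/glusterdump.*.dump.* | grep num_allocs
>>>
>>> and likewise for the other usage-types above.)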
>>>
>>> Not sure the best way to get you the full dumps. They're pretty big, over
>>> 1G for all four. Also, I noticed some filepath information in there that
>>> I'd rather not share. What's the recommended next step?
>>>
>>
>> I've CC'd the fuse/dht devs to see if these data types have potential
>> leaks. Could you raise a bug with the volume info and a (dropbox?) link
>> from which we can download the dumps? You can remove/replace the filepaths
>> from them.
>>
>> Regards,
>> Ravi
>>
>>
>>
>>> Cheers!
>>>
>>> Dan
>>>
>>>
>>>>> Is there potentially something misconfigured here?
>>>>>
>>>>> I did see a reference to a memory leak in another thread in this list,
>>>>> but that had to do with the setting of quotas, I don't have
>>>>> any quotas set on my system.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Dan Ragle
>>>>> daniel at Biblestuph.com
>>>>>
>>>
>>
>>
> 


More information about the Gluster-users mailing list