[Gluster-users] Run away memory with gluster mount

Ravishankar N ravishankar at redhat.com
Fri Jan 26 01:21:11 UTC 2018



On 01/25/2018 11:04 PM, Dan Ragle wrote:
> *sigh* trying again to correct formatting ... apologize for the 
> earlier mess.
>
> Having a memory issue with Gluster 3.12.4 and not sure how to 
> troubleshoot. I don't *think* this is expected behavior.
>
> This is on an updated CentOS 7 box. The setup is a simple two node 
> replicated layout where the two nodes act as both server and client.
>
> The volume in question:
>
> Volume Name: GlusterWWW
> Type: Replicate
> Volume ID: 8e9b0e79-f309-4d9b-a5bb-45d065faaaa3
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: vs1dlan.mydomain.com:/glusterfs_bricks/brick1/www
> Brick2: vs2dlan.mydomain.com:/glusterfs_bricks/brick1/www
> Options Reconfigured:
> nfs.disable: on
> cluster.favorite-child-policy: mtime
> transport.address-family: inet
>
> I had some other performance options in there (increased cache-size, 
> md-cache invalidation, etc.), but stripped them out in an attempt to 
> isolate the issue. The problem persists without them.
>
> The volume currently contains over 1M files.
>
> When mounting the volume, I get (among other things) a process as such:
>
> /usr/sbin/glusterfs --volfile-server=localhost 
> --volfile-id=/GlusterWWW /var/www
>
> This process begins with little memory, but as files in the volume are 
> accessed its memory use increases. I set up a script that simply reads 
> the files in the volume one at a time (no writes). It's been running 
> on and off for about 12 hours now, and the resident memory of the above 
> process is already at 7.5G and continues to grow slowly. If I stop the 
> test script the memory stops growing, but does not reduce. If I restart 
> the test script, the memory begins slowly growing again.
>
> This is obviously a contrived app environment. With my intended 
> application load, it takes about a week for the memory to grow high 
> enough to invoke the OOM killer.

Can you try debugging with a statedump 
(https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/#read-a-statedump) 
of the fuse mount process to see which member is leaking? Take the 
statedumps in succession: one early on while the I/O is running, and 
another once the memory has grown high enough to approach the OOM mark. 
Then share the dumps here.
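
For reference, a statedump of the fuse client can be triggered by sending 
SIGUSR1 to the glusterfs process; the dump is written under /var/run/gluster 
by default. A rough sketch of capturing and ranking the memusage sections 
follows — the pgrep pattern, file names, and the sample excerpt are 
illustrative, not taken from your setup:

```shell
# Trigger a statedump of the fuse mount process (assumption: default
# statedump directory /var/run/gluster).
# pid=$(pgrep -f 'glusterfs --volfile-server=localhost')
# kill -SIGUSR1 "$pid"

# Each memusage section in the resulting dump looks roughly like this
# (illustrative excerpt with made-up numbers):
cat > /tmp/sample.dump <<'EOF'
[mount/fuse.fuse - usage-type gf_common_mt_char memusage]
size=2048
num_allocs=120
[cluster/replicate.GlusterWWW-replicate-0 - usage-type gf_afr_mt_char memusage]
size=98304
num_allocs=4500
EOF

# Rank usage types by live allocation count; a member whose num_allocs
# keeps climbing between successive dumps is the likely leak.
awk -F= '/usage-type/ {type=$0} /^num_allocs=/ {print $2, type}' \
    /tmp/sample.dump | sort -rn
```

Taking one dump early in the read test and another after a few hours, then 
diffing the ranked output, should point at the growing member.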

Regards,
Ravi
>
> Is there potentially something misconfigured here?
>
> I did see a reference to a memory leak in another thread in this list, 
> but that had to do with the setting of quotas, I don't have any quotas 
> set on my system.
>
> Thanks,
>
> Dan Ragle
> daniel at Biblestuph.com
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users


