[Gluster-users] Help analyse statedumps

Pedro Costa pedro at pmc.digital
Mon Mar 18 09:17:43 UTC 2019


Sorry to revive an old thread, but just to let you know that with the latest 5.4 version this has virtually stopped happening.

I can’t say for certain yet, but since the update the memory footprint of Gluster has been massively reduced.

Thanks to everyone, great job.


From: Pedro Costa
Sent: 04 February 2019 11:28
To: 'Sanju Rakonde' <srakonde at redhat.com>
Cc: 'gluster-users' <gluster-users at gluster.org>
Subject: RE: [Gluster-users] Help analyse statedumps

Hi Sanju,

If it helps, here’s also a statedump (taken just now) since the reboot:


Many thanks,

From: Pedro Costa
Sent: 04 February 2019 10:12
To: 'Sanju Rakonde' <srakonde at redhat.com>
Cc: gluster-users <gluster-users at gluster.org>
Subject: RE: [Gluster-users] Help analyse statedumps

Hi Sanju,

The process was `glusterfs`; yes, I took the statedump of the same process (different PID, since it was rebooted).
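
In case it helps anyone reproduce this, one way to take a statedump of the FUSE client is to send it SIGUSR1 (a sketch; the pgrep pattern is an assumption about this setup, and dumps normally land in /var/run/gluster):

    # Find the glusterfs client PID for the gvol1 mount and ask it to dump state.
    pid=$(pgrep -f 'glusterfs.*gvol1' | head -n1)
    kill -USR1 "$pid"
    ls -lt /var/run/gluster | head      # the newest glusterdump.<pid>.dump.* file is the statedump

    # Brick-side (glusterfsd) statedumps can be requested via the CLI instead:
    gluster volume statedump gvol1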


From: Sanju Rakonde <srakonde at redhat.com>
Sent: 04 February 2019 06:10
To: Pedro Costa <pedro at pmc.digital>
Cc: gluster-users <gluster-users at gluster.org>
Subject: Re: [Gluster-users] Help analyse statedumps


Can you please specify which process has the leak? Have you taken the statedump of the same process that has the leak?


On Sat, Feb 2, 2019 at 3:15 PM Pedro Costa <pedro at pmc.digital> wrote:

I have a 3x replicated cluster running 4.1.7 on Ubuntu 16.04.5; all 3 replicas are also clients hosting a Node.js/Nginx web server.

The current configuration is as follows:

Volume Name: gvol1
Type: Replicate
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Brick1: vm000000:/srv/brick1/gvol1
Brick2: vm000001:/srv/brick1/gvol1
Brick3: vm000002:/srv/brick1/gvol1
Options Reconfigured:
cluster.self-heal-readdir-size: 2KB
cluster.self-heal-window-size: 2
cluster.background-self-heal-count: 20
network.ping-timeout: 5
disperse.eager-lock: off
performance.parallel-readdir: on
performance.readdir-ahead: on
performance.rda-cache-limit: 128MB
performance.cache-refresh-timeout: 10
performance.nl-cache-timeout: 600
performance.nl-cache: on
cluster.nufa: on
performance.enable-least-priority: off
server.outstanding-rpc-limit: 128
performance.strict-o-direct: on
cluster.shd-max-threads: 12
client.event-threads: 4
cluster.lookup-optimize: on
network.inode-lru-limit: 90000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.cache-samba-metadata: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: on
features.utime: on
storage.ctime: on
server.event-threads: 4
performance.cache-size: 256MB
performance.read-ahead: on
cluster.readdir-optimize: on
cluster.strict-readdir: on
performance.io-thread-count: 8
server.allow-insecure: on
cluster.read-hash-mode: 0
cluster.lookup-unhashed: auto
cluster.choose-local: on

I believe there’s a memory leak somewhere; memory usage just keeps going up until it hangs one or more nodes, sometimes taking the whole cluster down.
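
A quick way to see which Gluster process is actually growing is to check resident memory periodically (a sketch; adjust the pattern to your environment):

    # glusterd = management daemon, glusterfsd = bricks, glusterfs = FUSE clients / self-heal daemon
    ps -eo pid,rss,etime,comm,args --sort=-rss | grep -E '[g]luster' | head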

I have taken two statedumps on one of the nodes: one where memory usage is too high, and another just after a reboot with the app running and the volume fully healed.
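
To narrow down where the memory goes, the per-translator "memusage" sections of the two dumps can be compared, for example (a sketch with placeholder file names, assuming the usual size=/num_allocs accounting lines):

    # Print the largest per-type allocations together with their owning section.
    awk '/memusage\]/ { sec = $0 }
         /^size=/     { sub(/^size=/, ""); printf "%14d  %s\n", $0, sec }' \
        high-mem.dump | sort -nr | head -20

    # Running the same against the post-reboot dump and diffing the top entries
    # usually points at the translator whose allocations keep growing.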

https://pmcdigital.sharepoint.com/:u:/g/EYDsNqTf1UdEuE6B0ZNVPfIBf_I-AbaqHotB1lJOnxLlTg?e=boYP09 (high memory)

https://pmcdigital.sharepoint.com/:u:/g/EWZBsnET2xBHl6OxO52RCfIBvQ0uIDQ1GKJZ1GrnviyMhg?e=wI3yaY  (after reboot)

Any help would be greatly appreciated,

Kindest Regards,

Pedro Maia Costa
Senior Developer, pmc.digital
