[Gluster-users] GlusterFS 9.5 fuse mount excessive memory usage

Zakhar Kirpichenko zakhar at gmail.com
Tue Feb 8 05:14:00 UTC 2022


Hi,

I've updated the github issue with more details:
https://github.com/gluster/glusterfs/issues/3206#issuecomment-1030770617

Looks like there's a memory leak.

/Z

On Sat, Feb 5, 2022 at 8:45 PM Zakhar Kirpichenko <zakhar at gmail.com> wrote:

> Hi Strahil,
>
> Many thanks for your reply! I've updated the Github issue with statedump
> files taken before and after the tar operation:
> https://github.com/gluster/glusterfs/files/8008635/glusterdump.19102.dump.zip
>
> Please disregard that the path= entries are empty; the original dumps
> contain real paths, but I deleted them as they might contain sensitive
> information.
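>
> Blanking those lines can be done with a simple sed pass over the extracted
> dump before sharing it, e.g. something like this (the file name is only
> assumed from the archive name above):
>
> sed 's/^path=.*/path=/' glusterdump.19102.dump > glusterdump.19102.dump.redacted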
>
> The odd thing is that the dump file is full of:
>
> 1) xlator.performance.write-behind.wb_inode entries, but the tar operation
> does not write to these files. The whole backup process is read-only.
>
> 2) xlator.performance.quick-read.inodectx entries, which never go away.
>
> None of this happens on other clients, which read from and write to the
> same volume much more intensively.
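>
> For anyone looking at the dumps, a rough idea of how many of these entries
> accumulate can be had by simply counting them, e.g. (file name assumed from
> the archive name above):
>
> grep -c 'xlator.performance.write-behind.wb_inode' glusterdump.19102.dump
> grep -c 'xlator.performance.quick-read.inodectx' glusterdump.19102.dump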
>
> Best regards,
> Z
>
> On Sat, Feb 5, 2022 at 11:23 AM Strahil Nikolov <hunter86_bg at yahoo.com>
> wrote:
>
>> Can you generate a statedump before and after the tar ?
>> For statedump generation , you can follow
>> https://github.com/gluster/glusterfs/issues/1440#issuecomment-674051243 .
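>>
>> In short, for a fuse client it is roughly the following (paths may differ
>> on your system, and the pid placeholder is of course just illustrative):
>>
>> # statedumps are written to the statedump directory, usually /var/run/gluster
>> gluster --print-statedumpdir
>> # find the glusterfs fuse client process for the mount and send it SIGUSR1,
>> # which makes it write a statedump
>> pgrep -af glusterfs
>> kill -USR1 <pid-of-the-fuse-mount-process>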
>>
>> Best Regards,
>> Strahil Nikolov
>>
>>
>> On Saturday, 5 February 2022 at 07:54:22 GMT+2, Zakhar Kirpichenko <
>> zakhar at gmail.com> wrote:
>>
>>
>> Hi!
>>
>> I opened a GitHub issue, https://github.com/gluster/glusterfs/issues/3206,
>> but I'm not sure how much attention issues get there, so I'm re-posting here
>> in case someone has any ideas.
>>
>> Description of problem:
>>
>> GlusterFS 9.5, 3-node cluster (2 bricks + arbiter): an attempt to tar the
>> whole filesystem (35-40 GB, 1.6 million files) on a client succeeds, but
>> causes the glusterfs fuse mount process to consume 0.5+ GB of RAM, and the
>> usage never goes down after tar exits.
>>
>> The exact command to reproduce the issue:
>>
>> /usr/bin/tar --use-compress-program="/bin/pigz" -cf
>> /path/to/archive.tar.gz --warning=no-file-changed /glusterfsmount
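>>
>> The growth can be observed by comparing the RSS of the fuse client process
>> before and after the run, for example (this assumes the fuse mount is the
>> only glusterfs process on the client):
>>
>> # RSS in KiB of the glusterfs fuse client, before and after the tar run
>> ps -o rss= -C glusterfs
>> /usr/bin/tar --use-compress-program="/bin/pigz" -cf /path/to/archive.tar.gz --warning=no-file-changed /glusterfsmount
>> ps -o rss= -C glusterfs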
>>
>> The output of the gluster volume info command:
>>
>> Volume Name: gvol1
>> Type: Replicate
>> Volume ID: 0292ac43-89bd-45a4-b91d-799b49613e60
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (2 + 1) = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: 192.168.0.31:/gluster/brick1/gvol1
>> Brick2: 192.168.0.32:/gluster/brick1/gvol1
>> Brick3: 192.168.0.5:/gluster/brick1/gvol1 (arbiter)
>> Options Reconfigured:
>> performance.open-behind: off
>> cluster.readdir-optimize: off
>> cluster.consistent-metadata: on
>> features.cache-invalidation: on
>> diagnostics.count-fop-hits: on
>> diagnostics.latency-measurement: on
>> storage.fips-mode-rchecksum: on
>> performance.cache-size: 256MB
>> client.event-threads: 8
>> server.event-threads: 4
>> storage.reserve: 1
>> performance.cache-invalidation: on
>> cluster.lookup-optimize: on
>> transport.address-family: inet
>> nfs.disable: on
>> performance.client-io-threads: on
>> features.cache-invalidation-timeout: 600
>> performance.md-cache-timeout: 600
>> network.inode-lru-limit: 50000
>> cluster.shd-max-threads: 4
>> cluster.self-heal-window-size: 8
>> performance.enable-least-priority: off
>> performance.cache-max-file-size: 2MB
>>
>> The output of the gluster volume status command:
>>
>> Status of volume: gvol1
>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Brick 192.168.0.31:/gluster/brick1/gvol1    49152     0          Y       1767
>> Brick 192.168.0.32:/gluster/brick1/gvol1    49152     0          Y       1696
>> Brick 192.168.0.5:/gluster/brick1/gvol1     49152     0          Y       1318
>> Self-heal Daemon on localhost               N/A       N/A        Y       1329
>> Self-heal Daemon on 192.168.0.31            N/A       N/A        Y       1778
>> Self-heal Daemon on 192.168.0.32            N/A       N/A        Y       1707
>>
>> Task Status of Volume gvol1
>>
>> ------------------------------------------------------------------------------
>> There are no active volume tasks
>>
>> The output of the gluster volume heal info command:
>>
>> Brick 192.168.0.31:/gluster/brick1/gvol1
>> Status: Connected
>> Number of entries: 0
>>
>> Brick 192.168.0.32:/gluster/brick1/gvol1
>> Status: Connected
>> Number of entries: 0
>>
>> Brick 192.168.0.5:/gluster/brick1/gvol1
>> Status: Connected
>> Number of entries: 0
>>
>> The operating system / glusterfs version:
>>
>> CentOS Linux release 7.9.2009 (Core), fully up to date
>> glusterfs 9.5
>> kernel 3.10.0-1160.53.1.el7.x86_64
>>
>> The logs are basically empty since the last mount except for the
>> mount-related messages.
>>
>> Additional info: a statedump from the client is attached to the Github
>> issue,
>> https://github.com/gluster/glusterfs/files/8004792/glusterdump.18906.dump.1643991007.gz,
>> in case someone wants to have a look.
>>
>> There was also an issue with other clients running PHP applications with
>> lots of small files: the glusterfs fuse mount process would quickly balloon
>> to ~2 GB over the course of 24 hours and its performance would slow to a
>> crawl. This happened very consistently with glusterfs 8.x and 9.5. I managed
>> to resolve it at least partially by disabling performance.open-behind: the
>> memory usage either remains consistent or increases at a much slower rate,
>> which is acceptable for this use case.
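>>
>> For reference, disabling that option is just the usual volume set, which is
>> also reflected in the volume info above:
>>
>> gluster volume set gvol1 performance.open-behind off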
>>
>> Now the issue remains on this single client, which doesn't do much other
>> than reading and archiving all files from the gluster volume once per day.
>> The glusterfs fuse mount process balloons to 0.5+ GB during the first tar
>> run and remains more or less consistent afterwards, including subsequent
>> tar runs.
>>
>> I would very much appreciate any advice or suggestions.
>>
>> Best regards,
>> Zakhar
>