[Gluster-users] GlusterFS 9.5 fuse mount excessive memory usage
Zakhar Kirpichenko
zakhar at gmail.com
Thu Mar 31 06:57:32 UTC 2022
Hi,
Any news about this? I provided very detailed test results and evidence of the
issue in https://github.com/gluster/glusterfs/issues/3206 on 6 February 2022,
but haven't heard back since.
Best regards,
Zakhar
On Tue, Feb 8, 2022 at 7:14 AM Zakhar Kirpichenko <zakhar at gmail.com> wrote:
> Hi,
>
> I've updated the github issue with more details:
> https://github.com/gluster/glusterfs/issues/3206#issuecomment-1030770617
>
> Looks like there's a memory leak.
>
> /Z
>
> On Sat, Feb 5, 2022 at 8:45 PM Zakhar Kirpichenko <zakhar at gmail.com>
> wrote:
>
>> Hi Strahil,
>>
>> Many thanks for your reply! I've updated the GitHub issue with statedump
>> files taken before and after the tar operation:
>> https://github.com/gluster/glusterfs/files/8008635/glusterdump.19102.dump.zip
>>
>> Please disregard that the path= entries are empty: the original dumps
>> contain real paths, but I removed them because they might contain
>> sensitive information.
>>
>> The odd thing is that the dump file is full of:
>>
>> 1) xlator.performance.write-behind.wb_inode entries, even though the tar
>> operation does not write to these files; the whole backup process is
>> read-only.
>>
>> 2) xlator.performance.quick-read.inodectx entries, which never go away.
>>
>> None of this happens on other clients, which read from and write to the
>> same volume much more intensively.
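>>
>> For reference, a rough way to quantify this is to count the section headers
>> in the statedump, e.g. with something like the following (illustrative;
>> assuming the unpacked dump keeps the file name from the archive above):
>>
>> grep -c 'xlator.performance.write-behind.wb_inode' glusterdump.19102.dump
>> grep -c 'xlator.performance.quick-read.inodectx' glusterdump.19102.dump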
>>
>> Best regards,
>> Z
>>
>> On Sat, Feb 5, 2022 at 11:23 AM Strahil Nikolov <hunter86_bg at yahoo.com>
>> wrote:
>>
>>> Can you generate a statedump before and after the tar?
>>> For statedump generation, you can follow
>>> https://github.com/gluster/glusterfs/issues/1440#issuecomment-674051243.
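>>>
>>> For a fuse client, sending SIGUSR1 to the glusterfs mount process should
>>> also trigger a statedump; the file normally lands in the directory reported
>>> by 'gluster --print-statedumpdir' (usually /var/run/gluster). Roughly:
>>>
>>> pidof glusterfs           # pick the pid of the fuse mount process
>>> kill -USR1 <pid>          # ask that process to write a statedump
>>> ls /var/run/gluster/      # look for the new glusterdump.<pid>.dump.* file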
>>>
>>> Best Regards,
>>> Strahil Nikolov
>>>
>>>
>>> On Saturday, 5 February 2022 at 07:54:22 GMT+2, Zakhar Kirpichenko <
>>> zakhar at gmail.com> wrote:
>>>
>>>
>>> Hi!
>>>
>>> I opened a GitHub issue, https://github.com/gluster/glusterfs/issues/3206,
>>> but I'm not sure how much attention issues get there, so I'm re-posting here
>>> just in case someone has any ideas.
>>>
>>> Description of problem:
>>>
>>> GlusterFS 9.5, 3-node cluster (2 bricks + arbiter): an attempt to tar the
>>> whole filesystem (35-40 GB, 1.6 million files) on a client succeeds, but it
>>> causes the glusterfs fuse mount process to consume 0.5+ GB of RAM, and the
>>> usage never goes down after tar exits.
>>>
>>> The exact command to reproduce the issue:
>>>
>>> /usr/bin/tar --use-compress-program="/bin/pigz" -cf
>>> /path/to/archive.tar.gz --warning=no-file-changed /glusterfsmount
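>>>
>>> To observe the effect, I compare the fuse mount process memory before and
>>> after the run with something along these lines (illustrative; any RSS
>>> readout works):
>>>
>>> ps -o pid,rss,vsz,cmd -C glusterfs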
>>>
>>> The output of the gluster volume info command:
>>>
>>> Volume Name: gvol1
>>> Type: Replicate
>>> Volume ID: 0292ac43-89bd-45a4-b91d-799b49613e60
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x (2 + 1) = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: 192.168.0.31:/gluster/brick1/gvol1
>>> Brick2: 192.168.0.32:/gluster/brick1/gvol1
>>> Brick3: 192.168.0.5:/gluster/brick1/gvol1 (arbiter)
>>> Options Reconfigured:
>>> performance.open-behind: off
>>> cluster.readdir-optimize: off
>>> cluster.consistent-metadata: on
>>> features.cache-invalidation: on
>>> diagnostics.count-fop-hits: on
>>> diagnostics.latency-measurement: on
>>> storage.fips-mode-rchecksum: on
>>> performance.cache-size: 256MB
>>> client.event-threads: 8
>>> server.event-threads: 4
>>> storage.reserve: 1
>>> performance.cache-invalidation: on
>>> cluster.lookup-optimize: on
>>> transport.address-family: inet
>>> nfs.disable: on
>>> performance.client-io-threads: on
>>> features.cache-invalidation-timeout: 600
>>> performance.md-cache-timeout: 600
>>> network.inode-lru-limit: 50000
>>> cluster.shd-max-threads: 4
>>> cluster.self-heal-window-size: 8
>>> performance.enable-least-priority: off
>>> performance.cache-max-file-size: 2MB
>>>
>>> The output of the gluster volume status command:
>>>
>>> Status of volume: gvol1
>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>> ------------------------------------------------------------------------------
>>> Brick 192.168.0.31:/gluster/brick1/gvol1    49152     0          Y       1767
>>> Brick 192.168.0.32:/gluster/brick1/gvol1    49152     0          Y       1696
>>> Brick 192.168.0.5:/gluster/brick1/gvol1     49152     0          Y       1318
>>> Self-heal Daemon on localhost               N/A       N/A        Y       1329
>>> Self-heal Daemon on 192.168.0.31            N/A       N/A        Y       1778
>>> Self-heal Daemon on 192.168.0.32            N/A       N/A        Y       1707
>>>
>>> Task Status of Volume gvol1
>>>
>>> ------------------------------------------------------------------------------
>>> There are no active volume tasks
>>>
>>> The output of the gluster volume heal command:
>>>
>>> Brick 192.168.0.31:/gluster/brick1/gvol1
>>> Status: Connected
>>> Number of entries: 0
>>>
>>> Brick 192.168.0.32:/gluster/brick1/gvol1
>>> Status: Connected
>>> Number of entries: 0
>>>
>>> Brick 192.168.0.5:/gluster/brick1/gvol1
>>> Status: Connected
>>> Number of entries: 0
>>>
>>> The operating system / glusterfs version:
>>>
>>> CentOS Linux release 7.9.2009 (Core), fully up to date
>>> glusterfs 9.5
>>> kernel 3.10.0-1160.53.1.el7.x86_64
>>>
>>> The logs are basically empty since the last mount except for the
>>> mount-related messages.
>>>
>>> Additional info: a statedump from the client is attached to the GitHub
>>> issue,
>>> https://github.com/gluster/glusterfs/files/8004792/glusterdump.18906.dump.1643991007.gz,
>>> in case someone wants to have a look.
>>>
>>> There was also an issue with other clients, running PHP applications with
>>> lots of small files, where the glusterfs fuse mount process would balloon to
>>> ~2 GB over the course of 24 hours and its performance would slow to a crawl.
>>> This happened very consistently with glusterfs 8.x and 9.5. I managed to
>>> resolve it, at least partially, by disabling performance.open-behind: the
>>> memory usage now either remains constant or grows at a much slower rate,
>>> which is acceptable for this use case.
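>>>
>>> For reference, disabling it amounts to the standard volume-set command, run
>>> against the volume rather than on the client:
>>>
>>> gluster volume set gvol1 performance.open-behind off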
>>>
>>> Now the issue remains only on this single client, which doesn't do much
>>> other than reading and archiving all files from the gluster volume once per
>>> day. The glusterfs fuse mount process balloons to 0.5+ GB during the first
>>> tar run and stays at roughly that level afterwards, including during
>>> subsequent tar runs.
>>>
>>> I would very much appreciate any advice or suggestions.
>>>
>>> Best regards,
>>> Zakhar
>>