[Bugs] [Bug 1593884] glusterfs-fuse 3.12.9/10 high memory consumption

bugzilla at redhat.com bugzilla at redhat.com
Mon Jun 25 20:35:14 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1593884

Marcus Calverley <marcus at calverley.dk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marcus at calverley.dk



--- Comment #2 from Marcus Calverley <marcus at calverley.dk> ---
I have the same issue with steadily increasing client memory usage on 3.12.9
under read/write load. I can add that I'm using GlusterFS as oVirt VM storage,
so I tried switching oVirt from the fuse mounts to libgfapi, but that just
moved the memory leak into the individual qemu processes instead of it being
centralised in the fuse mount processes. That also confirmed that it is mostly
database servers that trigger the issue (the oVirt hosted engine, plus some
other PostgreSQL servers I run that see a lot of traffic).

I'm currently working around the issue by manually migrating VMs between
servers in the cluster, but it's annoying to have to do that twice a day to
keep the hosts from reaching 100% memory usage and VMs getting killed. I'm
surprised there hasn't been more discussion of this issue; is it some setting
we have that's causing it?
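For anyone trying to narrow this down, a minimal sketch of how one might watch the fuse client's memory and ask it for a statedump. The pgrep pattern and the volume name "engine" are assumptions for this setup; glusterfs writes statedumps on SIGUSR1 (by default under /var/run/gluster/), which lists per-xlator allocation counts and can show which translator is growing:

```shell
#!/bin/sh
# Sketch, untested against this cluster: report the fuse client's RSS
# and trigger a statedump so the leaking xlator can be identified.

# RSS in kB of a given pid, read from /proc
rss_kb() {
    awk '/^VmRSS:/ {print $2}' "/proc/$1/status"
}

# Find the fuse client for the "engine" volume (pattern is an assumption;
# check your actual mount command line with ps)
pid=$(pgrep -f 'glusterfs.*volfile-id[=/ ]*engine' | head -n1)

if [ -n "$pid" ]; then
    echo "glusterfs pid $pid RSS: $(rss_kb "$pid") kB"
    # SIGUSR1 asks glusterfs to write a statedump (default /var/run/gluster/)
    kill -USR1 "$pid"
fi
```

Comparing two statedumps taken a few hours apart under database load should show which allocation counters keep climbing.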

The engine gluster volume configuration:
Volume Name: engine
Type: Replicate
Volume ID: 84f29251-619b-493c-ae1c-7da0fd27a8c1
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 5 = 5
Transport-type: tcp
Bricks:
Brick1: gluster1.management:/gluster/sda/engine
Brick2: gluster2.management:/gluster/sda/engine
Brick3: gluster3.management:/gluster/sda/engine
Brick4: gluster4.management:/gluster/sda/engine
Brick5: gluster5.management:/gluster/sdb/engine
Options Reconfigured:
server.allow-insecure: on
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
network.ping-timeout: 10
storage.owner-uid: 36
storage.owner-gid: 36
performance.flush-behind: on
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
nfs.disable: on
transport.address-family: inet
cluster.server-quorum-ratio: 51%

-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.

