[Gluster-users] poor performance
Jaco Kroon
jaco at uls.co.za
Wed Dec 14 11:28:02 UTC 2022
Hi All,
We've got a glusterfs cluster that houses some PHP web sites.
This is generally considered a bad idea and we can see why.
With performance.nl-cache on, performance actually turns out to be very
reasonable. With it turned off, performance is roughly 5x worse: a
request that takes under 500ms with nl-cache takes about 2500ms without.
Some cases are far, far worse, e.g. ~1500ms with nl-cache versus ~30s
without (20x worse).
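A rough way to put numbers on this from the client side, offered as a sketch rather than a proper benchmark. It assumes that much of the gap comes from repeated lookups of files that do not exist, which is what nl-cache (the negative-lookup cache) is meant to absorb. The function name and the probe file names are made up for illustration.

```python
import os
import time


def time_negative_lookups(directory, names=50, rounds=20):
    """Stat `names` non-existent files `rounds` times each.

    Returns (elapsed_seconds, misses).
    """
    # Hypothetical file names; they only need to be absent from `directory`.
    paths = [
        os.path.join(directory, f"nl-probe-missing-{i}.php")
        for i in range(names)
    ]
    misses = 0
    start = time.perf_counter()
    for _ in range(rounds):
        for path in paths:
            try:
                os.stat(path)
            except FileNotFoundError:
                misses += 1  # lookup answered with ENOENT
    return time.perf_counter() - start, misses
```

Running it against the same directory on the mount with performance.nl-cache on and then off should give a figure to set beside the request timings above.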
So why not use nl-cache? Well, it results in readdir reporting files
which then fail to open with ENOENT. The cache also never clears, even
though the configuration says nl-cache entries should only be cached for
60s. Even with "ls -lah" in affected folders you'll see entries showing
???? in place of the file attributes. If this recovered in a reasonable
time (say, a few seconds) that would be fine, but it does not.
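The symptom can be checked for mechanically. Below is a minimal sketch, assuming the problem shows up as a name returned by readdir that then fails lstat with ENOENT; the two helper names are hypothetical.

```python
import os


def list_names(directory):
    """Return the entry names that readdir reports for `directory`."""
    return sorted(os.listdir(directory))


def find_stale_names(directory, names):
    """Return the names from `names` that fail lstat with ENOENT."""
    stale = []
    for name in names:
        try:
            os.lstat(os.path.join(directory, name))
        except FileNotFoundError:
            # The name was listed, but the lookup failed with ENOENT.
            stale.append(name)
    return stale
```

Calling find_stale_names(d, list_names(d)) on an affected folder, and again a minute or so later, would show whether the entries ever recover.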
# gluster volume info
Type: Replicate
Volume ID: cbe08331-8b83-41ac-b56d-88ef30c0f5c7
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Options Reconfigured:
performance.nl-cache: on
cluster.readdir-optimize: on
config.client-threads: 2
config.brick-threads: 4
config.global-threading: on
performance.iot-pass-through: on
storage.fips-mode-rchecksum: on
cluster.granular-entry-heal: enable
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
client.event-threads: 2
server.event-threads: 2
transport.address-family: inet
nfs.disable: on
cluster.metadata-self-heal: off
cluster.entry-self-heal: off
cluster.data-self-heal: off
cluster.self-heal-daemon: on
server.allow-insecure: on
features.ctime: off
performance.io-cache: on
performance.cache-invalidation: on
features.cache-invalidation: on
performance.qr-cache-timeout: 600
features.cache-invalidation-timeout: 600
performance.io-cache-size: 128MB
performance.cache-size: 128MB
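Several caches with different settings are in play above, so it can help to pull the cache-related options out of the listing and view them side by side. This is a small sketch that parses text in the same "key: value" layout as the output above; the function names are hypothetical and nothing here talks to gluster itself.

```python
def parse_reconfigured_options(text):
    """Parse 'key: value' lines after 'Options Reconfigured:' into a dict."""
    options = {}
    in_options = False
    for line in text.splitlines():
        line = line.strip()
        if line == "Options Reconfigured:":
            in_options = True
            continue
        if in_options and ": " in line:
            key, value = line.split(": ", 1)
            options[key] = value
    return options


def cache_related(options):
    """Return only the options whose name mentions 'cache', sorted by name."""
    return {key: options[key] for key in sorted(options) if "cache" in key}
```

Fed the listing above, cache_related() would return the nl-cache, io-cache, qr-cache and cache-invalidation settings together, including the two 600 second timeouts.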
Are there any other recommendations, short of abandoning all hope of
redundancy and reverting to a single-server setup (for the web code at
least)? Currently the cost of the redundancy seems to outweigh the benefit.
GlusterFS version 10.2, with a patch for --inode-table-size. Mounts
happen with:
/usr/sbin/glusterfs --acl --reader-thread-count=2 --lru-limit=524288
--inode-table-size=524288 --invalidate-limit=16 --background-qlen=32
--fuse-mountopts=nodev,nosuid,noexec,noatime --process-name fuse
--volfile-server=127.0.0.1 --volfile-id=gv_home
--fuse-mountopts=nodev,nosuid,noexec,noatime /home
Kind Regards,
Jaco