[Gluster-users] Gluster performance tuning for heavy small file read/writes

Shreyansh Shah shreyansh.shah at alpha-grep.com
Wed Sep 7 08:47:19 UTC 2022


Hi All,
We have set up a new gluster volume, server version 9.3 with clients
running 9.0. In this setup we are facing issues where the clients get
randomly disconnected and there is no relevant log around that time in
client, server and brick logs. One difference we have noticed in this setup
vs the other setups which we have in production is that there are many
small read/writes happening in this.
Wondering if anyone can help with the performance tuning parameters which
we can set/unset to optimize our setup.
Any suggestions are welcome and appreciated.

Adding the current server parameters:


*Option                                   Value*
*------                                   -----*
*cluster.lookup-unhashed                  on (DEFAULT)*
*cluster.lookup-optimize                  on (DEFAULT)*
*cluster.min-free-disk                    10% (DEFAULT)*
*cluster.min-free-inodes                  5% (DEFAULT)*
*cluster.rebalance-stats                  off (DEFAULT)*
*cluster.subvols-per-directory            (null) (DEFAULT)*
*cluster.readdir-optimize                 off (DEFAULT)*
*cluster.rsync-hash-regex                 (null) (DEFAULT)*
*cluster.extra-hash-regex                 (null) (DEFAULT)*
*cluster.dht-xattr-name                   trusted.glusterfs.dht (DEFAULT)*
*cluster.randomize-hash-range-by-gfid     off (DEFAULT)*
*cluster.rebal-throttle                   normal (DEFAULT)*
*cluster.lock-migration                   off*
*cluster.force-migration                  off*
*cluster.local-volume-name                (null) (DEFAULT)*
*cluster.weighted-rebalance               on (DEFAULT)*
*cluster.switch-pattern                   (null) (DEFAULT)*
*cluster.entry-change-log                 on (DEFAULT)*
*cluster.read-subvolume                   (null) (DEFAULT)*
*cluster.read-subvolume-index             -1 (DEFAULT)*
*cluster.read-hash-mode                   1 (DEFAULT)*
*cluster.background-self-heal-count       8 (DEFAULT)*
*cluster.metadata-self-heal               off (DEFAULT)*
*cluster.data-self-heal                   off (DEFAULT)*
*cluster.entry-self-heal                  off (DEFAULT)*
*cluster.self-heal-daemon                 on (DEFAULT)*
*cluster.heal-timeout                     600 (DEFAULT)*
*cluster.self-heal-window-size            8 (DEFAULT)*
*cluster.data-change-log                  on (DEFAULT)*
*cluster.metadata-change-log              on (DEFAULT)*
*cluster.data-self-heal-algorithm         (null) (DEFAULT)*
*cluster.eager-lock                       on (DEFAULT)*
*disperse.eager-lock                      on (DEFAULT)*
*disperse.other-eager-lock                on (DEFAULT)*
*disperse.eager-lock-timeout              1 (DEFAULT)*
*disperse.other-eager-lock-timeout        1 (DEFAULT)*
*cluster.quorum-type                      none (DEFAULT)*
*cluster.quorum-count                     (null) (DEFAULT)*
*cluster.choose-local                     true (DEFAULT)*
*cluster.self-heal-readdir-size           1KB (DEFAULT)*
*cluster.post-op-delay-secs               1 (DEFAULT)*
*cluster.ensure-durability                on (DEFAULT)*
*cluster.consistent-metadata              no (DEFAULT)*
*cluster.heal-wait-queue-length           128 (DEFAULT)*
*cluster.favorite-child-policy            none (DEFAULT)*
*cluster.full-lock                        yes (DEFAULT)*
*cluster.optimistic-change-log            on (DEFAULT)*
*diagnostics.latency-measurement          off*
*diagnostics.dump-fd-stats                off (DEFAULT)*
*diagnostics.count-fop-hits               off*
*diagnostics.brick-log-level              INFO*
*diagnostics.client-log-level             INFO*
*diagnostics.brick-sys-log-level          CRITICAL (DEFAULT)*
*diagnostics.client-sys-log-level         CRITICAL (DEFAULT)*
*diagnostics.brick-logger                 (null) (DEFAULT)*
*diagnostics.client-logger                (null) (DEFAULT)*
*diagnostics.brick-log-format             (null) (DEFAULT)*
*diagnostics.client-log-format            (null) (DEFAULT)*
*diagnostics.brick-log-buf-size           5 (DEFAULT)*
*diagnostics.client-log-buf-size          5 (DEFAULT)*
*diagnostics.brick-log-flush-timeout      120 (DEFAULT)*
*diagnostics.client-log-flush-timeout     120 (DEFAULT)*
*diagnostics.stats-dump-interval          0 (DEFAULT)*
*diagnostics.fop-sample-interval          0 (DEFAULT)*
*diagnostics.stats-dump-format            json (DEFAULT)*
*diagnostics.fop-sample-buf-size          65535 (DEFAULT)*
*diagnostics.stats-dnscache-ttl-sec       86400 (DEFAULT)*
*performance.cache-max-file-size          0 (DEFAULT)*
*performance.cache-min-file-size          0 (DEFAULT)*
*performance.cache-refresh-timeout        1 (DEFAULT)*
*performance.cache-priority                (DEFAULT)*
*performance.io-cache-size                32MB (DEFAULT)*
*performance.cache-size                   256MB*
*performance.io-thread-count              16 (DEFAULT)*
*performance.high-prio-threads            16 (DEFAULT)*
*performance.normal-prio-threads          16 (DEFAULT)*
*performance.low-prio-threads             16 (DEFAULT)*
*performance.least-prio-threads           1 (DEFAULT)*
*performance.enable-least-priority        on (DEFAULT)*
*performance.iot-watchdog-secs            (null) (DEFAULT)*
*performance.iot-cleanup-disconnected-reqs off (DEFAULT)*
*performance.iot-pass-through             false (DEFAULT)*
*performance.io-cache-pass-through        false (DEFAULT)*
*performance.quick-read-cache-size        128MB (DEFAULT)*
*performance.cache-size                   256MB*
*performance.quick-read-cache-timeout     1 (DEFAULT)*
*performance.qr-cache-timeout             600*
*performance.quick-read-cache-invalidation false (DEFAULT)*
*performance.ctime-invalidation           false (DEFAULT)*
*performance.flush-behind                 on (DEFAULT)*
*performance.nfs.flush-behind             on (DEFAULT)*
*performance.write-behind-window-size     1MB (DEFAULT)*
*performance.resync-failed-syncs-after-fsync off (DEFAULT)*
*performance.nfs.write-behind-window-size 1MB (DEFAULT)*
*performance.strict-o-direct              off (DEFAULT)*
*performance.nfs.strict-o-direct          off (DEFAULT)*
*performance.strict-write-ordering        off (DEFAULT)*
*performance.nfs.strict-write-ordering    off (DEFAULT)*
*performance.write-behind-trickling-writes on (DEFAULT)*
*performance.aggregate-size               128KB (DEFAULT)*
*performance.nfs.write-behind-trickling-writes on (DEFAULT)*
*performance.lazy-open                    yes (DEFAULT)*
*performance.read-after-open              yes (DEFAULT)*
*performance.open-behind-pass-through     false (DEFAULT)*
*performance.read-ahead-page-count        4 (DEFAULT)*
*performance.read-ahead-pass-through      false (DEFAULT)*
*performance.readdir-ahead-pass-through   false (DEFAULT)*
*performance.md-cache-pass-through        false (DEFAULT)*
*performance.write-behind-pass-through    false (DEFAULT)*
*performance.md-cache-timeout             1 (DEFAULT)*
*performance.cache-swift-metadata         false (DEFAULT)*
*performance.cache-samba-metadata         false (DEFAULT)*
*performance.cache-capability-xattrs      true (DEFAULT)*
*performance.cache-ima-xattrs             true (DEFAULT)*
*performance.md-cache-statfs              off (DEFAULT)*
*performance.xattr-cache-list              (DEFAULT)*
*performance.nl-cache-pass-through        false (DEFAULT)*
*network.frame-timeout                    1800 (DEFAULT)*
*network.ping-timeout                     42 (DEFAULT)*
*network.tcp-window-size                  (null) (DEFAULT)*
*client.ssl                               off*
*network.remote-dio                       disable (DEFAULT)*
*client.event-threads                     12*
*client.tcp-user-timeout                  0*
*client.keepalive-time                    20000*
*client.keepalive-interval                2000*
*client.keepalive-count                   9*
*client.strict-locks                      off*
*network.tcp-window-size                  (null) (DEFAULT)*
*network.inode-lru-limit                  50000*
*auth.allow                               **
*auth.reject                              (null) (DEFAULT)*
*transport.keepalive                      1*
*server.allow-insecure                    on (DEFAULT)*
*server.root-squash                       off (DEFAULT)*
*server.all-squash                        off (DEFAULT)*
*server.anonuid                           65534 (DEFAULT)*
*server.anongid                           65534 (DEFAULT)*
*server.statedump-path                    /var/run/gluster (DEFAULT)*
*server.outstanding-rpc-limit             128*
*server.ssl                               off*
*auth.ssl-allow                           **
*server.manage-gids                       off (DEFAULT)*
*server.dynamic-auth                      on (DEFAULT)*
*client.send-gids                         on (DEFAULT)*
*server.gid-timeout                       300 (DEFAULT)*
*server.own-thread                        (null) (DEFAULT)*
*server.event-threads                     12*
*server.tcp-user-timeout                  42 (DEFAULT)*
*server.keepalive-time                    20*
*server.keepalive-interval                2*
*server.keepalive-count                   9*
*transport.listen-backlog                 1024*
*ssl.own-cert                             (null) (DEFAULT)*
*ssl.private-key                          (null) (DEFAULT)*
*ssl.ca-list                              (null) (DEFAULT)*
*ssl.crl-path                             (null) (DEFAULT)*
*ssl.certificate-depth                    (null) (DEFAULT)*
*ssl.cipher-list                          (null) (DEFAULT)*
*ssl.dh-param                             (null) (DEFAULT)*
*ssl.ec-curve                             (null) (DEFAULT)*
*transport.address-family                 inet*
*performance.write-behind                 on*
*performance.read-ahead                   off*
*performance.readdir-ahead                off*
*performance.io-cache                     off*
*performance.open-behind                  on*
*performance.quick-read                   on*
*performance.nl-cache                     on*
*performance.stat-prefetch                on*
*performance.client-io-threads            on*
*performance.nfs.write-behind             on*
*performance.nfs.read-ahead               off*
*performance.nfs.io-cache                 off*
*performance.nfs.quick-read               off*
*performance.nfs.stat-prefetch            off*
*performance.nfs.io-threads               off*
*performance.force-readdirp               true (DEFAULT)*
*performance.cache-invalidation           on*
*performance.global-cache-invalidation    true (DEFAULT)*
*features.uss                             off*
*features.snapshot-directory              .snaps*
*features.show-snapshot-directory         off*
*features.tag-namespaces                  off*
*network.compression                      off*
*network.compression.window-size          -15 (DEFAULT)*
*network.compression.mem-level            8 (DEFAULT)*
*network.compression.min-size             0 (DEFAULT)*
*network.compression.compression-level    -1 (DEFAULT)*
*network.compression.debug                false (DEFAULT)*
*features.default-soft-limit              80% (DEFAULT)*
*features.soft-timeout                    60 (DEFAULT)*
*features.hard-timeout                    5 (DEFAULT)*
*features.alert-time                      86400 (DEFAULT)*
*features.quota-deem-statfs               off*
*geo-replication.indexing                 off*
*geo-replication.indexing                 off*
*geo-replication.ignore-pid-check         off*
*geo-replication.ignore-pid-check         off*
*features.quota                           off*
*features.inode-quota                     off*
*features.bitrot                          disable*
*debug.trace                              off*
*debug.log-history                        no (DEFAULT)*
*debug.log-file                           no (DEFAULT)*
*debug.exclude-ops                        (null) (DEFAULT)*
*debug.include-ops                        (null) (DEFAULT)*
*debug.error-gen                          off*
*debug.error-failure                      (null) (DEFAULT)*
*debug.error-number                       (null) (DEFAULT)*
*debug.random-failure                     off (DEFAULT)*
*debug.error-fops                         (null) (DEFAULT)*
*nfs.disable                              on*
*features.read-only                       off (DEFAULT)*
*features.worm                            off*
*features.worm-file-level                 off*
*features.worm-files-deletable            on*
*features.default-retention-period        120 (DEFAULT)*
*features.retention-mode                  relax (DEFAULT)*
*features.auto-commit-period              180 (DEFAULT)*
*storage.linux-aio                        off (DEFAULT)*
*storage.linux-io_uring                   off (DEFAULT)*
*storage.batch-fsync-mode                 reverse-fsync (DEFAULT)*
*storage.batch-fsync-delay-usec           0 (DEFAULT)*
*storage.owner-uid                        -1 (DEFAULT)*
*storage.owner-gid                        -1 (DEFAULT)*
*storage.node-uuid-pathinfo               off (DEFAULT)*
*storage.health-check-interval            30 (DEFAULT)*
*storage.build-pgfid                      off (DEFAULT)*
*storage.gfid2path                        on (DEFAULT)*
*storage.gfid2path-separator              : (DEFAULT)*
*storage.reserve                          1 (DEFAULT)*
*storage.health-check-timeout             20 (DEFAULT)*
*storage.fips-mode-rchecksum              off*
*storage.force-create-mode                0000 (DEFAULT)*
*storage.force-directory-mode             0000 (DEFAULT)*
*storage.create-mask                      0777 (DEFAULT)*
*storage.create-directory-mask            0777 (DEFAULT)*
*storage.max-hardlinks                    100 (DEFAULT)*
*features.ctime                           on (DEFAULT)*
*config.gfproxyd                          off**cluster.server-quorum-type
            off *
*cluster.server-quorum-ratio              51*
*changelog.changelog                      off (DEFAULT)*
*changelog.changelog-dir                  {{ brick.path
}}/.glusterfs/changelogs (DEFAULT)*
*changelog.encoding                       ascii (DEFAULT)*
*changelog.rollover-time                  15 (DEFAULT)*
*changelog.fsync-interval                 5 (DEFAULT)*
*changelog.changelog-barrier-timeout      120*
*changelog.capture-del-path               off (DEFAULT)*
*features.barrier                         disable*
*features.barrier-timeout                 120*
*features.trash                           off (DEFAULT)*
*features.trash-dir                       .trashcan (DEFAULT)*
*features.trash-eliminate-path            (null) (DEFAULT)*
*features.trash-max-filesize              5MB (DEFAULT)*
*features.trash-internal-op               off (DEFAULT)*
*cluster.enable-shared-storage            disable*
*locks.trace                              off (DEFAULT)*
*locks.mandatory-locking                  off (DEFAULT)*
*cluster.disperse-self-heal-daemon        enable (DEFAULT)*
*cluster.quorum-reads                     no (DEFAULT)*
*client.bind-insecure                     (null) (DEFAULT)*
*features.shard                           off*
*features.shard-block-size                64MB (DEFAULT)*
*features.shard-lru-limit                 16384 (DEFAULT)*
*features.shard-deletion-rate             100 (DEFAULT)*
*features.scrub-throttle                  lazy*
*features.scrub-freq                      biweekly*
*features.scrub                           false (DEFAULT)*
*features.expiry-time                     120*
*features.signer-threads                  4*
*features.cache-invalidation              on*
*features.cache-invalidation-timeout      600*
*ganesha.enable                           off*
*features.leases                          off*
*features.lease-lock-recall-timeout       60 (DEFAULT)*
*disperse.background-heals                8 (DEFAULT)*
*disperse.heal-wait-qlength               128 (DEFAULT)*
*cluster.heal-timeout                     600 (DEFAULT)*
*dht.force-readdirp                       on (DEFAULT)*
*disperse.read-policy                     gfid-hash (DEFAULT)*
*cluster.shd-max-threads                  1 (DEFAULT)*
*cluster.shd-wait-qlength                 1024 (DEFAULT)*
*cluster.locking-scheme                   full (DEFAULT)*
*cluster.granular-entry-heal              no (DEFAULT)*
*features.locks-revocation-secs           0 (DEFAULT)*
*features.locks-revocation-clear-all      false (DEFAULT)*
*features.locks-revocation-max-blocked    0 (DEFAULT)*
*features.locks-monkey-unlocking          false (DEFAULT)*
*features.locks-notify-contention         yes (DEFAULT)*
*features.locks-notify-contention-delay   5 (DEFAULT)*
*disperse.shd-max-threads                 1 (DEFAULT)*
*disperse.shd-wait-qlength                1024 (DEFAULT)*
*disperse.cpu-extensions                  auto (DEFAULT)*
*disperse.self-heal-window-size           32 (DEFAULT)*
*cluster.use-compound-fops                off*
*performance.parallel-readdir             on*
*performance.rda-request-size             131072*
*performance.rda-low-wmark                4096 (DEFAULT)*
*performance.rda-high-wmark               128KB (DEFAULT)*
*performance.rda-cache-limit              10MB*
*performance.nl-cache-positive-entry      false (DEFAULT)*
*performance.nl-cache-limit               10MB*
*performance.nl-cache-timeout             600*
*cluster.brick-multiplex                  disable*
*glusterd.vol_count_per_thread            100*
*cluster.max-bricks-per-process           250*
*disperse.optimistic-change-log           on (DEFAULT)*
*disperse.stripe-cache                    4 (DEFAULT)*
*cluster.halo-enabled                     False (DEFAULT)*
*cluster.halo-shd-max-latency             99999 (DEFAULT)*
*cluster.halo-nfsd-max-latency            5 (DEFAULT)*
*cluster.halo-max-latency                 5 (DEFAULT)*
*cluster.halo-max-replicas                99999 (DEFAULT)*
*cluster.halo-min-replicas                2 (DEFAULT)*
*features.selinux                         on*
*cluster.daemon-log-level                 INFO*
*debug.delay-gen                          off*
*delay-gen.delay-percentage               10% (DEFAULT)*
*delay-gen.delay-duration                 100000 (DEFAULT)*
*delay-gen.enable                          (DEFAULT)*
*disperse.parallel-writes                 on (DEFAULT)*
*disperse.quorum-count                    0 (DEFAULT)*
*features.sdfs                            off*
*features.cloudsync                       off*
*features.ctime                           on*
*ctime.noatime                            on*
*features.cloudsync-storetype             (null) (DEFAULT)*
*features.enforce-mandatory-lock          off*
*config.global-threading                  off*
*config.client-threads                    16*
*config.brick-threads                     16*
*features.cloudsync-remote-read           off*
*features.cloudsync-store-id              (null) (DEFAULT)*
*features.cloudsync-product-id            (null) (DEFAULT)*
*features.acl                             enable**cluster.use-anonymous-inode
             yes*



-- 
Regards,
Shreyansh Shah
AlphaGrep* Securities Pvt. Ltd.*
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20220907/05d42cf0/attachment.html>


More information about the Gluster-users mailing list