[Gluster-users] Poor performance compared to Netapp NAS with small files

Vlad Kopylov vladkopy at gmail.com
Sun Sep 23 02:14:03 UTC 2018


Here is what I have for small files. I don't think you really need much
more than this for git.

Options Reconfigured:
performance.io-thread-count: 8
server.allow-insecure: on
cluster.shd-max-threads: 12
performance.rda-cache-limit: 128MB
cluster.readdir-optimize: on
cluster.read-hash-mode: 0
performance.strict-o-direct: on
cluster.lookup-unhashed: auto
performance.nl-cache: on
performance.nl-cache-timeout: 600
cluster.lookup-optimize: on
client.event-threads: 4
performance.client-io-threads: on
performance.md-cache-timeout: 600
server.event-threads: 4
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: on
performance.cache-invalidation: on
network.inode-lru-limit: 90000
performance.cache-refresh-timeout: 10
performance.enable-least-priority: off
performance.cache-size: 2GB
cluster.nufa: on
cluster.choose-local: on
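
In case it helps with applying a list like the one above, here is a minimal sketch that turns "option: value" lines into `gluster volume set` commands. It only prints the commands, so they can be reviewed before running them. The function name `emit_set_cmds` and the `VOLUME` variable are my own placeholders, and the default volume name `perftest` is taken from the original post.

```shell
#!/bin/bash
# Sketch: convert "option: value" lines (the format printed under
# "Options Reconfigured:" by `gluster volume info`) into
# `gluster volume set` commands. Nothing is executed here; the
# commands are only printed so they can be reviewed first.

# Assumption: volume name from the original post; override via the environment.
VOLUME="${VOLUME:-perftest}"

# emit_set_cmds is a hypothetical helper name, not part of the gluster CLI.
emit_set_cmds() {
  local line key value
  while IFS= read -r line; do
    case "$line" in
      ""|"Options Reconfigured:") continue ;;  # skip blank lines and the header
    esac
    key="${line%%:*}"     # text before the first colon
    value="${line#*: }"   # text after the first ": "
    printf 'gluster volume set %s %s %s\n' "$VOLUME" "$key" "$value"
  done
}
```

Usage would be something like `emit_set_cmds < options.txt`, then running the printed commands on one of the storage nodes once they look right. Values containing shell metacharacters would need quoting before being passed to a shell.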


On Tue, Sep 18, 2018 at 6:48 AM, Nicolas <nicolas at furyweb.fr> wrote:

> Hello,
>
> I am seeing very poor performance with GlusterFS 3.12.14 on small files,
> especially when working with git repositories.
>
> Here is my configuration:
> 3 nodes gluster (VMware guest v13 on vSphere 6.5 hosted by Gen8 blades
> attached to 3PAR SSD RAID5 LUNs), gluster volume type replica 3 with
> arbiter, SSL enabled, NFS disabled, heartbeat IP between both main nodes.
> Trusted storage pool on Debian 9 x64
> Client on Debian 8 x64 with native gluster client
> Network bandwidth verified with iperf between client and each storage node
> (~900Mb/s)
> Disk bandwidth verified with dd on each storage node (~90MB/s)
> _____________________________________________________________
> Volume Name: perftest
> Type: Replicate
> Volume ID: c60b3744-7955-4058-b276-69d7b97de8aa
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: glusterVM1:/bricks/perftest/brick1/data
> Brick2: glusterVM2:/bricks/perftest/brick1/data
> Brick3: glusterVM3:/bricks/perftest/brick1/data (arbiter)
> Options Reconfigured:
> cluster.data-self-heal-algorithm: full
> features.trash: off
> diagnostics.client-log-level: ERROR
> ssl.cipher-list: HIGH:!SSLv2
> server.ssl: on
> client.ssl: on
> transport.address-family: inet
> nfs.disable: on
> _____________________________________________________________
>
> I wrote a test script that tries several parameters, but every test gives
> similar timings (except for performance.write-behind): ~30s on average for a
> git clone that takes only 3s on the NAS volume.
> _____________________________________________________________
> #!/bin/bash
>
> trap "[ -d /mnt/project ] && rm -rf /mnt/project; grep -q /mnt /proc/mounts && umount /mnt; exit" 2
>
> LOG=$(mktemp)
> for params in \
>   "server.event-threads 5" \
>   "client.event-threads 5" \
>   "cluster.lookup-optimize on" \
>   "cluster.readdir-optimize on" \
>   "features.cache-invalidation on" \
>   "features.cache-invalidation-timeout 5" \
>   "performance.cache-invalidation on" \
>   "performance.cache-refresh-timeout 5" \
>   "performance.client-io-threads on" \
>   "performance.flush-behind on" \
>   "performance.io-thread-count 6" \
>   "performance.quick-read on" \
>   "performance.read-ahead enable" \
>   "performance.readdir-ahead enable" \
>   "performance.stat-prefetch on" \
>   "performance.write-behind on" \
>   "performance.write-behind-window-size 2MB"; do
>   set $params
>   echo -n "gluster volume set perftest $1 $2 -> "
>   ssh -n glusterVM3 "gluster volume set perftest $1 $2"
> done
> echo "NAS Reference"
> sh -c "time -o $LOG -f '%E %P' git clone git@gitlab.local:grp/project.git /share/nas >/dev/null 2>&1"
> cat $LOG
> rm -rf /share/nas/project
>
> for params in \
>   "server.event-threads 5 6 7" \
>   "client.event-threads 5 6 7" \
>   "cluster.lookup-optimize on off on" \
>   "cluster.readdir-optimize on off on" \
>   "features.cache-invalidation on off on" \
>   "features.cache-invalidation-timeout 5 10 15 20 30 45 60 90 120" \
>   "performance.cache-invalidation on off on" \
>   "performance.cache-refresh-timeout 1 5 10 15 20 30 45 60" \
>   "performance.client-io-threads on off on" \
>   "performance.flush-behind on off on" \
>   "performance.io-thread-count 6 7 8 9 10" \
>   "performance.quick-read on off on" \
>   "performance.read-ahead enable disable enable" \
>   "performance.readdir-ahead enable disable enable" \
>   "performance.stat-prefetch on off on" \
>   "performance.write-behind on off on" \
>   "performance.write-behind-window-size 2MB 4MB 8MB 16MB"; do
>   set $params
>   param=$1
>   shift
>   for value in $*; do
>     echo -en "\nTesting $param=$value -> "
>     #ssh -n glusterVM3 "yes | gluster volume stop perftest force; gluster volume set perftest $param $value; gluster volume start perftest"
>     ssh -n glusterVM3 "gluster volume set perftest $param $value"
>     if mount -t glusterfs -o defaults,direct-io-mode=enable glusterVMa:perftest /mnt; then
>       for i in $(seq 1 5); do
>         sh -c "time -o $LOG -f '%E %P' git clone git@gitlab.local:grp/project.git /mnt/bench >/dev/null 2>&1"
>         cat $LOG
>         rm -rf /mnt/bench
>       done
>       umount /mnt
>     else
>       echo "*** FAIL"
>       exit
>     fi
>   done
> done
>
> rm $LOG
> _____________________________________________________________
>
> Output produced by the script
> _____________________________________________________________
> gluster volume set perftest server.event-threads 5 -> volume set: success
> gluster volume set perftest client.event-threads 5 -> volume set: success
> gluster volume set perftest cluster.lookup-optimize on -> volume set: success
> gluster volume set perftest cluster.readdir-optimize on -> volume set: success
> gluster volume set perftest features.cache-invalidation on -> volume set: success
> gluster volume set perftest features.cache-invalidation-timeout 5 -> volume set: success
> gluster volume set perftest performance.cache-invalidation on -> volume set: success
> gluster volume set perftest performance.cache-refresh-timeout 5 -> volume set: success
> gluster volume set perftest performance.client-io-threads on -> volume set: success
> gluster volume set perftest performance.flush-behind on -> volume set: success
> gluster volume set perftest performance.io-thread-count 6 -> volume set: success
> gluster volume set perftest performance.quick-read on -> volume set: success
> gluster volume set perftest performance.read-ahead enable -> volume set: success
> gluster volume set perftest performance.readdir-ahead enable -> volume set: success
> gluster volume set perftest performance.stat-prefetch on -> volume set: success
> gluster volume set perftest performance.write-behind on -> volume set: success
> gluster volume set perftest performance.write-behind-window-size 2MB -> volume set: success
> NAS Reference
> 0:03.59 23%
>
> Testing server.event-threads=5 -> volume set: success
> 0:29.45 2%
> 0:27.07 2%
> 0:24.89 2%
> 0:24.93 2%
> 0:24.64 3%
>
> Testing server.event-threads=6 -> volume set: success
> 0:24.14 3%
> 0:24.69 2%
> 0:26.81 2%
> 0:27.38 2%
> 0:25.59 2%
>
> Testing server.event-threads=7 -> volume set: success
> 0:25.34 2%
> 0:24.14 2%
> 0:25.92 2%
> 0:23.62 2%
> 0:24.76 2%
>
> Testing client.event-threads=5 -> volume set: success
> 0:24.60 3%
> 0:29.40 2%
> 0:34.78 2%
> 0:33.99 2%
> 0:33.54 2%
>
> Testing client.event-threads=6 -> volume set: success
> 0:23.82 3%
> 0:24.64 2%
> 0:26.10 3%
> 0:24.56 2%
> 0:28.21 2%
>
> Testing client.event-threads=7 -> volume set: success
> 0:28.15 2%
> 0:35.19 2%
> 0:24.03 2%
> 0:24.79 2%
> 0:26.55 2%
>
> Testing cluster.lookup-optimize=on -> volume set: success
> 0:30.67 2%
> 0:30.49 2%
> 0:31.52 2%
> 0:33.13 2%
> 0:32.41 2%
>
> Testing cluster.lookup-optimize=off -> volume set: success
> 0:25.82 2%
> 0:25.59 2%
> 0:28.24 2%
> 0:31.90 2%
> 0:33.52 2%
>
> Testing cluster.lookup-optimize=on -> volume set: success
> 0:29.33 2%
> 0:24.82 2%
> 0:25.93 2%
> 0:25.36 2%
> 0:24.89 2%
>
> Testing cluster.readdir-optimize=on -> volume set: success
> 0:24.98 2%
> 0:25.03 2%
> 0:27.47 2%
> 0:28.13 2%
> 0:27.41 2%
>
> Testing cluster.readdir-optimize=off -> volume set: success
> 0:32.54 2%
> 0:32.50 2%
> 0:25.56 2%
> 0:25.21 2%
> 0:27.39 2%
>
> Testing cluster.readdir-optimize=on -> volume set: success
> 0:27.68 2%
> 0:29.33 2%
> 0:25.50 2%
> 0:25.17 2%
> 0:26.00 2%
>
> Testing features.cache-invalidation=on -> volume set: success
> 0:25.63 2%
> 0:25.46 3%
> 0:25.55 3%
> 0:26.13 2%
> 0:25.13 2%
>
> Testing features.cache-invalidation=off -> volume set: success
> 0:27.79 2%
> 0:25.31 2%
> 0:24.75 2%
> 0:27.75 2%
> 0:32.67 2%
>
> Testing features.cache-invalidation=on -> volume set: success
> 0:26.34 2%
> 0:26.60 2%
> 0:26.32 2%
> 0:31.05 3%
> 0:33.58 2%
>
> Testing features.cache-invalidation-timeout=5 -> volume set: success
> 0:25.89 3%
> 0:25.07 3%
> 0:25.49 2%
> 0:25.44 3%
> 0:25.47 2%
>
> Testing features.cache-invalidation-timeout=10 -> volume set: success
> 0:32.34 2%
> 0:28.27 3%
> 0:27.41 2%
> 0:25.17 2%
> 0:25.56 2%
>
> Testing features.cache-invalidation-timeout=15 -> volume set: success
> 0:27.79 2%
> 0:30.58 2%
> 0:31.63 2%
> 0:26.71 2%
> 0:29.69 2%
>
> Testing features.cache-invalidation-timeout=20 -> volume set: success
> 0:26.62 2%
> 0:23.76 3%
> 0:24.17 3%
> 0:24.99 2%
> 0:25.31 2%
>
> Testing features.cache-invalidation-timeout=30 -> volume set: success
> 0:25.75 3%
> 0:27.34 2%
> 0:28.38 2%
> 0:27.15 2%
> 0:30.91 2%
>
> Testing features.cache-invalidation-timeout=45 -> volume set: success
> 0:24.77 2%
> 0:24.81 2%
> 0:28.22 2%
> 0:32.56 2%
> 0:40.81 1%
>
> Testing features.cache-invalidation-timeout=60 -> volume set: success
> 0:31.97 2%
> 0:27.14 2%
> 0:24.53 3%
> 0:25.48 3%
> 0:25.27 3%
>
> Testing features.cache-invalidation-timeout=90 -> volume set: success
> 0:25.24 3%
> 0:26.83 3%
> 0:32.74 2%
> 0:26.82 3%
> 0:27.69 2%
>
> Testing features.cache-invalidation-timeout=120 -> volume set: success
> 0:24.50 3%
> 0:25.43 3%
> 0:26.21 3%
> 0:30.09 2%
> 0:32.24 2%
>
> Testing performance.cache-invalidation=on -> volume set: success
> 0:28.77 3%
> 0:37.16 2%
> 0:42.56 1%
> 0:26.21 2%
> 0:27.91 3%
>
> Testing performance.cache-invalidation=off -> volume set: success
> 0:31.05 2%
> 0:34.40 2%
> 0:33.90 2%
> 0:33.12 2%
> 0:27.84 3%
>
> Testing performance.cache-invalidation=on -> volume set: success
> 0:27.17 3%
> 0:26.73 3%
> 0:24.61 3%
> 0:26.36 3%
> 0:39.90 2%
>
> Testing performance.cache-refresh-timeout=1 -> volume set: success
> 0:26.83 3%
> 0:36.17 2%
> 0:31.37 2%
> 0:26.12 3%
> 0:26.46 2%
>
> Testing performance.cache-refresh-timeout=5 -> volume set: success
> 0:24.95 3%
> 0:27.33 3%
> 0:30.77 2%
> 0:26.77 3%
> 0:34.62 2%
>
> Testing performance.cache-refresh-timeout=10 -> volume set: success
> 0:29.36 2%
> 0:26.04 3%
> 0:26.21 3%
> 0:29.47 3%
> 0:28.67 3%
>
> Testing performance.cache-refresh-timeout=15 -> volume set: success
> 0:29.26 3%
> 0:27.31 3%
> 0:27.15 3%
> 0:29.74 3%
> 0:32.70 2%
>
> Testing performance.cache-refresh-timeout=20 -> volume set: success
> 0:27.99 3%
> 0:30.13 2%
> 0:29.39 3%
> 0:28.59 3%
> 0:31.30 3%
>
> Testing performance.cache-refresh-timeout=30 -> volume set: success
> 0:27.47 3%
> 0:26.68 3%
> 0:27.09 3%
> 0:27.08 3%
> 0:31.72 3%
>
> Testing performance.cache-refresh-timeout=45 -> volume set: success
> 0:28.83 3%
> 0:29.21 3%
> 0:38.75 2%
> 0:26.15 3%
> 0:26.76 3%
>
> Testing performance.cache-refresh-timeout=60 -> volume set: success
> 0:29.64 2%
> 0:29.71 2%
> 0:31.41 2%
> 0:28.35 3%
> 0:26.26 3%
>
> Testing performance.client-io-threads=on -> volume set: success
> 0:25.14 3%
> 0:26.64 3%
> 0:26.43 3%
> 0:25.63 3%
> 0:27.89 3%
>
> Testing performance.client-io-threads=off -> volume set: success
> 0:31.37 2%
> 0:33.65 2%
> 0:28.85 3%
> 0:28.27 3%
> 0:26.90 3%
>
> Testing performance.client-io-threads=on -> volume set: success
> 0:26.12 3%
> 0:25.92 3%
> 0:28.30 3%
> 0:39.20 2%
> 0:28.45 3%
>
> Testing performance.flush-behind=on -> volume set: success
> 0:34.83 2%
> 0:27.33 3%
> 0:31.30 2%
> 0:26.40 3%
> 0:27.49 2%
>
> Testing performance.flush-behind=off -> volume set: success
> 0:30.64 2%
> 0:31.60 2%
> 0:33.22 2%
> 0:25.67 2%
> 0:26.85 3%
>
> Testing performance.flush-behind=on -> volume set: success
> 0:26.75 3%
> 0:26.67 3%
> 0:30.52 3%
> 0:38.60 2%
> 0:34.69 3%
>
> Testing performance.io-thread-count=6 -> volume set: success
> 0:30.87 2%
> 0:34.27 2%
> 0:34.08 2%
> 0:28.70 2%
> 0:32.83 2%
>
> Testing performance.io-thread-count=7 -> volume set: success
> 0:32.14 2%
> 0:43.08 1%
> 0:31.79 2%
> 0:25.93 3%
> 0:26.82 2%
>
> Testing performance.io-thread-count=8 -> volume set: success
> 0:29.89 2%
> 0:28.69 2%
> 0:34.19 2%
> 0:40.00 1%
> 0:37.42 2%
>
> Testing performance.io-thread-count=9 -> volume set: success
> 0:26.50 3%
> 0:26.99 2%
> 0:27.05 2%
> 0:32.22 2%
> 0:31.63 2%
>
> Testing performance.io-thread-count=10 -> volume set: success
> 0:29.13 2%
> 0:30.60 2%
> 0:25.19 2%
> 0:24.28 3%
> 0:25.40 3%
>
> Testing performance.quick-read=on -> volume set: success
> 0:26.40 3%
> 0:27.37 2%
> 0:28.03 2%
> 0:28.07 2%
> 0:33.47 2%
>
> Testing performance.quick-read=off -> volume set: success
> 0:30.99 2%
> 0:27.16 2%
> 0:25.34 3%
> 0:27.58 3%
> 0:27.67 3%
>
> Testing performance.quick-read=on -> volume set: success
> 0:27.37 2%
> 0:26.99 3%
> 0:29.78 2%
> 0:26.06 2%
> 0:25.67 2%
>
> Testing performance.read-ahead=enable -> volume set: success
> 0:24.52 3%
> 0:26.05 2%
> 0:32.37 2%
> 0:30.27 2%
> 0:25.70 3%
>
> Testing performance.read-ahead=disable -> volume set: success
> 0:26.98 3%
> 0:25.54 3%
> 0:25.55 3%
> 0:30.78 2%
> 0:28.07 2%
>
> Testing performance.read-ahead=enable -> volume set: success
> 0:30.34 2%
> 0:33.93 2%
> 0:30.26 2%
> 0:28.18 2%
> 0:27.06 3%
>
> Testing performance.readdir-ahead=enable -> volume set: success
> 0:26.31 3%
> 0:25.64 3%
> 0:31.97 2%
> 0:30.75 2%
> 0:26.10 3%
>
> Testing performance.readdir-ahead=disable -> volume set: success
> 0:27.50 3%
> 0:27.19 3%
> 0:27.67 3%
> 0:26.99 3%
> 0:28.25 3%
>
> Testing performance.readdir-ahead=enable -> volume set: success
> 0:34.94 2%
> 0:30.43 2%
> 0:27.14 3%
> 0:27.81 2%
> 0:26.36 3%
>
> Testing performance.stat-prefetch=on -> volume set: success
> 0:28.55 3%
> 0:27.10 2%
> 0:26.64 3%
> 0:30.84 3%
> 0:35.45 2%
>
> Testing performance.stat-prefetch=off -> volume set: success
> 0:29.12 3%
> 0:36.54 2%
> 0:26.32 3%
> 0:29.02 3%
> 0:27.16 3%
>
> Testing performance.stat-prefetch=on -> volume set: success
> 0:31.17 2%
> 0:34.64 2%
> 0:26.50 3%
> 0:30.39 2%
> 0:27.12 3%
>
> Testing performance.write-behind=on -> volume set: success
> 0:29.77 2%
> 0:28.00 2%
> 0:28.98 3%
> 0:29.83 3%
> 0:28.87 3%
>
> Testing performance.write-behind=off -> volume set: success
> 1:11.95 1%
> 1:06.03 1%
> 1:07.70 1%
> 1:30.21 1%
> 1:08.47 1%
>
> Testing performance.write-behind=on -> volume set: success
> 0:30.14 2%
> 0:28.99 2%
> 0:34.51 2%
> 0:32.60 2%
> 0:30.54 2%
>
> Testing performance.write-behind-window-size=2MB -> volume set: success
> 0:24.74 3%
> 0:25.71 2%
> 0:27.49 2%
> 0:25.78 3%
> 0:26.35 3%
>
> Testing performance.write-behind-window-size=4MB -> volume set: success
> 0:34.21 2%
> 0:27.31 3%
> 0:28.83 2%
> 0:28.91 2%
> 0:25.73 3%
>
> Testing performance.write-behind-window-size=8MB -> volume set: success
> 0:24.41 3%
> 0:26.23 2%
> 0:25.20 3%
> 0:26.00 2%
> 0:27.04 2%
>
> Testing performance.write-behind-window-size=16MB -> volume set: success
> 0:27.92 2%
> 0:24.69 2%
> 0:24.67 2%
> 0:24.13 2%
> 0:23.55 3%
> _____________________________________________________________
>
> If anyone has an idea for significantly improving performance, I would be
> very interested.
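
With this many samples it is hard to compare settings by eye, so a per-setting average may be easier to read. The following is a rough sketch, assuming the output format shown above ("Testing param=value" headers followed by `m:ss.ss P%` lines from `time -f '%E %P'`). The function name `summarize_timings` is my own, and runs longer than an hour (`h:mm:ss`) are not handled.

```shell
#!/bin/bash
# Sketch: average the elapsed times per "Testing param=value" block.
# Input is the benchmark output on stdin; leading "> " quote markers
# (as in a quoted email) are stripped first.

# summarize_timings is a hypothetical helper name.
summarize_timings() {
  sed 's/^> \{0,1\}//' | awk '
    # Print the mean of the current block, then reset the counters.
    function flush() {
      if (n > 0) printf "%s %.2f\n", label, sum / n
      sum = 0; n = 0
    }
    /^NAS Reference/ { flush(); label = "NAS"; next }
    /^Testing /      { flush(); label = $2; next }
    # Timing lines look like "0:24.89 2%": minutes, colon, seconds.
    /^[0-9]+:[0-9]+\.[0-9]+ / {
      split($1, t, ":")
      sum += t[1] * 60 + t[2]
      n++
    }
    END { flush() }
  '
}
```

Feeding the saved script output through `summarize_timings` prints one line per tested setting with its mean clone time in seconds, which should make an outlier such as performance.write-behind=off easy to spot.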