[Gluster-users] Poor performance compared to Netapp NAS with small files

Vlad Kopylov vladkopy at gmail.com
Sun Sep 23 15:16:35 UTC 2018


Forgot the mount options for small files:
defaults,_netdev,negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5
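
For reference, these are client-side FUSE mount options; an fstab entry applying them would look roughly like this (the mount point /mnt/gluster is a placeholder, server and volume names are reused from the thread for illustration):

  glusterVM1:/perftest  /mnt/gluster  glusterfs  defaults,_netdev,negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5  0  0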

On Sat, Sep 22, 2018 at 10:14 PM, Vlad Kopylov <vladkopy at gmail.com> wrote:

> Here is what I have for small files. I don't think you really need much
> for git:
>
> Options Reconfigured:
> performance.io-thread-count: 8
> server.allow-insecure: on
> cluster.shd-max-threads: 12
> performance.rda-cache-limit: 128MB
> cluster.readdir-optimize: on
> cluster.read-hash-mode: 0
> performance.strict-o-direct: on
> cluster.lookup-unhashed: auto
> performance.nl-cache: on
> performance.nl-cache-timeout: 600
> cluster.lookup-optimize: on
> client.event-threads: 4
> performance.client-io-threads: on
> performance.md-cache-timeout: 600
> server.event-threads: 4
> features.cache-invalidation: on
> features.cache-invalidation-timeout: 600
> performance.stat-prefetch: on
> performance.cache-invalidation: on
> network.inode-lru-limit: 90000
> performance.cache-refresh-timeout: 10
> performance.enable-least-priority: off
> performance.cache-size: 2GB
> cluster.nufa: on
> cluster.choose-local: on
>
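> A minimal sketch of applying a list like the one above in one pass (the
> volume name "myvol" and the file "smallfile-options.txt" are placeholders,
> not from this thread):
>
>   # file holds one "option: value" pair per line, exactly as listed above;
>   # strip the colon and feed each pair to "gluster volume set"
>   sed 's/://' smallfile-options.txt | while read opt val; do
>     gluster volume set myvol "$opt" "$val"
>   done
>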
>
> On Tue, Sep 18, 2018 at 6:48 AM, Nicolas <nicolas at furyweb.fr> wrote:
>
>> Hello,
>>
>> I'm getting very poor performance with GlusterFS 3.12.14 on small files,
>> especially when working with git repositories.
>>
>> Here is my configuration:
>> 3 gluster nodes (VMware guests, hardware v13, on vSphere 6.5 hosted by Gen8
>> blades attached to 3PAR SSD RAID5 LUNs), gluster volume type replica 3 with
>> arbiter, SSL enabled, NFS disabled, heartbeat IP between the two main nodes.
>> Trusted storage pool on Debian 9 x64
>> Client on Debian 8 x64 with the native gluster client
>> Network bandwidth verified with iperf between the client and each storage
>> node (~900 Mb/s)
>> Disk bandwidth verified with dd on each storage node (~90 MB/s)
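>>
>> For reference, the bandwidth checks above were along these lines (exact
>> invocations are assumptions, not the original commands):
>>
>>   # network throughput from the client to one storage node
>>   iperf -c glusterVM1
>>   # raw sequential write throughput on a brick filesystem
>>   dd if=/dev/zero of=/bricks/perftest/brick1/ddtest bs=1M count=1024 oflag=direct
>>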
>> _____________________________________________________________
>> Volume Name: perftest
>> Type: Replicate
>> Volume ID: c60b3744-7955-4058-b276-69d7b97de8aa
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (2 + 1) = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: glusterVM1:/bricks/perftest/brick1/data
>> Brick2: glusterVM2:/bricks/perftest/brick1/data
>> Brick3: glusterVM3:/bricks/perftest/brick1/data (arbiter)
>> Options Reconfigured:
>> cluster.data-self-heal-algorithm: full
>> features.trash: off
>> diagnostics.client-log-level: ERROR
>> ssl.cipher-list: HIGH:!SSLv2
>> server.ssl: on
>> client.ssl: on
>> transport.address-family: inet
>> nfs.disable: on
>> _____________________________________________________________
>>
>> I made a test script that tries several parameters, but every test gives
>> similar results (except for performance.write-behind): ~30 s on average for
>> a git clone that takes only 3 s on the NAS volume.
>> _____________________________________________________________
>> #!/bin/bash
>>
>> trap "[ -d /mnt/project ] && rm -rf /mnt/project; grep -q /mnt
>> /proc/mounts && umount /mnt; exit" 2
>>
>> LOG=$(mktemp)
>> for params in \
>>   "server.event-threads 5" \
>> "client.event-threads 5" \
>> "cluster.lookup-optimize on" \
>> "cluster.readdir-optimize on" \
>> "features.cache-invalidation on" \
>> "features.cache-invalidation-timeout 5" \
>> "performance.cache-invalidation on" \
>> "performance.cache-refresh-timeout 5" \
>> "performance.client-io-threads on" \
>> "performance.flush-behind on" \
>> "performance.io-thread-count 6" \
>> "performance.quick-read on" \
>> "performance.read-ahead enable" \
>> "performance.readdir-ahead enable" \
>> "performance.stat-prefetch on" \
>> "performance.write-behind on" \
>> "performance.write-behind-window-size 2MB"; do
>>   set $params
>>   echo -n "gluster volume set perftest $1 $2 -> "
>>   ssh -n glusterVM3 "gluster volume set perftest $1 $2"
>> done
>> echo "NAS Reference"
>> sh -c "time -o $LOG -f '%E %P' git clone git at gitlab.local:grp/project.git
>> /share/nas >/dev/null 2>&1"
>> cat $LOG
>> rm -rf /share/nas/project
>>
>> for params in \
>>   "server.event-threads 5 6 7" \
>>   "client.event-threads 5 6 7" \
>>   "cluster.lookup-optimize on off on" \
>>   "cluster.readdir-optimize on off on" \
>>   "features.cache-invalidation on off on" \
>>   "features.cache-invalidation-timeout 5 10 15 20 30 45 60 90 120" \
>>   "performance.cache-invalidation on off on" \
>>   "performance.cache-refresh-timeout 1 5 10 15 20 30 45 60" \
>>   "performance.client-io-threads on off on" \
>>   "performance.flush-behind on off on" \
>>   "performance.io-thread-count 6 7 8 9 10" \
>>   "performance.quick-read on off on" \
>>   "performance.read-ahead enable disable enable" \
>>   "performance.readdir-ahead enable disable enable" \
>>   "performance.stat-prefetch on off on" \
>>   "performance.write-behind on off on" \
>>   "performance.write-behind-window-size 2MB 4MB 8MB 16MB"; do
>>   set $params
>>   param=$1
>>   shift
>>   for value in $*; do
>>     echo -en "\nTesting $param=$value -> "
>>     #ssh -n glusterVM3 "yes | gluster volume stop perftest force; gluster volume set perftest $param $value; gluster volume start perftest"
>>     ssh -n glusterVM3 "gluster volume set perftest $param $value"
>>     if mount -t glusterfs -o defaults,direct-io-mode=enable glusterVMa:perftest /mnt; then
>>       for i in $(seq 1 5); do
>>         sh -c "time -o $LOG -f '%E %P' git clone git@gitlab.local:grp/project.git /mnt/bench >/dev/null 2>&1"
>>         cat $LOG
>>         rm -rf /mnt/bench
>>       done
>>       umount /mnt
>>     else
>>       echo "*** FAIL"
>>       exit
>>     fi
>>   done
>> done
>>
>> rm $LOG
>> _____________________________________________________________
>>
>> Output produced by the script
>> _____________________________________________________________
>> gluster volume set perftest server.event-threads 5 -> volume set: success
>> gluster volume set perftest client.event-threads 5 -> volume set: success
>> gluster volume set perftest cluster.lookup-optimize on -> volume set:
>> success
>> gluster volume set perftest cluster.readdir-optimize on -> volume set:
>> success
>> gluster volume set perftest features.cache-invalidation on -> volume set:
>> success
>> gluster volume set perftest features.cache-invalidation-timeout 5 ->
>> volume set: success
>> gluster volume set perftest performance.cache-invalidation on -> volume
>> set: success
>> gluster volume set perftest performance.cache-refresh-timeout 5 ->
>> volume set: success
>> gluster volume set perftest performance.client-io-threads on -> volume
>> set: success
>> gluster volume set perftest performance.flush-behind on -> volume set:
>> success
>> gluster volume set perftest performance.io-thread-count 6 -> volume set:
>> success
>> gluster volume set perftest performance.quick-read on -> volume set:
>> success
>> gluster volume set perftest performance.read-ahead enable -> volume set:
>> success
>> gluster volume set perftest performance.readdir-ahead enable -> volume
>> set: success
>> gluster volume set perftest performance.stat-prefetch on -> volume set:
>> success
>> gluster volume set perftest performance.write-behind on -> volume set:
>> success
>> gluster volume set perftest performance.write-behind-window-size 2MB ->
>> volume set: success
>> NAS Reference
>> 0:03.59 23%
>>
>> Testing server.event-threads=5 -> volume set: success
>> 0:29.45 2%
>> 0:27.07 2%
>> 0:24.89 2%
>> 0:24.93 2%
>> 0:24.64 3%
>>
>> Testing server.event-threads=6 -> volume set: success
>> 0:24.14 3%
>> 0:24.69 2%
>> 0:26.81 2%
>> 0:27.38 2%
>> 0:25.59 2%
>>
>> Testing server.event-threads=7 -> volume set: success
>> 0:25.34 2%
>> 0:24.14 2%
>> 0:25.92 2%
>> 0:23.62 2%
>> 0:24.76 2%
>>
>> Testing client.event-threads=5 -> volume set: success
>> 0:24.60 3%
>> 0:29.40 2%
>> 0:34.78 2%
>> 0:33.99 2%
>> 0:33.54 2%
>>
>> Testing client.event-threads=6 -> volume set: success
>> 0:23.82 3%
>> 0:24.64 2%
>> 0:26.10 3%
>> 0:24.56 2%
>> 0:28.21 2%
>>
>> Testing client.event-threads=7 -> volume set: success
>> 0:28.15 2%
>> 0:35.19 2%
>> 0:24.03 2%
>> 0:24.79 2%
>> 0:26.55 2%
>>
>> Testing cluster.lookup-optimize=on -> volume set: success
>> 0:30.67 2%
>> 0:30.49 2%
>> 0:31.52 2%
>> 0:33.13 2%
>> 0:32.41 2%
>>
>> Testing cluster.lookup-optimize=off -> volume set: success
>> 0:25.82 2%
>> 0:25.59 2%
>> 0:28.24 2%
>> 0:31.90 2%
>> 0:33.52 2%
>>
>> Testing cluster.lookup-optimize=on -> volume set: success
>> 0:29.33 2%
>> 0:24.82 2%
>> 0:25.93 2%
>> 0:25.36 2%
>> 0:24.89 2%
>>
>> Testing cluster.readdir-optimize=on -> volume set: success
>> 0:24.98 2%
>> 0:25.03 2%
>> 0:27.47 2%
>> 0:28.13 2%
>> 0:27.41 2%
>>
>> Testing cluster.readdir-optimize=off -> volume set: success
>> 0:32.54 2%
>> 0:32.50 2%
>> 0:25.56 2%
>> 0:25.21 2%
>> 0:27.39 2%
>>
>> Testing cluster.readdir-optimize=on -> volume set: success
>> 0:27.68 2%
>> 0:29.33 2%
>> 0:25.50 2%
>> 0:25.17 2%
>> 0:26.00 2%
>>
>> Testing features.cache-invalidation=on -> volume set: success
>> 0:25.63 2%
>> 0:25.46 3%
>> 0:25.55 3%
>> 0:26.13 2%
>> 0:25.13 2%
>>
>> Testing features.cache-invalidation=off -> volume set: success
>> 0:27.79 2%
>> 0:25.31 2%
>> 0:24.75 2%
>> 0:27.75 2%
>> 0:32.67 2%
>>
>> Testing features.cache-invalidation=on -> volume set: success
>> 0:26.34 2%
>> 0:26.60 2%
>> 0:26.32 2%
>> 0:31.05 3%
>> 0:33.58 2%
>>
>> Testing features.cache-invalidation-timeout=5 -> volume set: success
>> 0:25.89 3%
>> 0:25.07 3%
>> 0:25.49 2%
>> 0:25.44 3%
>> 0:25.47 2%
>>
>> Testing features.cache-invalidation-timeout=10 -> volume set: success
>> 0:32.34 2%
>> 0:28.27 3%
>> 0:27.41 2%
>> 0:25.17 2%
>> 0:25.56 2%
>>
>> Testing features.cache-invalidation-timeout=15 -> volume set: success
>> 0:27.79 2%
>> 0:30.58 2%
>> 0:31.63 2%
>> 0:26.71 2%
>> 0:29.69 2%
>>
>> Testing features.cache-invalidation-timeout=20 -> volume set: success
>> 0:26.62 2%
>> 0:23.76 3%
>> 0:24.17 3%
>> 0:24.99 2%
>> 0:25.31 2%
>>
>> Testing features.cache-invalidation-timeout=30 -> volume set: success
>> 0:25.75 3%
>> 0:27.34 2%
>> 0:28.38 2%
>> 0:27.15 2%
>> 0:30.91 2%
>>
>> Testing features.cache-invalidation-timeout=45 -> volume set: success
>> 0:24.77 2%
>> 0:24.81 2%
>> 0:28.22 2%
>> 0:32.56 2%
>> 0:40.81 1%
>>
>> Testing features.cache-invalidation-timeout=60 -> volume set: success
>> 0:31.97 2%
>> 0:27.14 2%
>> 0:24.53 3%
>> 0:25.48 3%
>> 0:25.27 3%
>>
>> Testing features.cache-invalidation-timeout=90 -> volume set: success
>> 0:25.24 3%
>> 0:26.83 3%
>> 0:32.74 2%
>> 0:26.82 3%
>> 0:27.69 2%
>>
>> Testing features.cache-invalidation-timeout=120 -> volume set: success
>> 0:24.50 3%
>> 0:25.43 3%
>> 0:26.21 3%
>> 0:30.09 2%
>> 0:32.24 2%
>>
>> Testing performance.cache-invalidation=on -> volume set: success
>> 0:28.77 3%
>> 0:37.16 2%
>> 0:42.56 1%
>> 0:26.21 2%
>> 0:27.91 3%
>>
>> Testing performance.cache-invalidation=off -> volume set: success
>> 0:31.05 2%
>> 0:34.40 2%
>> 0:33.90 2%
>> 0:33.12 2%
>> 0:27.84 3%
>>
>> Testing performance.cache-invalidation=on -> volume set: success
>> 0:27.17 3%
>> 0:26.73 3%
>> 0:24.61 3%
>> 0:26.36 3%
>> 0:39.90 2%
>>
>> Testing performance.cache-refresh-timeout=1 -> volume set: success
>> 0:26.83 3%
>> 0:36.17 2%
>> 0:31.37 2%
>> 0:26.12 3%
>> 0:26.46 2%
>>
>> Testing performance.cache-refresh-timeout=5 -> volume set: success
>> 0:24.95 3%
>> 0:27.33 3%
>> 0:30.77 2%
>> 0:26.77 3%
>> 0:34.62 2%
>>
>> Testing performance.cache-refresh-timeout=10 -> volume set: success
>> 0:29.36 2%
>> 0:26.04 3%
>> 0:26.21 3%
>> 0:29.47 3%
>> 0:28.67 3%
>>
>> Testing performance.cache-refresh-timeout=15 -> volume set: success
>> 0:29.26 3%
>> 0:27.31 3%
>> 0:27.15 3%
>> 0:29.74 3%
>> 0:32.70 2%
>>
>> Testing performance.cache-refresh-timeout=20 -> volume set: success
>> 0:27.99 3%
>> 0:30.13 2%
>> 0:29.39 3%
>> 0:28.59 3%
>> 0:31.30 3%
>>
>> Testing performance.cache-refresh-timeout=30 -> volume set: success
>> 0:27.47 3%
>> 0:26.68 3%
>> 0:27.09 3%
>> 0:27.08 3%
>> 0:31.72 3%
>>
>> Testing performance.cache-refresh-timeout=45 -> volume set: success
>> 0:28.83 3%
>> 0:29.21 3%
>> 0:38.75 2%
>> 0:26.15 3%
>> 0:26.76 3%
>>
>> Testing performance.cache-refresh-timeout=60 -> volume set: success
>> 0:29.64 2%
>> 0:29.71 2%
>> 0:31.41 2%
>> 0:28.35 3%
>> 0:26.26 3%
>>
>> Testing performance.client-io-threads=on -> volume set: success
>> 0:25.14 3%
>> 0:26.64 3%
>> 0:26.43 3%
>> 0:25.63 3%
>> 0:27.89 3%
>>
>> Testing performance.client-io-threads=off -> volume set: success
>> 0:31.37 2%
>> 0:33.65 2%
>> 0:28.85 3%
>> 0:28.27 3%
>> 0:26.90 3%
>>
>> Testing performance.client-io-threads=on -> volume set: success
>> 0:26.12 3%
>> 0:25.92 3%
>> 0:28.30 3%
>> 0:39.20 2%
>> 0:28.45 3%
>>
>> Testing performance.flush-behind=on -> volume set: success
>> 0:34.83 2%
>> 0:27.33 3%
>> 0:31.30 2%
>> 0:26.40 3%
>> 0:27.49 2%
>>
>> Testing performance.flush-behind=off -> volume set: success
>> 0:30.64 2%
>> 0:31.60 2%
>> 0:33.22 2%
>> 0:25.67 2%
>> 0:26.85 3%
>>
>> Testing performance.flush-behind=on -> volume set: success
>> 0:26.75 3%
>> 0:26.67 3%
>> 0:30.52 3%
>> 0:38.60 2%
>> 0:34.69 3%
>>
>> Testing performance.io-thread-count=6 -> volume set: success
>> 0:30.87 2%
>> 0:34.27 2%
>> 0:34.08 2%
>> 0:28.70 2%
>> 0:32.83 2%
>>
>> Testing performance.io-thread-count=7 -> volume set: success
>> 0:32.14 2%
>> 0:43.08 1%
>> 0:31.79 2%
>> 0:25.93 3%
>> 0:26.82 2%
>>
>> Testing performance.io-thread-count=8 -> volume set: success
>> 0:29.89 2%
>> 0:28.69 2%
>> 0:34.19 2%
>> 0:40.00 1%
>> 0:37.42 2%
>>
>> Testing performance.io-thread-count=9 -> volume set: success
>> 0:26.50 3%
>> 0:26.99 2%
>> 0:27.05 2%
>> 0:32.22 2%
>> 0:31.63 2%
>>
>> Testing performance.io-thread-count=10 -> volume set: success
>> 0:29.13 2%
>> 0:30.60 2%
>> 0:25.19 2%
>> 0:24.28 3%
>> 0:25.40 3%
>>
>> Testing performance.quick-read=on -> volume set: success
>> 0:26.40 3%
>> 0:27.37 2%
>> 0:28.03 2%
>> 0:28.07 2%
>> 0:33.47 2%
>>
>> Testing performance.quick-read=off -> volume set: success
>> 0:30.99 2%
>> 0:27.16 2%
>> 0:25.34 3%
>> 0:27.58 3%
>> 0:27.67 3%
>>
>> Testing performance.quick-read=on -> volume set: success
>> 0:27.37 2%
>> 0:26.99 3%
>> 0:29.78 2%
>> 0:26.06 2%
>> 0:25.67 2%
>>
>> Testing performance.read-ahead=enable -> volume set: success
>> 0:24.52 3%
>> 0:26.05 2%
>> 0:32.37 2%
>> 0:30.27 2%
>> 0:25.70 3%
>>
>> Testing performance.read-ahead=disable -> volume set: success
>> 0:26.98 3%
>> 0:25.54 3%
>> 0:25.55 3%
>> 0:30.78 2%
>> 0:28.07 2%
>>
>> Testing performance.read-ahead=enable -> volume set: success
>> 0:30.34 2%
>> 0:33.93 2%
>> 0:30.26 2%
>> 0:28.18 2%
>> 0:27.06 3%
>>
>> Testing performance.readdir-ahead=enable -> volume set: success
>> 0:26.31 3%
>> 0:25.64 3%
>> 0:31.97 2%
>> 0:30.75 2%
>> 0:26.10 3%
>>
>> Testing performance.readdir-ahead=disable -> volume set: success
>> 0:27.50 3%
>> 0:27.19 3%
>> 0:27.67 3%
>> 0:26.99 3%
>> 0:28.25 3%
>>
>> Testing performance.readdir-ahead=enable -> volume set: success
>> 0:34.94 2%
>> 0:30.43 2%
>> 0:27.14 3%
>> 0:27.81 2%
>> 0:26.36 3%
>>
>> Testing performance.stat-prefetch=on -> volume set: success
>> 0:28.55 3%
>> 0:27.10 2%
>> 0:26.64 3%
>> 0:30.84 3%
>> 0:35.45 2%
>>
>> Testing performance.stat-prefetch=off -> volume set: success
>> 0:29.12 3%
>> 0:36.54 2%
>> 0:26.32 3%
>> 0:29.02 3%
>> 0:27.16 3%
>>
>> Testing performance.stat-prefetch=on -> volume set: success
>> 0:31.17 2%
>> 0:34.64 2%
>> 0:26.50 3%
>> 0:30.39 2%
>> 0:27.12 3%
>>
>> Testing performance.write-behind=on -> volume set: success
>> 0:29.77 2%
>> 0:28.00 2%
>> 0:28.98 3%
>> 0:29.83 3%
>> 0:28.87 3%
>>
>> Testing performance.write-behind=off -> volume set: success
>> 1:11.95 1%
>> 1:06.03 1%
>> 1:07.70 1%
>> 1:30.21 1%
>> 1:08.47 1%
>>
>> Testing performance.write-behind=on -> volume set: success
>> 0:30.14 2%
>> 0:28.99 2%
>> 0:34.51 2%
>> 0:32.60 2%
>> 0:30.54 2%
>>
>> Testing performance.write-behind-window-size=2MB -> volume set: success
>> 0:24.74 3%
>> 0:25.71 2%
>> 0:27.49 2%
>> 0:25.78 3%
>> 0:26.35 3%
>>
>> Testing performance.write-behind-window-size=4MB -> volume set: success
>> 0:34.21 2%
>> 0:27.31 3%
>> 0:28.83 2%
>> 0:28.91 2%
>> 0:25.73 3%
>>
>> Testing performance.write-behind-window-size=8MB -> volume set: success
>> 0:24.41 3%
>> 0:26.23 2%
>> 0:25.20 3%
>> 0:26.00 2%
>> 0:27.04 2%
>>
>> Testing performance.write-behind-window-size=16MB -> volume set: success
>> 0:27.92 2%
>> 0:24.69 2%
>> 0:24.67 2%
>> 0:24.13 2%
>> 0:23.55 3%
>> _____________________________________________________________
>>
>> If someone has an idea to significantly improve performance, I'd be very
>> interested.
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
>