[Gluster-users] Poor performance compared to Netapp NAS with small files

Nicolas nicolas at furyweb.fr
Tue Sep 18 10:48:11 UTC 2018


Hello, 

I get very poor performance from GlusterFS 3.12.14 with small files, especially when working with git repositories.

Here is my configuration:
3 Gluster nodes (VMware guests v13 on vSphere 6.5, hosted on Gen8 blades attached to 3PAR SSD RAID5 LUNs), gluster volume type replica 3 with arbiter, SSL enabled, NFS disabled, heartbeat IP between the two main nodes.
Trusted storage pool on Debian 9 x64
Client on Debian 8 x64 with the native gluster client
Network bandwidth verified with iperf between the client and each storage node (~900Mb/s)
Disk bandwidth verified with dd on each storage node (~90MB/s)
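
Note that dd and iperf measure sequential throughput, whereas a git clone creates many small files. A minimal sketch of a small-file check is given here; the function name smallfile_bench, the 4 KiB file size and the default count are assumptions chosen for illustration, and it was not used to produce any of the results in this post.

```shell
#!/bin/bash
# Hypothetical helper: create COUNT small files under DIR, report the
# elapsed time and creation rate, then clean up.
# Assumes GNU date (for %N) and GNU head (for -c).
smallfile_bench() {
    local dir=$1 count=${2:-1000} start end i
    mkdir -p "$dir/smallfile_bench" || return 1
    start=$(date +%s.%N)
    for i in $(seq 1 "$count"); do
        # 4 KiB per file: an arbitrary "small file" size for this sketch
        head -c 4096 /dev/zero > "$dir/smallfile_bench/f$i"
    done
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" -v n="$count" 'BEGIN {
        d = e - s
        if (d <= 0) d = 0.001   # avoid dividing by zero on very fast runs
        printf "%d files in %.2fs (%.0f files/s)\n", n, d, n / d
    }'
    rm -rf "$dir/smallfile_bench"
}
```

It could be run once against the gluster mount and once against the NAS share, for example `smallfile_bench /mnt 1000` and `smallfile_bench /share/nas 1000`, to compare file creation rates rather than bandwidth.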
_____________________________________________________________ 
Volume Name: perftest 
Type: Replicate 
Volume ID: c60b3744-7955-4058-b276-69d7b97de8aa 
Status: Started 
Snapshot Count: 0 
Number of Bricks: 1 x (2 + 1) = 3 
Transport-type: tcp 
Bricks: 
Brick1: glusterVM1:/bricks/perftest/brick1/data 
Brick2: glusterVM2:/bricks/perftest/brick1/data 
Brick3: glusterVM3:/bricks/perftest/brick1/data (arbiter) 
Options Reconfigured: 
cluster.data-self-heal-algorithm: full 
features.trash: off 
diagnostics.client-log-level: ERROR 
ssl.cipher-list: HIGH:!SSLv2 
server.ssl: on 
client.ssl: on 
transport.address-family: inet 
nfs.disable: on 
_____________________________________________________________ 

I wrote a test script that tries several parameters, but every test gives similar results (except for performance.write-behind): about 30s on average for a git clone that takes only about 3s on the NAS volume.
_____________________________________________________________ 
#!/bin/bash

# On Ctrl-C: remove /mnt/project if present, unmount /mnt, then exit.
trap "[ -d /mnt/project ] && rm -rf /mnt/project; grep -q /mnt /proc/mounts && umount /mnt; exit" 2

LOG=$(mktemp)

# Set every option to its baseline value before measuring.
for params in \
    "server.event-threads 5" \
    "client.event-threads 5" \
    "cluster.lookup-optimize on" \
    "cluster.readdir-optimize on" \
    "features.cache-invalidation on" \
    "features.cache-invalidation-timeout 5" \
    "performance.cache-invalidation on" \
    "performance.cache-refresh-timeout 5" \
    "performance.client-io-threads on" \
    "performance.flush-behind on" \
    "performance.io-thread-count 6" \
    "performance.quick-read on" \
    "performance.read-ahead enable" \
    "performance.readdir-ahead enable" \
    "performance.stat-prefetch on" \
    "performance.write-behind on" \
    "performance.write-behind-window-size 2MB"; do
    set -- $params
    echo -n "gluster volume set perftest $1 $2 -> "
    ssh -n glusterVM3 "gluster volume set perftest $1 $2"
done

# Reference measurement on the NAS volume.
echo "NAS Reference"
sh -c "time -o $LOG -f '%E %P' git clone git@gitlab.local:grp/project.git /share/nas >/dev/null 2>&1"
cat $LOG
rm -rf /share/nas/project

# Each entry is an option name followed by the values to test, in order.
for params in \
    "server.event-threads 5 6 7" \
    "client.event-threads 5 6 7" \
    "cluster.lookup-optimize on off on" \
    "cluster.readdir-optimize on off on" \
    "features.cache-invalidation on off on" \
    "features.cache-invalidation-timeout 5 10 15 20 30 45 60 90 120" \
    "performance.cache-invalidation on off on" \
    "performance.cache-refresh-timeout 1 5 10 15 20 30 45 60" \
    "performance.client-io-threads on off on" \
    "performance.flush-behind on off on" \
    "performance.io-thread-count 6 7 8 9 10" \
    "performance.quick-read on off on" \
    "performance.read-ahead enable disable enable" \
    "performance.readdir-ahead enable disable enable" \
    "performance.stat-prefetch on off on" \
    "performance.write-behind on off on" \
    "performance.write-behind-window-size 2MB 4MB 8MB 16MB"; do
    set -- $params
    param=$1
    shift
    for value in "$@"; do
        echo -en "\nTesting $param=$value -> "
        #ssh -n glusterVM3 "yes | gluster volume stop perftest force; gluster volume set perftest $param $value; gluster volume start perftest"
        ssh -n glusterVM3 "gluster volume set perftest $param $value"
        if mount -t glusterfs -o defaults,direct-io-mode=enable glusterVMa:perftest /mnt; then
            # Five timed clones per value.
            for i in $(seq 1 5); do
                sh -c "time -o $LOG -f '%E %P' git clone git@gitlab.local:grp/project.git /mnt/bench >/dev/null 2>&1"
                cat $LOG
                rm -rf /mnt/bench
            done
            umount /mnt
        else
            echo "*** FAIL"
            exit
        fi
    done
done

rm $LOG
_____________________________________________________________ 

Output produced by the script:
_____________________________________________________________ 
gluster volume set perftest server.event-threads 5 -> volume set: success 
gluster volume set perftest client.event-threads 5 -> volume set: success 
gluster volume set perftest cluster.lookup-optimize on -> volume set: success 
gluster volume set perftest cluster.readdir-optimize on -> volume set: success 
gluster volume set perftest features.cache-invalidation on -> volume set: success 
gluster volume set perftest features.cache-invalidation-timeout 5 -> volume set: success 
gluster volume set perftest performance.cache-invalidation on -> volume set: success 
gluster volume set perftest performance.cache-refresh-timeout 5 -> volume set: success 
gluster volume set perftest performance.client-io-threads on -> volume set: success 
gluster volume set perftest performance.flush-behind on -> volume set: success 
gluster volume set perftest performance.io-thread-count 6 -> volume set: success 
gluster volume set perftest performance.quick-read on -> volume set: success 
gluster volume set perftest performance.read-ahead enable -> volume set: success 
gluster volume set perftest performance.readdir-ahead enable -> volume set: success 
gluster volume set perftest performance.stat-prefetch on -> volume set: success 
gluster volume set perftest performance.write-behind on -> volume set: success 
gluster volume set perftest performance.write-behind-window-size 2MB -> volume set: success 
NAS Reference 
0:03.59 23% 

Testing server.event-threads=5 -> volume set: success 
0:29.45 2% 
0:27.07 2% 
0:24.89 2% 
0:24.93 2% 
0:24.64 3% 

Testing server.event-threads=6 -> volume set: success 
0:24.14 3% 
0:24.69 2% 
0:26.81 2% 
0:27.38 2% 
0:25.59 2% 

Testing server.event-threads=7 -> volume set: success 
0:25.34 2% 
0:24.14 2% 
0:25.92 2% 
0:23.62 2% 
0:24.76 2% 

Testing client.event-threads=5 -> volume set: success 
0:24.60 3% 
0:29.40 2% 
0:34.78 2% 
0:33.99 2% 
0:33.54 2% 

Testing client.event-threads=6 -> volume set: success 
0:23.82 3% 
0:24.64 2% 
0:26.10 3% 
0:24.56 2% 
0:28.21 2% 

Testing client.event-threads=7 -> volume set: success 
0:28.15 2% 
0:35.19 2% 
0:24.03 2% 
0:24.79 2% 
0:26.55 2% 

Testing cluster.lookup-optimize=on -> volume set: success 
0:30.67 2% 
0:30.49 2% 
0:31.52 2% 
0:33.13 2% 
0:32.41 2% 

Testing cluster.lookup-optimize=off -> volume set: success 
0:25.82 2% 
0:25.59 2% 
0:28.24 2% 
0:31.90 2% 
0:33.52 2% 

Testing cluster.lookup-optimize=on -> volume set: success 
0:29.33 2% 
0:24.82 2% 
0:25.93 2% 
0:25.36 2% 
0:24.89 2% 

Testing cluster.readdir-optimize=on -> volume set: success 
0:24.98 2% 
0:25.03 2% 
0:27.47 2% 
0:28.13 2% 
0:27.41 2% 

Testing cluster.readdir-optimize=off -> volume set: success 
0:32.54 2% 
0:32.50 2% 
0:25.56 2% 
0:25.21 2% 
0:27.39 2% 

Testing cluster.readdir-optimize=on -> volume set: success 
0:27.68 2% 
0:29.33 2% 
0:25.50 2% 
0:25.17 2% 
0:26.00 2% 

Testing features.cache-invalidation=on -> volume set: success 
0:25.63 2% 
0:25.46 3% 
0:25.55 3% 
0:26.13 2% 
0:25.13 2% 

Testing features.cache-invalidation=off -> volume set: success 
0:27.79 2% 
0:25.31 2% 
0:24.75 2% 
0:27.75 2% 
0:32.67 2% 

Testing features.cache-invalidation=on -> volume set: success 
0:26.34 2% 
0:26.60 2% 
0:26.32 2% 
0:31.05 3% 
0:33.58 2% 

Testing features.cache-invalidation-timeout=5 -> volume set: success 
0:25.89 3% 
0:25.07 3% 
0:25.49 2% 
0:25.44 3% 
0:25.47 2% 

Testing features.cache-invalidation-timeout=10 -> volume set: success 
0:32.34 2% 
0:28.27 3% 
0:27.41 2% 
0:25.17 2% 
0:25.56 2% 

Testing features.cache-invalidation-timeout=15 -> volume set: success 
0:27.79 2% 
0:30.58 2% 
0:31.63 2% 
0:26.71 2% 
0:29.69 2% 

Testing features.cache-invalidation-timeout=20 -> volume set: success 
0:26.62 2% 
0:23.76 3% 
0:24.17 3% 
0:24.99 2% 
0:25.31 2% 

Testing features.cache-invalidation-timeout=30 -> volume set: success 
0:25.75 3% 
0:27.34 2% 
0:28.38 2% 
0:27.15 2% 
0:30.91 2% 

Testing features.cache-invalidation-timeout=45 -> volume set: success 
0:24.77 2% 
0:24.81 2% 
0:28.22 2% 
0:32.56 2% 
0:40.81 1% 

Testing features.cache-invalidation-timeout=60 -> volume set: success 
0:31.97 2% 
0:27.14 2% 
0:24.53 3% 
0:25.48 3% 
0:25.27 3% 

Testing features.cache-invalidation-timeout=90 -> volume set: success 
0:25.24 3% 
0:26.83 3% 
0:32.74 2% 
0:26.82 3% 
0:27.69 2% 

Testing features.cache-invalidation-timeout=120 -> volume set: success 
0:24.50 3% 
0:25.43 3% 
0:26.21 3% 
0:30.09 2% 
0:32.24 2% 

Testing performance.cache-invalidation=on -> volume set: success 
0:28.77 3% 
0:37.16 2% 
0:42.56 1% 
0:26.21 2% 
0:27.91 3% 

Testing performance.cache-invalidation=off -> volume set: success 
0:31.05 2% 
0:34.40 2% 
0:33.90 2% 
0:33.12 2% 
0:27.84 3% 

Testing performance.cache-invalidation=on -> volume set: success 
0:27.17 3% 
0:26.73 3% 
0:24.61 3% 
0:26.36 3% 
0:39.90 2% 

Testing performance.cache-refresh-timeout=1 -> volume set: success 
0:26.83 3% 
0:36.17 2% 
0:31.37 2% 
0:26.12 3% 
0:26.46 2% 

Testing performance.cache-refresh-timeout=5 -> volume set: success 
0:24.95 3% 
0:27.33 3% 
0:30.77 2% 
0:26.77 3% 
0:34.62 2% 

Testing performance.cache-refresh-timeout=10 -> volume set: success 
0:29.36 2% 
0:26.04 3% 
0:26.21 3% 
0:29.47 3% 
0:28.67 3% 

Testing performance.cache-refresh-timeout=15 -> volume set: success 
0:29.26 3% 
0:27.31 3% 
0:27.15 3% 
0:29.74 3% 
0:32.70 2% 

Testing performance.cache-refresh-timeout=20 -> volume set: success 
0:27.99 3% 
0:30.13 2% 
0:29.39 3% 
0:28.59 3% 
0:31.30 3% 

Testing performance.cache-refresh-timeout=30 -> volume set: success 
0:27.47 3% 
0:26.68 3% 
0:27.09 3% 
0:27.08 3% 
0:31.72 3% 

Testing performance.cache-refresh-timeout=45 -> volume set: success 
0:28.83 3% 
0:29.21 3% 
0:38.75 2% 
0:26.15 3% 
0:26.76 3% 

Testing performance.cache-refresh-timeout=60 -> volume set: success 
0:29.64 2% 
0:29.71 2% 
0:31.41 2% 
0:28.35 3% 
0:26.26 3% 

Testing performance.client-io-threads=on -> volume set: success 
0:25.14 3% 
0:26.64 3% 
0:26.43 3% 
0:25.63 3% 
0:27.89 3% 

Testing performance.client-io-threads=off -> volume set: success 
0:31.37 2% 
0:33.65 2% 
0:28.85 3% 
0:28.27 3% 
0:26.90 3% 

Testing performance.client-io-threads=on -> volume set: success 
0:26.12 3% 
0:25.92 3% 
0:28.30 3% 
0:39.20 2% 
0:28.45 3% 

Testing performance.flush-behind=on -> volume set: success 
0:34.83 2% 
0:27.33 3% 
0:31.30 2% 
0:26.40 3% 
0:27.49 2% 

Testing performance.flush-behind=off -> volume set: success 
0:30.64 2% 
0:31.60 2% 
0:33.22 2% 
0:25.67 2% 
0:26.85 3% 

Testing performance.flush-behind=on -> volume set: success 
0:26.75 3% 
0:26.67 3% 
0:30.52 3% 
0:38.60 2% 
0:34.69 3% 

Testing performance.io-thread-count=6 -> volume set: success 
0:30.87 2% 
0:34.27 2% 
0:34.08 2% 
0:28.70 2% 
0:32.83 2% 

Testing performance.io-thread-count=7 -> volume set: success 
0:32.14 2% 
0:43.08 1% 
0:31.79 2% 
0:25.93 3% 
0:26.82 2% 

Testing performance.io-thread-count=8 -> volume set: success 
0:29.89 2% 
0:28.69 2% 
0:34.19 2% 
0:40.00 1% 
0:37.42 2% 

Testing performance.io-thread-count=9 -> volume set: success 
0:26.50 3% 
0:26.99 2% 
0:27.05 2% 
0:32.22 2% 
0:31.63 2% 

Testing performance.io-thread-count=10 -> volume set: success 
0:29.13 2% 
0:30.60 2% 
0:25.19 2% 
0:24.28 3% 
0:25.40 3% 

Testing performance.quick-read=on -> volume set: success 
0:26.40 3% 
0:27.37 2% 
0:28.03 2% 
0:28.07 2% 
0:33.47 2% 

Testing performance.quick-read=off -> volume set: success 
0:30.99 2% 
0:27.16 2% 
0:25.34 3% 
0:27.58 3% 
0:27.67 3% 

Testing performance.quick-read=on -> volume set: success 
0:27.37 2% 
0:26.99 3% 
0:29.78 2% 
0:26.06 2% 
0:25.67 2% 

Testing performance.read-ahead=enable -> volume set: success 
0:24.52 3% 
0:26.05 2% 
0:32.37 2% 
0:30.27 2% 
0:25.70 3% 

Testing performance.read-ahead=disable -> volume set: success 
0:26.98 3% 
0:25.54 3% 
0:25.55 3% 
0:30.78 2% 
0:28.07 2% 

Testing performance.read-ahead=enable -> volume set: success 
0:30.34 2% 
0:33.93 2% 
0:30.26 2% 
0:28.18 2% 
0:27.06 3% 

Testing performance.readdir-ahead=enable -> volume set: success 
0:26.31 3% 
0:25.64 3% 
0:31.97 2% 
0:30.75 2% 
0:26.10 3% 

Testing performance.readdir-ahead=disable -> volume set: success 
0:27.50 3% 
0:27.19 3% 
0:27.67 3% 
0:26.99 3% 
0:28.25 3% 

Testing performance.readdir-ahead=enable -> volume set: success 
0:34.94 2% 
0:30.43 2% 
0:27.14 3% 
0:27.81 2% 
0:26.36 3% 

Testing performance.stat-prefetch=on -> volume set: success 
0:28.55 3% 
0:27.10 2% 
0:26.64 3% 
0:30.84 3% 
0:35.45 2% 

Testing performance.stat-prefetch=off -> volume set: success 
0:29.12 3% 
0:36.54 2% 
0:26.32 3% 
0:29.02 3% 
0:27.16 3% 

Testing performance.stat-prefetch=on -> volume set: success 
0:31.17 2% 
0:34.64 2% 
0:26.50 3% 
0:30.39 2% 
0:27.12 3% 

Testing performance.write-behind=on -> volume set: success 
0:29.77 2% 
0:28.00 2% 
0:28.98 3% 
0:29.83 3% 
0:28.87 3% 

Testing performance.write-behind=off -> volume set: success 
1:11.95 1% 
1:06.03 1% 
1:07.70 1% 
1:30.21 1% 
1:08.47 1% 

Testing performance.write-behind=on -> volume set: success 
0:30.14 2% 
0:28.99 2% 
0:34.51 2% 
0:32.60 2% 
0:30.54 2% 

Testing performance.write-behind-window-size=2MB -> volume set: success 
0:24.74 3% 
0:25.71 2% 
0:27.49 2% 
0:25.78 3% 
0:26.35 3% 

Testing performance.write-behind-window-size=4MB -> volume set: success 
0:34.21 2% 
0:27.31 3% 
0:28.83 2% 
0:28.91 2% 
0:25.73 3% 

Testing performance.write-behind-window-size=8MB -> volume set: success 
0:24.41 3% 
0:26.23 2% 
0:25.20 3% 
0:26.00 2% 
0:27.04 2% 

Testing performance.write-behind-window-size=16MB -> volume set: success 
0:27.92 2% 
0:24.69 2% 
0:24.67 2% 
0:24.13 2% 
0:23.55 3% 
_____________________________________________________________ 
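
To make the runs easier to compare, the output above can be reduced to one mean per tested value. The following is only a sketch: the function name summarize_runs and the log file name in the usage note are assumptions, and it relies on the output format shown above ("Testing param=value -> ..." followed by "M:SS.ss CPU%" lines).

```shell
#!/bin/bash
# Hypothetical helper: read the benchmark output on stdin and print the
# mean clone time in seconds for each "NAS Reference" / "Testing ..." block.
summarize_runs() {
    awk '
        # Print the mean of the block collected so far, then reset.
        function report() {
            if (n > 0) printf "%-50s %6.2fs over %d runs\n", label, sum / n, n
            sum = 0; n = 0
        }
        /^NAS Reference/ { report(); label = "NAS Reference"; next }
        /^Testing /      { report(); label = $2; next }
        # Timing lines look like "0:29.45 2%" (minutes:seconds, CPU share).
        /^[0-9]+:[0-9]+\.[0-9]+ / {
            split($1, t, ":")
            sum += t[1] * 60 + t[2]
            n++
        }
        END { report() }
    '
}
```

Assuming the script output was saved to a file such as bench.log, it would be used as `summarize_runs < bench.log`. Values tested twice (for example "on off on") appear as two separate rows, which also gives a rough idea of run-to-run noise.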

If anyone has an idea for significantly improving performance, I would be very interested.