[Gluster-users] Gluster Performance in an Ovirt Scenario.

Thorsten Schade Thorsten.Schade at trinovis.com
Mon Apr 3 14:33:58 UTC 2017


I run a production oVirt cluster here and am trying to understand a performance issue.

For background: I started with oVirt 3.6 and Gluster 3.6, and the test results have been
nearly the same across versions.

What I don't understand: when an oVirt server writes to 4 (of 6) nodes in a disperse scenario,
this should be close to the performance of an NFS mount - but it isn't!!

All machines (Gluster and oVirt) run CentOS 7, fully updated with the newest mainline (ML) kernel.
The storage backbone is a 10 GbE network.

Gluster version 3.8.10 (6 node servers, 16 GB RAM, 4 CPUs)
oVirt version 4.1 (3 node servers, 128 GB RAM, 8 CPUs)


Test 1:

The Gluster setup: 6 machines, each with a 4 TB RED 5400 rpm data disk.
Simple single-disk performance:
Write:  172 MB/s

I created a disperse volume in the supported 4 + 2 configuration
and applied the "virt" option group.
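For reference, a volume like this would have been created with something along these lines (hostnames and brick paths taken from the volume info below; the exact commands I ran may have differed slightly):

```shell
# Create a dispersed volume with 6 bricks, 2 of them redundancy (i.e. 4 + 2)
gluster volume create vol01 disperse 6 redundancy 2 \
    vmw-lix-134:/data/brick1-1/brick01 vmw-lix-135:/data/brick1-1/brick01 \
    vmw-lix-136:/data/brick1-1/brick01 vmw-lix-137:/data/brick1-1/brick01 \
    vmw-lix-138:/data/brick1-1/brick01 vmw-lix-139:/data/brick1-1/brick01

# Apply the oVirt-recommended option group ("group virt"), then start the volume
gluster volume set vol01 group virt
gluster volume start vol01
```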


Volume Name: vol01
Type: Disperse
Volume ID: ebb831b9-d65d-4583-98d7-f0b262cf124a
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: vmw-lix-135:/data/brick1-1/brick01
Brick2: vmw-lix-136:/data/brick1-1/brick01
Brick3: vmw-lix-137:/data/brick1-1/brick01
Brick4: vmw-lix-138:/data/brick1-1/brick01
Brick5: vmw-lix-139:/data/brick1-1/brick01
Brick6: vmw-lix-134:/data/brick1-1/brick01
Options Reconfigured:
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: enable
performance.low-prio-threads: 32
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on


The Gluster volume has virtual machines running on it, with very low usage.

Performance test with dd: 10 GB read to /dev/null and write from /dev/zero, run on the oVirt
node servers against the Gluster mount.

1 node, dd 10 GB, multiple runs:
write: 80-95 MB/s   (slow)
read: 70-80 MB/s    (a second read of the same dd file reaches up to 800 MB/s - cache?)

All 3 nodes running dd concurrently:
write: 80-90 MB/s   (like a single-node write; slow per node, but ~240 MB/s aggregate into the Gluster volume)
read: 40-55 MB/s    (poor)
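The dd invocations were along these lines (the file path on the mount is hypothetical; conv=fdatasync forces the write to storage, and dropping the page cache before a read avoids the ~800 MB/s cached-read effect seen above):

```shell
# 10 GB sequential write to the Gluster fuse mount (path is an example)
dd if=/dev/zero of=/mnt/gluster-vol01/testfile bs=1M count=10240 conv=fdatasync

# Drop the page cache first (as root), otherwise a repeated read of the
# same file is served from RAM instead of the volume
echo 3 > /proc/sys/vm/drop_caches
dd if=/mnt/gluster-vol01/testfile of=/dev/null bs=1M
```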

My conclusion:
single-stream write performance is 80-90 MB/s, and read is even slower at only ~70 MB/s.
Concurrent writes behave like single writes, but concurrent reads are poor.
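A back-of-envelope check (my own arithmetic, assuming a 4+2 disperse volume encodes each client write into 6 fragments of one quarter the original size) suggests the 10 GbE network is not the limit:

```python
# Write amplification on the wire for a 4+2 dispersed (erasure-coded) volume:
# each write of size S becomes 4 data fragments plus 2 redundancy fragments,
# each of size S/4, so 6 * S/4 = 1.5 * S leaves the client.
data_bricks, redundancy_bricks = 4, 2
amplification = (data_bricks + redundancy_bricks) / data_bricks
print(amplification)          # 1.5

# At ~90 MB/s of client throughput, traffic on the wire is only:
print(90 * amplification)     # 135.0 MB/s - far below 10 GbE (~1250 MB/s)
```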

Test 2:

Suspecting a problem with my network or the servers, I put all 6 hard disks into one server
and created 2 partitions per 4 TB disk.

Then I prepared two storage domains for the oVirt cluster:
the first 6 disk partitions were combined with mdadm into a RAID 5 and mounted as an NFS data volume in oVirt;
the other 6 disk partitions were used for a 4+2 disperse volume.
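The RAID/NFS side of this setup would look roughly like the following (device names, mount point, and export options are assumptions on my part; oVirt expects NFS exports owned by vdsm/kvm, uid/gid 36):

```shell
# RAID 5 over the first partition of each of the six disks (device names assumed)
mdadm --create /dev/md0 --level=5 --raid-devices=6 /dev/sd[b-g]1

# Filesystem and mount point for the NFS data domain
mkfs.xfs /dev/md0
mkdir -p /export/nfs-data
mount /dev/md0 /export/nfs-data
chown 36:36 /export/nfs-data

# Export it over NFS for the oVirt nodes
echo '/export/nfs-data *(rw,anonuid=36,anongid=36,all_squash)' >> /etc/exports
exportfs -ra
```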

The disperse Gluster volume performed like before:
write: 80 MB/s
read: 70 MB/s

But the NFS mount from the mdadm RAID:

single node dd:
write: 290 MB/s
read: 700 MB/s

3 nodes running dd concurrently against the NFS mount:
write: 125-140 MB/s      (~400 MB/s aggregate write to the mdadm RAID)
read: 400-700 MB/s       (~1600 MB/s aggregate from the mdadm RAID, near 10 GbE network speed)

On the same server with the same disks, NFS has a real performance advantage!!!

The CPU was not a bottleneck during Gluster operation; I watched it with htop while the tests ran.


Can someone explain why the Gluster volume comes nowhere near the performance of the NFS mount
on the mdadm RAID 5, or of the 6-node Gluster test?

Thanks

Thorsten


