[Gluster-users] Gluster Performance in an Ovirt Scenario.

Sat Apr 8 17:32:18 UTC 2017

Sounds like you’re testing things correctly then; I may have misunderstood your concerns. If it’s just about the speed you’re getting, fuse mounts are well known to be slower than native disk. This is why we’re all waiting for oVirt to integrate native libgfapi mounts.

A couple things about your specific setup I’d look at:
- did you tune your TCP stack? Most things benefit from it, even on 10G low-BDP links.
- do your compute nodes have 10G links? If not, you’re possibly causing bottlenecks switching down to them.
- are you using large MTUs?
- RAID 5 on the gluster servers is slower than RAID 0 or 10 would be. That is fine if you’re trying to maximize redundancy/reliability, but consider a disk layout tuned for speed on the servers: the gluster layer already gives you redundancy, so testing other layouts may be worthwhile.
- it looks like you’re trying to maximize space with your 1 x (4+2) layout; try a 2 x (2+1) or just a distributed-replicated config and see how it performs.
- Ganesha NFS would get you more of the benefits of libgfapi; worth testing.
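
As a rough illustration of the layout trade-off in the list above, the following Python sketch compares usable capacity and client-side write traffic for three six-brick layouts. It is a simplified model, not a measurement: it assumes the fuse client sends every fragment or copy itself, treats replica N as "1 data + (N-1) redundancy", and takes 1250 MB/s as the raw rate of a 10G link with no protocol overhead. The `Layout` class, `client_write_ceiling` helper, and the "2 x 3 replica" entry are hypothetical names and choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    name: str
    subvolumes: int   # number of distribute subvolumes
    data: int         # data fragments per subvolume (1 for replica)
    redundancy: int   # redundancy fragments (extra copies for replica)

    @property
    def bricks(self) -> int:
        return self.subvolumes * (self.data + self.redundancy)

    @property
    def usable_fraction(self) -> float:
        # Share of raw brick capacity that holds user data.
        return self.data / (self.data + self.redundancy)

    @property
    def write_amplification(self) -> float:
        # Bytes the client puts on the wire per byte written, assuming
        # the client itself writes to every brick of the subvolume.
        return (self.data + self.redundancy) / self.data


def client_write_ceiling(layout: Layout, link_mb_s: float = 1250.0) -> float:
    """Upper bound on client write speed imposed by its own network link."""
    return link_mb_s / layout.write_amplification


LAYOUTS = [
    Layout("1 x (4+2) disperse", subvolumes=1, data=4, redundancy=2),
    Layout("2 x (2+1) disperse", subvolumes=2, data=2, redundancy=1),
    Layout("2 x 3 replica", subvolumes=2, data=1, redundancy=2),
]

if __name__ == "__main__":
    for lay in LAYOUTS:
        print(
            f"{lay.name:20s} bricks={lay.bricks} "
            f"usable={lay.usable_fraction:.2f} "
            f"amplification={lay.write_amplification:.2f} "
            f"ceiling={client_write_ceiling(lay):.0f} MB/s"
        )
```

Under these assumptions both disperse layouts expose two thirds of the raw capacity and send 1.5 bytes per byte written, while the replica layout exposes one third and sends 3 bytes. The difference between 1 x (4+2) and 2 x (2+1) is therefore not space but how many bricks each write has to touch, which is why it is worth benchmarking rather than predicting.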

For comparison to your numbers, I tried your dd tests on my production cluster, with ~60 active VMs on a replica volume over 3 servers with 10G, backed by an 8-disk zfs stripe on each server. I got ~110Mbps writes and ~500Mbps reads out of it on the gluster fuse mounts and VMs. Unfortunately, dd’ing zeros isn’t a valid test on the base brick directly, as it’s running compression, so it showed 3Gbps writes and 5Gbps reads. Realistically, the volume should be capable of around 600-800Mbps writes and 2Gbps reads, so it seems like we’re in the same ballpark for writes.
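
Because zeros compress to almost nothing, a write test against a compressing brick needs incompressible input. The Python sketch below is one minimal way to do that, assuming a target path you choose yourself; the function name `write_throughput` and the sizes used are hypothetical, and it is a stand-in for dd rather than a replacement for a proper benchmark tool.

```python
import os
import tempfile
import time


def write_throughput(path: str, total_bytes: int,
                     block_size: int = 1024 * 1024) -> float:
    """Write total_bytes of random data to path; return MB/s (10^6 bytes)."""
    # One random block is generated up front and reused, so the timing
    # excludes random-number generation. This defeats compression but
    # not deduplication, should the filesystem have it enabled.
    block = os.urandom(block_size)
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # count the time needed to reach stable storage
    elapsed = max(time.perf_counter() - start, 1e-9)
    return written / elapsed / 1e6


if __name__ == "__main__":
    # Small demo against a temporary directory; point `target` at the
    # brick or mount under test and raise the size for a real run.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "throughput.bin")
        rate = write_throughput(target, 64 * 1024 * 1024)
        print(f"wrote 64 MiB at {rate:.1f} MB/s")
```

For numbers comparable to the 10GB dd runs in this thread, the file size would need to be well above the RAM of the machine under test so that caching does not dominate.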

> On Apr 4, 2017, at 2:07 AM, Thorsten Schade <Thorsten.Schade at trinovis.com> wrote:
> 
> Hello Darrell,
> 
> I use the GlusterFS mount in oVirt.
> 
> During test 1 the oVirt cluster mounts the storage as GlusterFS;
> it looks like this on the oVirt node:
> 
> 10.2.9.135:vol01   3% /rhev/data-center/mnt/glusterSD/10.2.9.135:vol01
> 
> I ran dd directly under the mount point directory, which means test 1 is a pure glusterfs
> performance test, and I can see with iptraf-ng or nload that the oVirt node uses 6 network
> connections to write the data to the disperse volume.
> 
> In test 2, the NFS export of the mdadm RAID 5 is plain CentOS 7 NFS.
> 
> Thorsten
> 
> -----Original Message-----
> From: Darrell Budic [mailto:budic at onholyground.com] 
> Sent: Monday, April 3, 2017 20:58
> To: Thorsten Schade <Thorsten.Schade at trinovis.com>
> Cc: gluster-users at gluster.org
> Subject: Re: [Gluster-users] Gluster Performance in an Ovirt Scenario.
> 
> You didn’t list your mount type for test 1, but it sounds like you’re NFS mounting your storage. Is this a “standard” OS-level NFS server, or a Ganesha-based NFS server?
> 
> If you’re using “normal” NFS, your nodes write to one of your gluster servers over the NFS mount, and that gluster server writes the data out to all the other servers as needed before acknowledging the write as complete, limiting your total throughput. The same is true for reads: the server you’re talking to marshals the response from all the servers before sending it along to the client.
> 
> If you use Ganesha, it may be able to read/write directly from/to all your gluster servers, which should improve your performance.
> 
> Since you’re using oVirt, I would recommend you use gluster-mounted volumes instead of NFS mounts. Even with the fuse mounts currently supported, I get better behavior because the nodes then write to all the gluster servers at the same time, which reduces the wait on write completions and improves throughput over the NFS case. You’re then also ready for native libgfapi support when oVirt enables it, something I’m looking forward to myself.
> 
> I also got some performance improvement by setting higher numbers for server.event-threads and client.event-threads on my volumes. This is more setup & load dependent, so play around with it some.
> 
>  -Darrell
> 
>> On Apr 3, 2017, at 9:33 AM, Thorsten Schade <Thorsten.Schade at trinovis.com> wrote:
>> 
>> I run a production oVirt cluster and am trying to understand my performance issue.
>> 
>> For background, I started with oVirt 3.6 and gluster 3.6, and 
>> the test results have been nearly the same across versions.
>> 
>> What I don't understand: if an oVirt server writes in a 
>> disperse scenario to 4 (6) nodes, this should come close to the performance of an NFS mount, but it doesn't!!
>> 
>> All machines (Gluster and oVirt) run CentOS 7, fully updated with 
>> the newest ML kernel. The storage network backbone is a 10Gb network.
>> 
>> Gluster version 3.8.10 (6 node servers, 16GB RAM, 4 CPUs)
>> oVirt version 4.1 (3 node servers, 128GB RAM, 8 CPUs)
>> 
>> 
>> Test 1:
>> 
>> The Gluster cluster: 6 machines, each with one 4TB Red 5400rpm data disk. 
>> Simple single-disk performance:
>> Write:  172 MB/s
>> 
>> I created a disperse volume with the supported 4 + 2 configuration and 
>> applied "group virt".
>> 
>> 
>> Volume Name: vol01
>> Type: Disperse
>> Volume ID: ebb831b9-d65d-4583-98d7-f0b262cf124a
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (4 + 2) = 6
>> Transport-type: tcp
>> Bricks:
>> Brick1: vmw-lix-135:/data/brick1-1/brick01
>> Brick2: vmw-lix-136:/data/brick1-1/brick01
>> Brick3: vmw-lix-137:/data/brick1-1/brick01
>> Brick4: vmw-lix-138:/data/brick1-1/brick01
>> Brick5: vmw-lix-139:/data/brick1-1/brick01
>> Brick6: vmw-lix-134:/data/brick1-1/brick01
>> Options Reconfigured:
>> user.cifs: off
>> features.shard: on
>> cluster.shd-wait-qlength: 10000
>> cluster.shd-max-threads: 8
>> cluster.locking-scheme: granular
>> cluster.data-self-heal-algorithm: full
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> cluster.eager-lock: enable
>> network.remote-dio: enable
>> performance.low-prio-threads: 32
>> performance.stat-prefetch: off
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> transport.address-family: inet
>> performance.readdir-ahead: on
>> nfs.disable: on
>> 
>> 
>> The Gluster volume has running virtual machines on it, with very low usage.
>> 
>> Performance test with dd: 10GB read to /dev/null and write from 
>> /dev/zero, run on the oVirt node servers against the gluster mount.
>> 
>> 1 node, dd 10GB, multiple runs:
>> write: 80-95 MB/s   (slow)
>> read: 70-80 MB/s      (a second read of the same dd file can reach up to 800 MB/s - cache?)
>> 
>> All 3 nodes running dd concurrently:
>> write: 80-90 MB/s   (like a single-node write, slow per node; about 240 MB/s combined input into the gluster)
>> read: 40-55 MB/s   (poor)
>> 
>> My conclusion:
>> Single-node write performance is 80-90 MB/s, and reads are slower at only 70 MB/s.
>> Concurrent writes perform like a single write, but concurrent reads are poor.
>> 
>> Test 2:
>> 
>> I thought I had a problem in my network or with the servers, so I installed all 
>> 6 hard disks in one server and created 2 partitions per 4TB disk.
>> 
>> Then I prepared two storages for the oVirt cluster:
>> the first 6 disk partitions as an mdadm RAID 5, mounted as an NFS 
>> data volume in oVirt, and the other 6 disk partitions as a 4+2 
>> disperse volume.
>> 
>> The disperse gluster volume gets the same performance as before:
>> write: 80MB/s
>> read: 70 MB/s
>> 
>> But the NFS mount from the mdadm RAID:
>> 
>> single node dd:
>> write: 290 MB/s
>> read: 700 MB/s
>> 
>> 3 nodes running dd concurrently against the NFS mount:
>> write: 125-140 MB/s      (~400 MB/s total written to mdadm)
>> read: 400-700 MB/s       (~1600 MB/s from mdadm, near 10Gb network speed)
>> 
>> On the same server with the same disks, NFS has a real performance advantage!!!
>> 
>> The CPU was not a bottleneck during gluster operation; I watched it with htop during the tests.
>> 
>> 
>> Can someone explain why the gluster volume gets nowhere near the performance 
>> of the NFS mount on the mdadm RAID 5, either here or in the 6-node gluster test?
>> 
>> Thanks
>> 
>> Thorsten