[Gluster-users] Slow NFS performance with Replication

Wed Jul 2 19:26:41 UTC 2014

----- Original Message -----
> From: "Brent D. Kolasinski" <bkolasinski at anl.gov>
> To: gluster-users at gluster.org
> Sent: Monday, June 30, 2014 1:44:49 PM
> Subject: [Gluster-users] Slow NFS performance with Replication
> 
> Hi all,
> 
> I have been experimenting with using gluster as a VM storage backend on
> VMWare ESXi.  We are using Gluster NFS to share out storage to VMware
> ESXi.  Our current setup includes 2 storage servers, in a 1x2 replication
> pool, each with approximately 16TB of storage shared via gluster.  The NFS
> servers are connected via 10Gbps NICs to the ESXi systems, and we've
> dedicated a cross connected link for gluster replication between the
> storage servers.
> 
> After some initial testing, we are only getting approximately 160-200MBps
> on write speeds.  If we drop a brick from the volume, so replication does
> not take place, we start seeing writes on the order of 500-600MBps.  We
> would expect the writes to be in the 500MBps range with replication turned
> on, however we are seeing less than half of that over 10Gbps links.

For sequential single threaded writes over NFS on my 10G setup I get:

# dd if=/dev/zero of=./test.txt bs=1024k count=1000 conv=sync
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 2.32838 s, 450 MB/s

For glusterfs mounts I get ~600 MB/s.  With replication you cut your bandwidth in half, as each write has to go to both bricks.  Try following the recommendations in:

http://rhsummit.files.wordpress.com/2012/03/england-rhs-performance.pdf

That should get you closer to 450 MB/s.
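
As a rough sanity check on the halving argument, here is a back-of-the-envelope sketch (not a measurement).  It assumes that whichever process does the replication has to push one full copy of the data per replica through a single link, and it ignores protocol overhead.  The function name is made up for illustration.

```shell
#!/bin/sh
# Rough write ceiling in MB/s for a replicated volume.
# Assumption: one full copy of the data per replica goes through a
# single link; TCP/NFS/gluster overhead is ignored.
replica_write_ceiling() {
    link_gbps=$1   # link speed in Gbit/s, e.g. 10
    replicas=$2    # replica count, e.g. 2 for a 1 x 2 volume
    # 1 Gbit/s = 1000 Mbit/s = 125 MB/s (decimal units)
    echo $(( link_gbps * 1000 / 8 / replicas ))
}

replica_write_ceiling 10 2   # prints 625
```

Under those assumptions a 10G link tops out around 625 MB/s for replica 2, which is in the same range as the ~600 MB/s I see on glusterfs mounts.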

> We also notice that write heavy VMs start IO waiting quite a bit with
> replication turned on.
> 
> We have increased thread counts with the performance.* variables, but that
> has not improved our situation.  When taking VMWare out of the equation
> (by mounting directly with an NFS client on a different server), we see
> the same results.
> 
> Is this normal speed for 10Gbps interconnects with a replicate volume?
> 
> Here is our current gluster config.  We are running gluster 3.5.0:
> 
> Volume Name: gvol0
> Type: Replicate
> Volume ID: e88afc1c-50d3-4e2e-b540-4c2979219d12
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: nfs0g:/data/brick0/gvol0
> Brick2: nfs1g:/data/brick0/gvol0
> Options Reconfigured:
> nfs.disable: 0
> network.ping-timeout: 3
> 
> nfs.drc off
> 
> ----------
> Brent Kolasinski
> Computer Systems Engineer
> 
> Argonne National Laboratory
> Decision and Information Sciences
> ARM Climate Research Facility