[Gluster-users] Gluster performance advice

Wed Jan 4 02:53:50 UTC 2017

Hi,

We recently decided to try out GlusterFS in our lab, as a lot of our processing is IOPS-bound and our data sets are fairly large (the files we process are broken up into 256 GB chunks). Our traditional storage is a 24-disk RAID-6 Synology NAS with SSD cache. The NAS has a dual 10GbE card connected to 8 computers in our lab, which also have dual 10GbE operating in 802.3ad LACP. The 8 processing nodes each have a 1 TB NVMe SSD, and four of the nodes also have a 2 TB SATA SSD.

For testing, I tried creating both a distributed replicated volume and a plain distributed volume. I also experimented with sharding enabled and tested different shard sizes. For the purposes of testing, I created the bricks on the 8 NVMe SSDs using the root partition, which is formatted as ext4. I know this is considered bad practice, but I could not find documentation on what could go wrong (I will create dedicated XFS partitions if we decide to migrate to GlusterFS). The four 2 TB SATA SSDs are formatted with XFS. We are using Ubuntu 16.04 with GlusterFS 3.8.7.

When transferring a data chunk from the Synology NAS to a single NVMe SSD, we get a sustained sequential transfer rate of around 1.0 GB/sec. When testing with GlusterFS, I have not been able to get write performance greater than 180 MB/sec. The throughput is about the same whether I am using a distributed volume (1 copy) or a distributed replica 2 volume (twice the network bandwidth). I hit the same performance ceiling whether copying from the NAS or from the NVMe SSD to the Gluster volume. I haven’t done much testing of the data once it is on the Gluster volume, as the current upload throughput to GlusterFS would make it a no-go for us.
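
To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes decimal units (1 GB = 1000 MB) and compares against the raw line rate of a single 10GbE link with protocol overhead ignored; the constant and function names are made up for illustration.

```python
# Back-of-the-envelope numbers for one 256 GB chunk.
# Assumptions (not measurements): decimal units, and a single 10GbE link
# at raw line rate with no protocol overhead.

CHUNK_GB = 256              # chunk size from the post
NAS_TO_NVME_MB_S = 1000     # ~1.0 GB/sec observed, NAS -> single NVMe SSD
GLUSTER_MB_S = 180          # best write rate observed on the Gluster volume
LINK_MB_S = 10_000 / 8      # 10 Gbit/s expressed in MB/s (1250)


def transfer_seconds(size_gb, rate_mb_s):
    """Time in seconds to move size_gb gigabytes at rate_mb_s MB/s."""
    return size_gb * 1000 / rate_mb_s


def link_fraction(rate_mb_s, link_mb_s=LINK_MB_S):
    """Fraction of one link's raw line rate used at rate_mb_s."""
    return rate_mb_s / link_mb_s


if __name__ == "__main__":
    for label, rate in (("NAS -> NVMe", NAS_TO_NVME_MB_S),
                        ("Gluster write", GLUSTER_MB_S)):
        minutes = transfer_seconds(CHUNK_GB, rate) / 60
        percent = link_fraction(rate) * 100
        print(f"{label}: {minutes:.1f} min per chunk, "
              f"{percent:.1f}% of one 10GbE link")
```

Under these assumptions, a chunk takes roughly 4.3 minutes at 1.0 GB/sec and roughly 23.7 minutes at 180 MB/sec, and 180 MB/sec is only about 14% of what a single 10GbE link can carry.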

Does anyone have any ideas on what my bottleneck may be? Or tips on identifying and resolving it?
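
For anyone who wants to reproduce the comparison, below is a minimal Python sketch of a sequential write test that can be pointed at either a local directory or a Gluster mount. The tests above were plain file copies, so this is only an illustration and not the exact procedure used. The function name `measure_write_throughput`, the test file name, and the directory passed on the command line are all hypothetical.

```python
import os
import sys
import tempfile
import time


def measure_write_throughput(directory, total_bytes, block_size):
    """Write total_bytes sequentially in block_size pieces and return MB/s.

    The file is opened unbuffered and fsync'd before the timer stops, so
    the figure includes the time needed to flush the data, not just the
    time to hand it to the page cache.
    """
    block = os.urandom(block_size)   # random data, generated once
    path = os.path.join(directory, "throughput_test.bin")
    written = 0
    start = time.perf_counter()
    with open(path, "wb", buffering=0) as f:
        while written < total_bytes:
            n = min(block_size, total_bytes - written)
            # An unbuffered write may be short; count what was written.
            written += f.write(block[:n])
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    os.remove(path)                  # clean up the test file
    return written / elapsed / 1e6


if __name__ == "__main__":
    # Pass the directory to test as the first argument, e.g. a Gluster
    # mount point; without an argument a temporary local directory is used.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.mkdtemp()
    for block_size in (4 * 1024, 128 * 1024, 1024 * 1024):
        rate = measure_write_throughput(target, 256 * 1024 * 1024, block_size)
        print(f"block size {block_size:>8} bytes: {rate:.0f} MB/s")
```

Running it once against a local NVMe directory and once against the Gluster mount, with the same block sizes, would show whether the ceiling depends on the write size or stays near 180 MB/sec regardless.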

Thanks,
Zack