[Gluster-devel] Mirrored GlusterFS -- very poor read performance
Mikhail T.
mi+thun at aldan.algebra.com
Thu Jan 2 23:26:59 UTC 2014
We are building a new web-serving farm here. Believing, like most
people, that the choice of technology does not affect performance in
read-dominated workloads (such as ours), we picked GlusterFS for its
rich feature set.
However, when we got around to some testing, GlusterFS-mounted shares
lost -- by a wide margin -- not only to the SAN-connected RAIDs, but
even to NFS-mounted shares.
Here are the numbers... All of the systems involved are VMware
VMs running RHEL6. Each VM has its own dedicated SAN-connected "disk".
GlusterFS is using a replicated volume with two bricks. Each brick is
on a VM of its own, residing on that VM's SAN-connected "disk".
The web-server is, likewise, a VM. The same set of four test-files was
placed on the web-server's own SAN-connected "disk", on an NFS-mount,
and on a GlusterFS-share. (The NFS service is provided by a NetApp
"appliance".) Here are the corresponding lines from the mount-listing:
* Local (SAN-connected):
/dev/mapper/vg_root-lv_data01 on /data01 type ext4 (rw)
* NFS:
.....nas02:/NFS-DCMS on /data03 type nfs
(rw,nfsvers=3,rsize=32768,wsize=32768,hard,intr,tcp,timeo=600,addr=10.x.x.x)
* GlusterFS:
glusterfs.X:/test-ie on /mnt/glusterfs/test-ie type
fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
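For reference, the replicated volume was set up roughly along these
lines -- the brick hostnames and paths below are placeholders, not our
actual ones, shown only to make the configuration concrete:

    # Two-brick replica-2 volume (hostnames/paths are placeholders):
    gluster volume create test-ie replica 2 transport tcp \
        gluster1:/bricks/test-ie gluster2:/bricks/test-ie
    gluster volume start test-ie

    # FUSE mount on the web-server, matching the mount line above:
    mount -t glusterfs glusterfs.X:/test-ie /mnt/glusterfs/test-ie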
As mentioned above, four test-files were used for the benchmark:
1. Small static file - 429 bytes
2. Larger static file - 93347 bytes
3. Small PHP file (containing a single PHP call -- to the phpinfo()
   function). Although the file is small, its output was over 64 KB.
4. Large PHP file (apc.php). Although the file is larger, its output
   was only about 12 KB.
The tests were run using our homegrown utility, which reports average
latency of each successful request. It was configured to create 17
threads each hitting the file for 11 seconds. The timings (in
milliseconds) are in the table below:
                     Local    NFS      GlusterFS
Small static file     3.643    6.801    22.41
Large static file    15.34    15.97     40.80
Small PHP script     50.58    67.72     77.17
Large PHP script     16.50    17.81    118.4
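For comparison, a roughly equivalent load can be reproduced with a
stock tool such as ApacheBench -- this is not the utility we used, but
it approximates the same pattern of 17 concurrent clients hitting one
URL for about 11 seconds (the URL is, of course, a placeholder):

    # 17 concurrent requests for 11 seconds against one test-file:
    ab -c 17 -t 11 http://webserver.example/small-static.html

The "Time per request" figure in its output corresponds to the average
latency reported above.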
Discouragingly, not only is GlusterFS' performance pretty bad, but the
glusterfs process running on the web-server could be seen hogging an
entire CPU during the tests... This suggests the bottleneck is not in
the underlying storage or network, but in the CPU -- which would be
quite unusual for an I/O-intensive workload. (The glusterfsd processes
hosting each brick were using about 9% of one CPU each.)
We used the "officially" provided 3.4.1 RPMs for RHEL.
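In case it matters, below are examples of the per-volume options we
understand (from the 3.4 admin guide) to affect the read path. The
values are purely illustrative guesses, not settings we are
recommending or have verified; pointers on which of them (if any) are
worth tuning for this workload would be welcome:

    # Illustrative only -- option names per the 3.4 admin guide,
    # values are guesses, not recommendations:
    gluster volume set test-ie performance.cache-size 256MB
    gluster volume set test-ie performance.io-thread-count 16
    gluster volume set test-ie performance.quick-read on
    gluster volume set test-ie performance.read-ahead on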
Could it be that the GlusterFS developers have stopped caring about
read performance -- and stopped routinely testing it? The wording of
the Performance page at Gluster.org
<http://www.gluster.org/category/performance/> has a hint of such
"arrogance":
    Let's start with read-dominated workloads. It's well known that OS
    (and app) caches can absorb most of the reads in a system. This was
    the fundamental observation behind Seltzer et al's work on
    log-structured filesystems all those years ago. Reads often take
    care of themselves, so *at the filesystem level* focus on writes.
Or did we do such a poor job configuring Gluster here that our setup
could be made 2-3 times faster simply by correcting our mistakes? Any
comments? Thank you!
-mi