[Gluster-users] Gluster 1.3.10 Performance Issues

Chris Davies isp at daviesinc.com
Thu Aug 7 05:05:40 UTC 2008


A continuation:

I used XFS on MD RAID 1 partitions for the initial tests.
I also tested reiser3 and reiser4, with no significant difference.
I then rebuilt the array as MD RAID 0 with XFS and saw some improvement.
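
For reference, the re-raid step was roughly the following (device names and
mount point here are assumptions, not my exact commands):

  mdadm --stop /dev/md0
  mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sda3 /dev/sdb3
  mkfs.xfs -f /dev/md0
  mount /dev/md0 /gfsvol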

I NFS-mounted the partition and got bonnie++ numbers similar to the best
clientside AFR numbers I have been able to get, but unpacking the kernel
over NFSv4/UDP took 1 minute 47 seconds, compared with 12 seconds for the
bare drive, 41 seconds for serverside AFR, and an average of 17 minutes
for clientside AFR.
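
The NFS comparison was along these lines (server address, export path, and
mount point are assumptions):

  mount -t nfs -o vers=4,proto=udp 10.8.1.9:/gfsvol /mnt/nfs
  cd /mnt/nfs && time tar xjf ~/linux-2.6.26.1.tar.bz2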

If I turn off AFR, whether I mount the remote machine over the net or  
use the local server's brick, tar xjf of a kernel takes roughly 29  
seconds.

Large files replicate almost at wire speed.  rsync/cp -Rp of a large  
directory takes considerable time.

Both 1.4.0 QA releases I've tried, 1.4.0qa32 and 1.4.0qa33, have broken
within minutes under my configurations.  I'll turn debug logging on and
post summaries of those.

On Aug 6, 2008, at 2:48 PM, Chris Davies wrote:

> OS: Debian Linux/4.1, 64bit build
> Hardware: quad core xeon x3220, 8gb RAM, dual 7200RPM 1000gb WD Hard
> Drives, 750gb raid 1 partition set as /gfsvol to be exported, dual
> gigE, juniper ex3200 switch
>
> Fuse libraries: fuse-2.7.3glfs10
> Gluster: glusterfs-1.3.10
>
> Running bonnie++ on both machines results in almost identical numbers,
> eth1 is reserved wholly for server to server communications.  Right
> now, the only load on these machines comes from my testbed.  There are
> four tests that give a reasonable indicator of performance.
>
> * loading a wordpress blog and looking at the line (see the curl sketch
> after this list):
> <!-- 24 queries. 0.634 seconds. -->
> * dd if=/dev/zero of=/gfs/test/out bs=1M count=512
> * time tar xjf /gfs/test/linux-2.6.26.1.tar.bz2
> * /usr/sbin/bonnie++ /gfs/test/
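>
> For the first test, the query-time line can be pulled with something like
> the following (blog URL is an assumption):
>
> curl -s http://c1ws1/blog/ | grep -o '<!-- [0-9]* queries. [0-9.]* seconds. -->'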
>
> On the wordpress test, .3 seconds is typical.  Across various gluster
> configurations I've seen anywhere from .411 seconds (serverside AFR
> config below) to 1.2 seconds with some of the example configurations.
> Currently, my clientside AFR config comes in at .5xx seconds rather
> consistently.
>
> The second test on the clientside AFR results in 536870912 bytes (537
> MB) copied, 4.65395 s, 115 MB/s
>
> The third test, unpacking a kernel, has ranged from 28 seconds using
> serverside AFR to 6+ minutes on some configurations.  Currently the
> clientside AFR config comes in at about 17 minutes.
>
> The fourth test is a run of bonnie++, which has varied from 36 minutes on
> the serverside AFR config to 80 minutes on the clientside AFR config.
>
> The current test environment uses both servers as clients and servers.
> If I can get reasonable performance, the existing machines will become
> clients and the servers will be split onto their own platform, so I want
> to make sure I am using tcp for connections to stay as close to a real
> world deployment as possible.  This means I cannot run a client-only
> config.
>
> Baseline Wordpress: .311-.399 seconds
> Baseline dd: 536870912 bytes (537 MB) copied, 0.489522 s, 1.1 GB/s
> Baseline tar xjf of the kernel: real 0m12.164s
> Baseline bonnie++ run on the raid 1 partition (pipe the CSV line through
> bon_csv2txt for a text report):
>
> c1ws1,16G,66470,97,93198,16,42430,6,60253,86,97153,7,381.3,0,16,7534,37,+++++,+++,5957,23,7320,34,+++++,+++,4667,21
>
> So far, the best performance I could manage was Server Side AFR with
> writebehind/readahead on the server, with aggregate-size set to 0mb,
> and the client side running writebehind/readahead.  That resulted in:
>
> c1ws2,16G,37636,50,76855,3,17429,2,60376,76,87653,3,158.6,0,16,1741,3,9683,6,2591,3,2030,3,9790,5,2369,3
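>
> For reference, the serverside AFR variant simply moved the afr translator
> into the server volfile; roughly (volume names here are assumptions, with a
> protocol/client volume pointing at the other server's brick):
>
> volume remote
>   type protocol/client
>   option transport-type tcp/client
>   option remote-host 10.8.1.10
>   option remote-subvolume plocks
> end-volume
>
> volume afr
>   type cluster/afr
>   subvolumes plocks remote
> end-volume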
>
> It was suggested on IRC that clientside AFR would be faster and more
> reliable; however, the following is the best result I've gotten from
> multiple attempts:
>
> c1ws1,16G,46041,58,76811,2,4603,0,59140,76,86103,3,132.4,0,16,1069,2,4795,2,1308,2,1045,2,5209,2,1246,2
>
> The serverside AFR bonnie++ run that produced the best results I've
> received to date took 34 minutes.  The latest clientside AFR bonnie++ run
> took 80 minutes.  Based on the website, I would expect to see better
> performance than drbd/GFS, but so far that hasn't been the case.
>
> It's been suggested that I use unify-rr-afr.  With my current setup, it
> seems that doing so would require breaking my raid set, which is my next
> step in debugging this.  Rather than use raid 1 on each server, I would
> run two bricks per server, which would allow the use of unify with the
> rr scheduler.
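>
> A rough sketch of what that unify-rr-afr layout might look like on the
> client side (brick and namespace names are assumptions; unify also needs a
> dedicated namespace volume):
>
> volume afr1
>   type cluster/afr
>   subvolumes brick1a brick2a
> end-volume
>
> volume afr2
>   type cluster/afr
>   subvolumes brick1b brick2b
> end-volume
>
> volume unify
>   type cluster/unify
>   option scheduler rr
>   option namespace afr-ns
>   subvolumes afr1 afr2
> end-volume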
>
> glusterfs-1.4.0qa32 results in
> [Wed Aug 06 02:01:44 2008] [notice] child pid 14025 exit signal Bus
> error (7)
> [Wed Aug 06 02:01:44 2008] [notice] child pid 14037 exit signal Bus
> error (7)
>
> when apache (not mod_gluster) tries to serve files off the glusterfs
> partition.
>
> The main issue I'm having right now is file creation speed.  I realize
> that creating a file takes two network operations per file, but judging
> by the kernel-untar results, something seems horribly wrong in my
> configuration.
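>
> To isolate create latency from raw write throughput, a quick microbenchmark
> along these lines (mount point is an assumption) shows the per-file cost:
>
> time sh -c 'for i in $(seq 1 1000); do touch /gfs/test/f$i; done'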
>
> I've tried moving the performance translators around, but some don't
> seem to make much difference on the server side, and the ones that do
> make a difference on the client side don't help the file creation issue.
>
> On a side note, zresearch.com, I emailed through your contact form and
> haven't heard back -- please provide a quote for generating the
> configuration and contact me offlist.
>
> ===/etc/gluster/gluster-server.vol
> volume posix
>     type storage/posix
>     option directory /gfsvol/data
> end-volume
>
> volume plocks
>   type features/posix-locks
>   subvolumes posix
> end-volume
>
> volume writebehind
>   type performance/write-behind
>   option flush-behind off    # default is 'off'
>   subvolumes plocks
> end-volume
>
> volume readahead
>   type performance/read-ahead
>   option page-size 128kB        # 256KB is the default option
>   option page-count 4           # 2 is default option
>   option force-atime-update off # default is off
>   subvolumes writebehind
> end-volume
>
> volume brick
>   type performance/io-threads
>   option thread-count 4  # default is 1
>   option cache-size 64MB
>   subvolumes readahead
> end-volume
>
> volume server
>     type protocol/server
>     option transport-type tcp/server
>     subvolumes brick
>     option auth.ip.brick.allow 10.8.1.*,127.0.0.1
> end-volume
>
>
> ===/etc/glusterfs/gluster-client.vol
>
> volume brick1
>     type protocol/client
>     option transport-type tcp/client # for TCP/IP transport
>     option remote-host 10.8.1.9   # IP address of server1
>     option remote-subvolume brick    # name of the remote volume on server1
> end-volume
>
> volume brick2
>     type protocol/client
>     option transport-type tcp/client # for TCP/IP transport
>     option remote-host 10.8.1.10   # IP address of server2
>     option remote-subvolume brick    # name of the remote volume on server2
> end-volume
>
> volume afr
>    type cluster/afr
>    subvolumes brick1 brick2
> end-volume
>
> volume writebehind
>   type performance/write-behind
>   option aggregate-size 0MB
>   option flush-behind off    # default is 'off'
>   subvolumes afr
> end-volume
>
> volume readahead
>   type performance/read-ahead
>   option page-size 128kB        # 256KB is the default option
>   option page-count 4           # 2 is default option
>   option force-atime-update off # default is off
>   subvolumes writebehind
> end-volume
>




