[Gluster-users] very low file creation rate with glusterfs
Wei Dong
wdong.pku at gmail.com
Thu Sep 10 18:08:26 UTC 2009
The glusterfs version I'm using is 2.0.6.
- Wei
On Thu, Sep 10, 2009 at 2:05 PM, Wei Dong <wdong.pku at gmail.com> wrote:
> Hi All,
>
> I complained about the low file creation rate of glusterfs on my
> cluster a few weeks ago, and Avati suggested I start with a small number of
> nodes. I finally got some time to seriously benchmark glusterfs with
> Bonnie++ today, and the results confirm that glusterfs is indeed slow in
> terms of file creation. My application stores a large number of ~200KB
> image files. I use the following bonnie++ command for evaluation (create
> 10K files of 200KB each, scattered under 100 directories):
>
> bonnie++ -d . -s 0 -n 10:200000:200000:100
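>
> (Spelled out, since the flag syntax is a bit cryptic; this is the same
> invocation, run from the glusterfs mount point:)
>
> # -s 0 skips the sequential read/write throughput tests
> # -n 10:200000:200000:100 = 10*1024 files, with max size 200000 bytes and
> # min size 200000 bytes (i.e. every file is exactly 200000 bytes),
> # spread across 100 directories
> bonnie++ -d . -s 0 -n 10:200000:200000:100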
>
> Since sequential I/O is not that interesting to me, I only keep the random
> I/O results.
>
> My hardware configuration is 2 x quad-core Xeon E5430 2.66GHz, 16GB memory, and 4
> x Seagate 1.5TB 7200RPM hard drives. The machines are connected with
> gigabit ethernet.
>
> I ran several GlusterFS configurations, each named N-R-T, where N is the
> number of replicated volumes aggregated, R is the number of replicas, and
> T is the number of server-side I/O threads. I use one machine to serve one
> volume, so there are NxR servers and one separate client running for each
> experiment. On the client side, the server volumes are first replicated and
> then aggregated -- even in the 1-1-2 configuration, the single volume is
> wrapped by a replicate and a distribute translator. To show the overhead of
> those translators, I also ran a "simple" configuration, which is 1-1-2
> without the extra replicate & distribute translators (see the sketch after
> this paragraph), and a "local" configuration, which is "simple" with client
> & server running on the same machine. These are compared against "nfs" and
> "nfs-local", the latter being NFS with server and client on the same
> machine. The GlusterFS volume file templates are attached to the email.
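>
> (For reference, the "simple" client volume looks roughly like this -- it is
> essentially the attached client template with the replicate & distribute
> translators removed, so the single remote brick is wrapped directly in
> write-behind:)
>
> volume brick-0-0
> type protocol/client
> option transport-type tcp
> option remote-host c8-0-0
> option remote-port 6999
> option remote-subvolume brick
> end-volume
>
> volume client
> type performance/write-behind
> option cache-size 32MB
> option flush-behind on
> subvolumes brick-0-0
> end-volume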
>
> The result is at http://www.cs.princeton.edu/~wdong/gluster/summary.gif. The bars/numbers shown are operations per second, so larger is better.
>
> Here are the main points shown by the figure:
> 1. GlusterFS does an exceptionally good job of deleting files, but
> creates and reads files much more slowly than both NFS configurations.
> 2. At least for the one-node server configuration, the network doesn't affect
> the file creation rate, but it does affect the file read rate.
> 3. The extra dummy replicate & distribute translators lower the file creation
> rate by almost half.
> 4. Replication doesn't hurt performance much.
> 5. I'm running only a single-threaded benchmark, so it's hard to say much about
> scalability, but adding more servers does help a little even in the
> single-threaded setting.
>
> Note that my results are not really that different from
> http://gluster.com/community/documentation/index.php/GlusterFS_2.0_I/O_Benchmark_Results,
> where the single-node configuration's file creation rate is about 30/second.
>
>
> I see no reason why GlusterFS has to be that much slower than NFS at file
> creation in a single-node configuration. I'm wondering if someone here can
> help me figure out what's wrong with my configuration or what's wrong in the
> GlusterFS implementation.
>
> - Wei
>
> Server volume:
>
> volume posix
> type storage/posix
> option directory /state/partition1/wdong/gluster
> end-volume
>
> volume lock
> type features/locks
> subvolumes posix
> end-volume
>
> volume brick
> type performance/io-threads
> option thread-count 2
> subvolumes lock
> end-volume
>
> volume server
> type protocol/server
> option transport-type tcp
> option auth.addr.brick.allow 192.168.99.*
> option transport.socket.listen-port 6999
> subvolumes brick
> end-volume
>
>
> Client volume
>
> volume brick-0-0
> type protocol/client
> option transport-type tcp
> option remote-host c8-0-0
> option remote-port 6999
> option remote-subvolume brick
> end-volume
>
> volume brick-0-1 ...
>
> volume rep-0
> type cluster/replicate
> subvolumes brick-0-0 brick-0-1 ...
> end-volume
>
> ...
> volume union
> type cluster/distribute
> subvolumes rep-0 rep-1 rep-2 rep-3 rep-4 rep-5 rep-6 rep-7
> end-volume
>
> volume client
> type performance/write-behind
> option cache-size 32MB
> option flush-behind on
> subvolumes union
> end-volume
>
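> (For completeness, this is roughly how the volumes are brought up; the
> volfile paths and the mount point below are just placeholders:)
>
> # on each server node
> glusterfsd -f /etc/glusterfs/server.vol
>
> # on the client node
> glusterfs -f /etc/glusterfs/client.vol /mnt/glusterfs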
>
> For those who are interested enough to see the real configuration files, I
> have all the configuration files and server/client logs uploaded to
> http://www.cs.princeton.edu/~wdong/gluster/run.tar.gz.
>
>