[Gluster-users] very low file creation rate with glusterfs -- result updates

Mon Sep 14 06:40:02 UTC 2009

Wei Dong wrote:
> By using booster, I actually avoid being root on the client side. 
> It would be perfect if the servers can also be run by regular 
> users, even if that means that some features have to be deleted. 
> Can someone explain a little bit why the server side must be run by
>  root?

There are plenty of reasons why the FUSE approach needs to be run
as root but I am sure others more familiar with FUSE can do a far
better job of explaining exactly why.

With regard to booster, root is not needed because all
file system operations are translated by libglusterfsclient
into network I/O operations, so the kernel's file system API
is almost completely bypassed.

Note that libglusterfsclient is the library used internally
by booster.

> 
> I know that I should not ask for too much when the robustness of 
> the current codebase is the most important issue at the time.  I just 
> want to hear a story about that and maybe hack the code myself.
> 
Please don't hesitate to ask any questions about Gluster. We'll try to
answer as best we can, given time and other constraints.

Thanks
-Shehjar

> - Wei
> 
> Wei Dong wrote:
>> I think it is fuse that causes the slowness.  I ran all 
>> experiments with booster enabled and here's the new figure: 
>> http://www.cs.princeton.edu/~wdong/gluster/summary-booster.gif . 
>> The numbers are MUCH better than NFS in most cases except for the
>> local setting, which is not practically interesting.  The
>> interesting thing is that, all of a sudden, the deletion rate drops
>> by a factor of 4-10 -- though I don't really care about file deletion.
>> 
>> 
>> I must say that I'm totally satisfied by the results.
>> 
>> - Wei
>> 
>> 
>> Wei Dong wrote:
>>> Hi All,
>>> 
>>> I complained about the low file creation rate with
>>> glusterfs on my cluster weeks ago and Avati suggested I start
>>> with a small number of nodes.  I finally got some time to
>>> seriously benchmark glusterfs with Bonnie++ today and the
>>> results confirm that glusterfs is indeed slow in terms of file
>>> creation.  My application is to store a large number of ~200KB
>>> image files.  I use the following bonnie++ command for
>>> evaluation (create 10K files of 200KB each scattered under 100
>>> directories):
>>> 
>>> bonnie++ -d . -s 0 -n 10:200000:200000:100
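For anyone unfamiliar with the -n argument: according to the bonnie++
man page it has the form num:max:min:num-directories, where num is
counted in multiples of 1024 files. Below is a small Python sketch that
decodes the workload in the command above. The helper name
parse_bonnie_n is made up for illustration and is not part of bonnie++.

```python
def parse_bonnie_n(spec):
    """Decode a bonnie++ -n argument of the form num:max:min:num-directories."""
    num, max_size, min_size, num_dirs = (int(x) for x in spec.split(":"))
    return {
        "files": num * 1024,      # -n counts files in multiples of 1024
        "max_bytes": max_size,    # largest file size, in bytes
        "min_bytes": min_size,    # smallest file size, in bytes
        "directories": num_dirs,  # files are spread over this many directories
    }


w = parse_bonnie_n("10:200000:200000:100")
# min == max in this command, so every file has the same size.
total_bytes = w["files"] * w["max_bytes"]
print(w["files"], "files in", w["directories"], "directories,",
      total_bytes, "bytes in total")
```

So the run creates 10 * 1024 files of 200,000 bytes each, roughly 2 GB
of data in total.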
>>> 
>>> Since sequential I/O is not that interesting to me, I only keep
>>>  the random I/O results.
>>> 
>>> My hardware configuration is 2xquadcore Xeon E5430 2.66GHz, 
>>> 16GB memory, 4 x Seagate 1500GiB 7200RPM hard drive.  The 
>>> machines are connected with gigabit ethernet.
>>> 
>>> I ran several GlusterFS configurations, each named as N-R-T,
>>> where N is the number of replicated volumes aggregated, R is
>>> the number of replicas and T is the number of server-side I/O
>>> threads.  I use one machine to serve one volume so there are NxR
>>> servers and one separate client running for each experiment.
>>> On the client side, the server volumes are first replicated and
>>>  then aggregated -- even with 1-1-2 configuration, the single 
>>> volume is wrapped by a replicate and a distribute translator. 
>>> To show the overhead of those translators, I also run a 
>>> "simple" configuration which is 1-1-2 without the extra 
>>> replicate & distribute translators, and a "local" configuration
>>>  which is "simple" with client & server running on the same 
>>> machine.  These configurations are compared to "nfs" and 
>>> "nfs-local", which is NFS with server and client on the same 
>>> machine.  The GlusterFS volume file templates are attached to 
>>> the email.
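
To make the N-R-T naming concrete, here is a minimal Python sketch that
splits a configuration name into its three fields and computes the NxR
server count described above. The helper names are hypothetical, and
"4-2-2" is only an example name, not necessarily one of the
configurations that were actually run.

```python
from collections import namedtuple

# N replicated volumes, R replicas per volume, T server-side I/O threads.
Config = namedtuple("Config", "volumes replicas io_threads")


def parse_config(name):
    """Split an N-R-T configuration name into its three integer fields."""
    n, r, t = (int(x) for x in name.split("-"))
    return Config(n, r, t)


def server_count(cfg):
    """One machine serves one volume, so there are N x R servers."""
    return cfg.volumes * cfg.replicas


for name in ("1-1-2", "4-2-2"):
    cfg = parse_config(name)
    print(name, "->", server_count(cfg), "server(s) plus one client,",
          cfg.io_threads, "I/O threads per server")
```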
>>> 
>>> The result is at 
>>> http://www.cs.princeton.edu/~wdong/gluster/summary.gif .  The 
>>> bars/numbers shown are operations/second, so the larger the 
>>> better.
>>> 
>>> Following are the messages shown by the figure:
>>>
>>> 1.  GlusterFS does an exceptionally good job of deleting files, but
>>>     creates and reads files much more slowly than both NFS
>>>     configurations.
>>> 2.  At least for the one-node server configuration, the network
>>>     doesn't affect the file creation rate but does affect the file
>>>     read rate.
>>> 3.  The extra dummy replicate & distribute translators lower the
>>>     file creation rate by almost half.
>>> 4.  Replication doesn't hurt performance a lot.
>>> 5.  I'm running only a single-threaded benchmark, so it's hard to
>>>     say anything about scalability, but adding more servers does
>>>     help a little bit even in the single-threaded setting.
>>> 
>>> Note that my results are not really that different from 
>>> http://gluster.com/community/documentation/index.php/GlusterFS_2.0_I/O_Benchmark_Results,
>>>  where the single node configuration file create rate is about 
>>> 30/second.
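
As a rough back-of-the-envelope check, the Python sketch below works out
what a create rate of about 30/second would mean for the bonnie++
workload earlier in this message. It assumes that the 30/second figure
carries over to 200,000-byte files, which the benchmark page may not
have used, and it takes gigabit ethernet at its raw line rate with no
protocol overhead.

```python
FILES = 10 * 1024               # bonnie++ -n 10 means 10 * 1024 files
FILE_BYTES = 200000             # size of each file in the bonnie++ command
CREATE_RATE = 30                # creates/second, the figure quoted above
GIGABIT_BYTES_PER_S = 1e9 / 8   # raw line rate, ignoring protocol overhead

create_seconds = FILES / CREATE_RATE         # time to create every file
payload_rate = CREATE_RATE * FILE_BYTES      # bytes/second written at that rate
link_fraction = payload_rate / GIGABIT_BYTES_PER_S

print("time to create all files: %.0f s" % create_seconds)
print("payload rate: %.1f MB/s" % (payload_rate / 1e6))
print("share of gigabit line rate: %.1f%%" % (link_fraction * 100))
```

Under those assumptions the creation phase takes several minutes while
using only a few percent of the link, which fits the observation above
that the network does not affect the file creation rate.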
>>> 
>>> I see no reason why GlusterFS has to be that much slower than NFS in
>>> file creation in the single-node configuration.  I'm wondering if 
>>> someone here can help me figure out what's wrong in my 
>>> configuration or what's wrong in the GlusterFS implementation.
>>> 
>>> - Wei
>>> 
>>> Server volume:
>>> 
>>> volume posix
>>>   type storage/posix
>>>   option directory /state/partition1/wdong/gluster
>>> end-volume
>>> 
>>> volume lock
>>>   type features/locks
>>>   subvolumes posix
>>> end-volume
>>> 
>>> volume brick
>>>   type performance/io-threads
>>>   option thread-count 2
>>>   subvolumes lock
>>> end-volume
>>> 
>>> volume server
>>>   type protocol/server
>>>   option transport-type tcp
>>>   option auth.addr.brick.allow 192.168.99.*
>>>   option transport.socket.listen-port 6999
>>>   subvolumes brick
>>> end-volume
>>> 
>>> 
>>> Client volume:
>>> 
>>> volume brick-0-0
>>>   type protocol/client
>>>   option transport-type tcp
>>>   option remote-host c8-0-0
>>>   option remote-port 6999
>>>   option remote-subvolume brick
>>> end-volume
>>> 
>>> volume brick-0-1 ...
>>> 
>>> volume rep-0
>>>   type cluster/replicate
>>>   subvolumes brick-0-0 brick-0-1 ...
>>> 
>>> ...
>>> 
>>> volume union
>>>   type cluster/distribute
>>>   subvolumes rep-0 rep-1 rep-2 rep-3 rep-4 rep-5 rep-6 rep-7
>>> end-volume
>>> 
>>> volume client
>>>   type performance/write-behind
>>>   option cache-size 32MB
>>>   option flush-behind on
>>>   subvolumes union
>>> end-volume
>>> 
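
Since the client volume file follows a regular pattern (one
protocol/client volume per server, one cluster/replicate volume per
replica group, then cluster/distribute and write-behind on top), it can
be generated rather than written by hand. The Python sketch below is
one possible way to do that and is not the template that was actually
used. The function name client_volfile is hypothetical, and every host
name other than c8-0-0 is a placeholder, because the real host list is
elided above.

```python
def client_volfile(hosts, port=6999, remote_subvolume="brick"):
    """Build a client volfile; hosts[i][j] serves replica j of volume i."""
    out = []
    rep_names = []
    for i, replicas in enumerate(hosts):
        brick_names = []
        for j, host in enumerate(replicas):
            name = "brick-%d-%d" % (i, j)
            brick_names.append(name)
            out += [
                "volume " + name,
                "  type protocol/client",
                "  option transport-type tcp",
                "  option remote-host " + host,
                "  option remote-port %d" % port,
                "  option remote-subvolume " + remote_subvolume,
                "end-volume",
                "",
            ]
        # Replicate across all bricks of this group.
        rep = "rep-%d" % i
        rep_names.append(rep)
        out += ["volume " + rep,
                "  type cluster/replicate",
                "  subvolumes " + " ".join(brick_names),
                "end-volume",
                ""]
    # Aggregate the replicated volumes, then add write-behind on top.
    out += ["volume union",
            "  type cluster/distribute",
            "  subvolumes " + " ".join(rep_names),
            "end-volume",
            ""]
    out += ["volume client",
            "  type performance/write-behind",
            "  option cache-size 32MB",
            "  option flush-behind on",
            "  subvolumes union",
            "end-volume"]
    return "\n".join(out)


# Placeholder hosts: only c8-0-0 appears in the original volume file.
text = client_volfile([["c8-0-0", "host-b"], ["host-c", "host-d"]])
print(text)
```

Passing a single group with a single host would give the 1-1-T layout,
where the lone volume is still wrapped by replicate and distribute.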
>>> 
>>> For those who are interested enough to see the real 
>>> configuration files, I have all the configuration files and 
>>> server/client logs uploaded to 
>>> http://www.cs.princeton.edu/~wdong/gluster/run.tar.gz .
>>> 
>> 
>> 
> 
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users