[Gluster-devel] Speeding up file creation

Erik Osterman e at osterman.com
Mon Mar 12 18:30:11 UTC 2007


Thank you Julien, that's very useful for comparison. I should have tried 
that myself first. So, it would appear that with AFR enabled, the 
transaction does not complete until the entire request (replicating the 
file to the mirror site) has finished. This makes sense from a 
consistency perspective; however, has there been any talk of adding a 
"lazy" AFR option, where writes to the mirror site would not block 
the transaction? We are not overly concerned with this level of 
consistency and could periodically rsync the data or, when the f* 
utilities come out, run some sort of repair.
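
For what it's worth, here is a rough sketch of the kind of periodic rsync
repair I have in mind (purely illustrative -- the schedule and flags are
arbitrary, and deletions/conflicts would need extra handling; the path and
mirror address are the ones from the configs quoted below):

  # system crontab entry on the master brick server: push /home/glusterfs
  # to the mirror every 15 minutes instead of blocking each write on it
  */15 * * * *  root  rsync -a /home/glusterfs/ 216.55.170.26:/home/glusterfs/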

Best,

Erik Osterman

Julien Perez wrote:
> Hello,
>
> By looking at your configuration, my first guess is that you have 
> latency issues in your network, which would definitely explain that 
> awful performance. So using your configuration files, I've set up the 
> same architecture, but locally (one computer with a Sempron 2600+ and 
> two Hitachi SATA I disks in software RAID1), and here are the results 
> I got:
>
> toad at web1:~/afr$ time for i in {1..100}; do touch ./mnt/test.$i; done
>
> real    0m0.560s
> user    0m0.160s
> sys     0m0.190s
> toad at web1:~/afr$ rm -f ./mnt/*
> toad at web1:~/afr$ time for i in {1..10000}; do touch ./mnt/test.$i; done
>
> real    1m8.060s
> user    0m16.680s
> sys     0m18.180s
> toad at web1:~/afr$ find ./mnt/ -type f | xargs -n100 rm -f
> toad at web1:~/afr$ time for i in {1..1000}; do touch ./mnt/test.$i; done
>
> real    0m5.829s
> user    0m1.670s
> sys     0m1.910s
>
>
> So my advice would be: check your network :)
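>
> For example (just a generic sanity check; the address below is the
> mirror server from your configs, substitute your own):
>
>   # round-trip latency to the mirror server; with AFR every create has
>   # to wait on the mirror, so ~0.5 s per touch points at a slow link
>   ping -c 10 216.55.170.26
>
>   # and time a single create through the mount itself
>   time touch ./mnt/latency.probe; rm -f ./mnt/latency.probe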
>
> Hope it helped,
>
> Have a nice day everyone
>
> Julien Perez
>
>
> On 3/12/07, *Erik Osterman* <e at osterman.com> wrote:
>
>     I've configured a cluster with replication that uses most of the
>     advanced features you've implemented, including io-threads, afr,
>     readahead, and writebehind. I am very satisfied with the write
>     performance, but the file creation performance leaves much to be
>     desired. What can we do to speed this up?
>
>     Creating 100 empty files
>
>     # time for i in {1..100}; do touch test.$i;done
>     real    0m46.913s
>     user    0m0.023s
>     sys     0m0.067s
>
>     That's roughly half a second (46.9 s / 100 files) just to create an
>     empty file.
>
>
>     In general, what do you advise for tuning the performance of
>     reading/writing tons of tiny files? Can the client use io-threads to
>     improve performance (a rough sketch of what I mean follows the client
>     configuration below)? Right now, our application puts all the tiny
>     files in a single directory; eventually, we were planning on hashing
>     them out into multiple directories. Would doing so positively and
>     significantly affect the performance of GlusterFS?
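>
>     Something like the following is what I had in mind for the hashing
>     (just an illustration -- the two-character md5 prefix and the layout
>     are arbitrary, not something we run today):
>
>     # bucket each file into a subdirectory named after the first two hex
>     # digits of the md5 of its file name
>     f=test.1
>     d=$(echo -n "$f" | md5sum | cut -c1-2)
>     mkdir -p "./$d" && mv "$f" "./$d/$f"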
>
>
>
>     Best,
>
>     Erik Osterman
>
>
>
>
>
>     For what it's worth, here are my configurations:
>
>     #
>     # Master
>     #
>     volume posix0
>       type storage/posix                   # POSIX FS translator
>       option directory /home/glusterfs        # Export this directory
>     end-volume
>
>     volume brick0
>       type performance/io-threads
>       option thread-count 8
>       option queue-limit 1024
>       subvolumes posix0
>     end-volume
>
>     ### Add network serving capability to above brick.
>     volume server
>       type protocol/server
>       option transport-type tcp/server     # For TCP/IP transport
>     # option bind-address 192.168.1.10    # Default is to listen on all interfaces
>       option listen-port 6996               # Default is 6996
>       option client-volume-filename /etc/glusterfs/client.vol
>       subvolumes brick0
>       option auth.ip.brick0.allow *         # access to "brick" volume
>     end-volume
>
>
>
>     #
>     # Mirror
>     #
>     volume posix0
>       type storage/posix                   # POSIX FS translator
>       option directory /home/glusterfs     # Export this directory
>     end-volume
>
>     volume mirror0
>       type performance/io-threads
>       option thread-count 8
>       option queue-limit 1024
>       subvolumes posix0
>     end-volume
>
>     ### Add network serving capability to above brick.
>     volume server
>       type protocol/server
>       option transport-type tcp/server     # For TCP/IP transport
>     # option bind-address 192.168.1.11    # Default is to listen on all interfaces
>       option listen-port 6996               # Default is 6996
>       option client-volume-filename /etc/glusterfs/client.vol
>       subvolumes mirror0
>       option auth.ip.mirror0.allow *        # access to "mirror0" volume
>     end-volume
>
>
>     #
>     # Client
>     #
>
>     ### Add client feature and attach to remote subvolume of server
>     volume brick0
>       type protocol/client
>       option transport-type tcp/client     # for TCP/IP transport
>       option remote-host 216.182.237.155   # IP address of the remote brick server
>       option remote-port 6996              # default server port is 6996
>       option remote-subvolume brick0        # name of the remote volume
>     end-volume
>
>     ### Add client feature and attach to remote mirror of brick0
>     volume mirror0
>       type protocol/client
>       option transport-type tcp/client     # for TCP/IP transport
>       option remote-host 216.55.170.26     # IP address of the remote mirror server
>       option remote-port 6996              # default server port is 6996
>       option remote-subvolume mirror0        # name of the remote volume
>     end-volume
>
>     ### Add AFR feature to brick
>     volume afr0
>       type cluster/afr
>       subvolumes brick0 mirror0
>       option replicate *:2                 # All files 2 copies (RAID-1)
>     end-volume
>
>     ### Add unify feature to cluster the servers. Associate an
>     ### appropriate scheduler that matches your I/O demand.
>     volume bricks
>       type cluster/unify
>       subvolumes afr0
>     ### ** Round Robin (RR) Scheduler **
>       option scheduler rr
>       option rr.limits.min-free-disk 2GB
>     end-volume
>
>     ### Add performance feature
>     volume writebehind
>       type performance/write-behind
>       option aggregate-size 131072 # aggregate block size in bytes
>       subvolumes bricks
>     end-volume
>
>     ### Add performance feature
>     volume readahead
>       type performance/read-ahead
>       option page-size 131072
>       option page-count 16
>       subvolumes writebehind
>     end-volume
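>
>     (Regarding the client-side io-threads question above: if the
>     io-threads translator can be stacked on the client as well -- I don't
>     know whether it can -- I imagine it would sit on top of the readahead
>     volume, something like the untested sketch below.)
>
>     ### Hypothetical: client-side io-threads layered over readahead
>     volume iothreads0
>       type performance/io-threads
>       option thread-count 4
>       subvolumes readahead
>     end-volume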
>
>
>
>
>
>
>     _______________________________________________
>     Gluster-devel mailing list
>     Gluster-devel at nongnu.org
>     http://lists.nongnu.org/mailman/listinfo/gluster-devel
>
>





