[Gluster-devel] Speeding up file creation

Krishna Srinivas krishna at zresearch.com
Mon Mar 12 18:41:58 UTC 2007


Erik,
Check out the write-behind translator:
http://www.gluster.org/docs/index.php/GlusterFS_User_Guide#Write_Behind_Translator
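
For reference, the stanza looks like this on the client side (it has the
same shape as the writebehind volume already in Erik's spec below; the
aggregate-size value is only illustrative):

volume writebehind
  type performance/write-behind
  option aggregate-size 131072   # batch writes up to 128 KB before sending them on
  subvolumes bricks              # load write-behind on top of the cluster volume
end-volume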
Krishna

On 3/13/07, Erik Osterman <e at osterman.com> wrote:
> Thank you Julien, that's very useful for comparison. I should have tried
> that myself first. So it would appear that with AFR enabled, the
> transaction is not complete until the entire request (replicating the
> file to the mirror site) has completed. This makes sense from a
> consistency perspective; however, has there been any talk of adding a
> "lazy" AFR option, where writes to the mirror site would not block
> the transaction? We are not overly concerned with this level of
> consistency and could periodically rsync the data or, when the f*
> utilities come out, run some sort of repair.
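>
> As a sketch of the periodic-rsync approach (the export directory and the
> mirror's address are taken from the configs quoted below; SSH access
> between the two servers is my assumption):
>
> # illustrative one-way sync of the master's export to the mirror
> rsync -a --delete /home/glusterfs/ 216.55.170.26:/home/glusterfs/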
>
> Best,
>
> Erik Osterman
>
> Julien Perez wrote:
> > Hello,
> >
> > By looking at your configuration, my first guess is that you have
> > latency issues in your network, which would definitely explain such
> > awful performance. So, using your configuration files, I've set up the
> > same architecture, but locally (one computer running a Sempron 2600+
> > with 2 Hitachi SATA1 disks in software RAID1), and here are the results
> > I got:
> >
> > toad@web1:~/afr$ time for i in {1..100}; do touch ./mnt/test.$i; done
> >
> > real    0m0.560s
> > user    0m0.160s
> > sys     0m0.190s
> > toad@web1:~/afr$ rm -f ./mnt/*
> > toad@web1:~/afr$ time for i in {1..10000}; do touch ./mnt/test.$i; done
> >
> > real    1m8.060s
> > user    0m16.680s
> > sys     0m18.180s
> > toad@web1:~/afr$ find ./mnt/ -type f | xargs -n100 rm -f
> > toad@web1:~/afr$ time for i in {1..1000}; do touch ./mnt/test.$i; done
> >
> > real    0m5.829s
> > user    0m1.670s
> > sys     0m1.910s
> >
> >
> > So my advice would be: check your network :)
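> >
> > (A quick sanity check, assuming both servers answer ICMP; the addresses
> > are the ones in the client spec below:)
> >
> > ping -c 10 216.182.237.155
> > ping -c 10 216.55.170.26
> >
> > At roughly 0.47 s per create, even a modest WAN round-trip time
> > multiplied over a few round trips per create would account for the
> > numbers you're seeing.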
> >
> > Hope it helped,
> >
> > Have a nice day everyone
> >
> > Julien Perez
> >
> >
> > On 3/12/07, Erik Osterman <e at osterman.com> wrote:
> >
> >     I've configured a cluster with replication that uses most of the
> >     advanced features you've implemented, including io-threads, AFR,
> >     read-ahead, and write-behind. I am very satisfied with the write
> >     performance, but the file creation performance leaves much to be
> >     desired. What can we do to speed this up?
> >
> >     Creating 100 empty files
> >
> >     # time for i in {1..100}; do touch test.$i; done
> >     real    0m46.913s
> >     user    0m0.023s
> >     sys     0m0.067s
> >
> >     That's nearly half a second (46.9 s / 100 ≈ 0.47 s) just to create an empty file.
> >
> >
> >     In general, what do you advise for tuning the performance of
> >     reading and writing tons of tiny files? Can the client use io-threads to
> >     improve performance? Right now, our application stuffs all the tiny
> >     files into a single directory. Eventually, we were planning on hashing
> >     them out into multiple directories. Would doing so significantly
> >     improve the performance of GlusterFS?
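> >
> >     (For illustration, one common scheme spreads files across 256
> >     subdirectories keyed on the first two hex digits of an md5 of the
> >     file name; a sketch, with a hypothetical file name:)
> >
> >     name="test.42"                                      # hypothetical example file
> >     bucket=$(printf '%s' "$name" | md5sum | cut -c1-2)  # first two hex digits of the hash
> >     mkdir -p "./mnt/$bucket"                            # 256 possible buckets, 00..ff
> >     touch "./mnt/$bucket/$name"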
> >
> >
> >
> >     Best,
> >
> >     Erik Osterman
> >
> >
> >
> >
> >
> >     For what it's worth, here are my configurations:
> >
> >     #
> >     # Master
> >     #
> >     volume posix0
> >       type storage/posix                   # POSIX FS translator
> >       option directory /home/glusterfs        # Export this directory
> >     end-volume
> >
> >     volume brick0
> >       type performance/io-threads
> >       option thread-count 8
> >       option queue-limit 1024
> >       subvolumes posix0
> >     end-volume
> >
> >     ### Add network serving capability to above brick.
> >     volume server
> >       type protocol/server
> >       option transport-type tcp/server     # For TCP/IP transport
> >     # option bind-address 192.168.1.10   # Default is to listen on all interfaces
> >       option listen-port 6996               # Default is 6996
> >       option client-volume-filename /etc/glusterfs/client.vol
> >       subvolumes brick0
> >       option auth.ip.brick0.allow *         # access to "brick" volume
> >     end-volume
> >
> >
> >
> >     #
> >     # Mirror
> >     #
> >     volume posix0
> >       type storage/posix                   # POSIX FS translator
> >       option directory /home/glusterfs     # Export this directory
> >     end-volume
> >
> >     volume mirror0
> >       type performance/io-threads
> >       option thread-count 8
> >       option queue-limit 1024
> >       subvolumes posix0
> >     end-volume
> >
> >     ### Add network serving capability to above brick.
> >     volume server
> >       type protocol/server
> >       option transport-type tcp/server     # For TCP/IP transport
> >     # option bind-address 192.168.1.11   # Default is to listen on all interfaces
> >       option listen-port 6996               # Default is 6996
> >       option client-volume-filename /etc/glusterfs/client.vol
> >       subvolumes mirror0
> >       option auth.ip.mirror0.allow *         # access to "mirror" volume
> >     end-volume
> >
> >
> >     #
> >     # Client
> >     #
> >
> >     ### Add client feature and attach to remote subvolume of server
> >     volume brick0
> >       type protocol/client
> >       option transport-type tcp/client     # for TCP/IP transport
> >       option remote-host 216.182.237.155   # IP address of the remote brick server
> >       option remote-port 6996              # default server port is 6996
> >       option remote-subvolume brick0        # name of the remote volume
> >     end-volume
> >
> >     ### Add client feature and attach to remote mirror of brick0
> >     volume mirror0
> >       type protocol/client
> >       option transport-type tcp/client     # for TCP/IP transport
> >       option remote-host 216.55.170.26     # IP address of the remote mirror server
> >       option remote-port 6996              # default server port is 6996
> >       option remote-subvolume mirror0        # name of the remote volume
> >     end-volume
> >
> >     ### Add AFR feature to brick
> >     volume afr0
> >       type cluster/afr
> >       subvolumes brick0 mirror0
> >       option replicate *:2                 # All files 2 copies (RAID-1)
> >     end-volume
> >
> >     ### Add unify feature to cluster the servers. Associate an
> >     ### appropriate scheduler that matches your I/O demand.
> >     volume bricks
> >       type cluster/unify
> >       subvolumes afr0
> >     ### ** Round Robin (RR) Scheduler **
> >       option scheduler rr
> >       option rr.limits.min-free-disk 2GB
> >     end-volume
> >
> >     ### Add performance feature
> >     volume writebehind
> >       type performance/write-behind
> >       option aggregate-size 131072 # aggregate block size in bytes
> >       subvolumes bricks
> >     end-volume
> >
> >     ### Add performance feature
> >     volume readahead
> >       type performance/read-ahead
> >       option page-size 131072
> >       option page-count 16
> >       subvolumes writebehind
> >     end-volume
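> >
> >     (For completeness, a spec like this is mounted with the FUSE
> >     client; the invocation below follows the user guide, and the mount
> >     point is only an example:)
> >
> >     glusterfs -f /etc/glusterfs/client.vol /mnt/glusterfs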
> >
> >
> >
> >
> >
> >
>
>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at nongnu.org
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>




