[Gluster-devel] Speeding up file creation
Krishna Srinivas
krishna at zresearch.com
Mon Mar 12 18:41:58 UTC 2007
Erik,
Check out the write-behind translator:
http://www.gluster.org/docs/index.php/GlusterFS_User_Guide#Write_Behind_Translator
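
For reference, a minimal write-behind spec is just this (it is essentially the
same writebehind volume you already have near the top of your client spec,
shown here with only the aggregate-size option):

volume writebehind
  type performance/write-behind
  option aggregate-size 131072   # buffer up to 128 KB of writes before flushing
  subvolumes bricks              # whatever volume should sit beneath it
end-volume

Write-behind lets write() calls return before the data has reached the
servers, which is the "lazy" behaviour you describe for writes; note that it
does not change how the create calls themselves are handled.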
Krishna
On 3/13/07, Erik Osterman <e at osterman.com> wrote:
> Thank you Julien, that's very useful for comparison. I should have tried
> that myself first. So it would appear that with AFR enabled, the
> transaction is not complete until the entire request (replicating the
> file to the mirror site) has completed. This makes sense from a
> consistency perspective; however, has there been any talk of adding an
> option for "lazy" AFR, where writes to the mirror site would not block
> the transaction? We are not overly concerned with this level of
> consistency and can periodically rsync the data, or, when the f*
> utilities come out, run some sort of repair.
>
> Best,
>
> Erik Osterman
>
> Julien Perez wrote:
> > Hello,
> >
> > By looking at your configuration, my first guess is that you have
> > latency issues in your network, which would definitely explain the
> > awful performance. So, using your configuration files, I've set up the
> > same architecture, but locally (one computer with a Sempron 2600+ and
> > two Hitachi SATA-1 disks in software RAID 1), and here are the results
> > I got:
> >
> > toad at web1:~/afr$ time for i in {1..100}; do touch ./mnt/test.$i; done
> >
> > real 0m0.560s
> > user 0m0.160s
> > sys 0m0.190s
> > toad at web1:~/afr$ rm -f ./mnt/*
> > toad at web1:~/afr$ time for i in {1..10000}; do touch ./mnt/test.$i; done
> >
> > real 1m8.060s
> > user 0m16.680s
> > sys 0m18.180s
> > toad at web1:~/afr$ find ./mnt/ -type f | xargs -n100 rm -f
> > toad at web1:~/afr$ time for i in {1..1000}; do touch ./mnt/test.$i; done
> >
> > real 0m5.829s
> > user 0m1.670s
> > sys 0m1.910s
> >
> >
> > So my advice would be: check your network :)
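> >
> > A quick way to sanity-check the latency theory is to time the round trips
> > from the client to each brick, using the addresses from your client spec:
> >
> > # round-trip latency from the client to the two remote bricks
> > ping -c 20 216.182.237.155
> > ping -c 20 216.55.170.26
> >
> > Roughly speaking, each create has to wait for several of those round
> > trips before it can return, so even a few tens of milliseconds per hop
> > adds up quickly over 100 files.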
> >
> > Hope it helped,
> >
> > Have a nice day everyone
> >
> > Julien Perez
> >
> >
> > On 3/12/07, Erik Osterman <e at osterman.com> wrote:
> >
> > I've configured a cluster with replication that uses most of the
> > advanced features you've implemented, including io-threads, AFR,
> > read-ahead, and write-behind. I am very satisfied with the write
> > performance, but the file creation performance leaves much to be
> > desired. What can we do to speed this up?
> >
> > Creating 100 empty files
> >
> > # time for i in {1..100}; do touch test.$i;done
> > real 0m46.913s
> > user 0m0.023s
> > sys 0m0.067s
> >
> > That's nearly half a second (46.9 s / 100 ≈ 0.47 s) just to create an
> > empty file.
> >
> >
> > In general, what do you advise for tuning the performance of
> > reading/writing tons of tiny files? Can the client use io-threads to
> > improve performance? Right now, our application stuffs all the tiny
> > files in a single directory. Eventually, we were planning on hashing
> > them out into multiple directories (a rough sketch of what we have in
> > mind follows). Would that positively and significantly affect the
> > performance of GlusterFS?
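> >
> > For concreteness, the kind of layout we mean (the two-character md5
> > prefix and the single directory level are arbitrary choices on our
> > side, nothing GlusterFS-specific):
> >
> > for i in {1..100}; do
> >   f=test.$i
> >   # pick a subdirectory from the first two hex digits of md5(filename)
> >   d=$(printf '%s' "$f" | md5sum | cut -c1-2)
> >   mkdir -p ./mnt/$d
> >   touch ./mnt/$d/$f
> > done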
> >
> >
> >
> > Best,
> >
> > Erik Osterman
> >
> >
> >
> >
> >
> > For what it's worth, here are my configurations:
> >
> > #
> > # Master
> > #
> > volume posix0
> > type storage/posix # POSIX FS translator
> > option directory /home/glusterfs # Export this directory
> > end-volume
> >
> > volume brick0
> > type performance/io-threads
> > option thread-count 8
> > option queue-limit 1024
> > subvolumes posix0
> > end-volume
> >
> > ### Add network serving capability to above brick.
> > volume server
> > type protocol/server
> > option transport-type tcp/server # For TCP/IP transport
> > # option bind-address 192.168.1.10   # Default is to listen on all interfaces
> > option listen-port 6996 # Default is 6996
> > option client-volume-filename /etc/glusterfs/client.vol
> > subvolumes brick0
> > option auth.ip.brick0.allow * # allow access to the "brick0" volume from any IP
> > end-volume
> >
> >
> >
> > #
> > # Mirror
> > #
> > volume posix0
> > type storage/posix # POSIX FS translator
> > option directory /home/glusterfs # Export this directory
> > end-volume
> >
> > volume mirror0
> > type performance/io-threads
> > option thread-count 8
> > option queue-limit 1024
> > subvolumes posix0
> > end-volume
> >
> > ### Add network serving capability to above brick.
> > volume server
> > type protocol/server
> > option transport-type tcp/server # For TCP/IP transport
> > # option bind-address 192.168.1.11   # Default is to listen on all interfaces
> > option listen-port 6996 # Default is 6996
> > option client-volume-filename /etc/glusterfs/client.vol
> > subvolumes mirror0
> > option auth.ip.mirror0.allow * # allow access to the "mirror0" volume from any IP
> > end-volume
> >
> >
> > #
> > # Client
> > #
> >
> > ### Add client feature and attach to remote subvolume of server
> > volume brick0
> > type protocol/client
> > option transport-type tcp/client # for TCP/IP transport
> > option remote-host 216.182.237.155   # IP address of the remote brick server
> > option remote-port 6996 # default server port is 6996
> > option remote-subvolume brick0 # name of the remote volume
> > end-volume
> >
> > ### Add client feature and attach to remote mirror of brick0
> > volume mirror0
> > type protocol/client
> > option transport-type tcp/client # for TCP/IP transport
> > option remote-host 216.55.170.26   # IP address of the remote mirror server
> > option remote-port 6996 # default server port is 6996
> > option remote-subvolume mirror0 # name of the remote volume
> > end-volume
> >
> > ### Add AFR feature to brick
> > volume afr0
> > type cluster/afr
> > subvolumes brick0 mirror0
> > option replicate *:2 # All files 2 copies (RAID-1)
> > end-volume
> >
> > ### Add unify feature to cluster the servers. Associate an
> > ### appropriate scheduler that matches your I/O demand.
> > volume bricks
> > type cluster/unify
> > subvolumes afr0
> > ### ** Round Robin (RR) Scheduler **
> > option scheduler rr
> > option rr.limits.min-free-disk 2GB
> > end-volume
> >
> > ### Add performance feature
> > volume writebehind
> > type performance/write-behind
> > option aggregate-size 131072 # aggregate block size in bytes
> > subvolumes bricks
> > end-volume
> >
> > ### Add performance feature
> > volume readahead
> > type performance/read-ahead
> > option page-size 131072
> > option page-count 16
> > subvolumes writebehind
> > end-volume
> >
> >
> >
> >
> >
> >
> >
> >
>
>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at nongnu.org
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>