[Gluster-devel] Performance problems in our web server setup

Anand Avati avati at zresearch.com
Wed Jul 25 17:27:18 UTC 2007


Bernhard,
  Some updates since my last mail. Based on the discussion we have had, we
felt it would be very useful to let the user specify priorities for io-cache.
The latest TLA checkout of io-cache supports a priority specification like:

'option priority */global/*.jpg:3,*/thumbnails/*.jpg:2,*.html:1'

etc. A quick summary of what the priorities mean:

- The content of a file is cached with the priority specified for it in the
spec file. By default a file has priority '0'.

- The higher the priority value, the more affinity the file has to stay in
the cache.

- When the cache-size limit is reached, cache pruning starts with the lowest
priority. Within a priority, files are maintained in an LRU list. Only when
all pages of a given priority have been pruned is the next higher priority
considered.

It is advisable to use priority numbers starting at 1 and incrementing by 1.
In other words, giving your three types priorities like 100,10,1 is not
recommended; use 3,2,1 instead.
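
For example, in your client spec file you could load io-cache on top of your
existing top-level volume with something like this (an untested sketch; the
cache-size value is only illustrative, and the glob patterns are just an
illustration along the lines of the example above):

volume iocache
  type performance/io-cache
  option cache-size 2048MB       # illustrative; if the MB suffix is not accepted, give the value in bytes
  option priority */global/*.jpg:3,*/thumbnails/*.jpg:2,*.jpg:1  # everything else stays at priority 0
  subvolumes bricks              # "bricks" is the top volume (read-ahead) of your current client spec
end-volume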

I hope this is of use for your setup.

thanks,
avati


2007/7/25, Bernhard J. M. Grün <bernhard.gruen at googlemail.com>:
>
> Anand,
>
> the number of files is - at the moment - 13,000,000 image files. The
> size of our cluster is 2x10TB (via AFR, so 10TB usable). This means
> that we don't have nearly enough RAM to cache everything. But we could
> use at most 2 GB of RAM on each server for caching purposes.
> Our small files consist of three different types: global galleries
> (like event photos), user galleries and group galleries. The global
> galleries are, in my opinion, the only files worth caching, and for
> these global galleries only the newest files are interesting. On the
> other hand there are two image sizes: thumbnails and full-size images.
> It would be best to cache only the small thumbnails because these won't
> clutter the cache too much (size: about 4-5 KB each).
> At the moment we are using a squid cache in front of our web server to
> do exactly this job, but squid is not optimized for that many
> connections. Therefore the io-cache should be exactly the feature we
> want.
>
> Thanks for your great help!
>
> Bernhard
>
>
> 2007/7/24, Anand Avati <avati at zresearch.com>:
> > Bernhard,
> >   io-cache on the client side gives the best performance. The newer
> > versions will have advantages in loading io-cache on the server as well,
> > but for now it is designed for the client side.
> >
> > As for the optimum size for io-cache, what is the total size of all the
> > combined images being served? Would they all fit in your RAM? What is the
> > access pattern of these small files? How many bytes (of other files) are
> > generally accessed before a file is re-used? io-cache currently uses an
> > LRU algorithm to age out old pages and files.
> >
> > If they all fit in your RAM, then give a cache-size of that size plus some
> > 10% extra slack. If they don't, let us continue the discussion based on the
> > access pattern (maybe your http access_log can give some kind of hints).
> >
> > If need be, it should be possible to add some extra code into io-cache to
> > forcefully 'pin' cache pages of, say, '*.jpg' etc.
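> >
> > For instance, something along these lines in the client spec should do (an
> > untested sketch; the size value is only a placeholder for "working set plus
> > ~10% slack", and io-cache would sit on top of your existing top volume):
> >
> > volume iocache
> >   type performance/io-cache
> >   option cache-size 2048MB    # placeholder; if the MB suffix is not accepted, give the value in bytes
> >   subvolumes bricks           # "bricks" being the top volume of the current client spec
> > end-volume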
> >
> > thanks,
> > avati
> >
> >
> > 2007/7/24, Bernhard J. M. Grün < bernhard.gruen at googlemail.com>:
> > > Anand, Harris,
> > >
> > > Thank you for your help. We'll try to migrate to mainline--2.5 tonight.
> > > I really hope that it helps to speed up our setup. I'll send you the new
> > > configuration and also some new throughput benchmarks after some tests
> > > with the new version.
> > >
> > > But first I have some questions about the io-cache feature. Would you
> > > suggest using it on the server side, on the client side, or even on both
> > > sides? The servers have 8GB of RAM each and the clients have 4GB of RAM
> > > each. So what would you suggest as good values for the cache size in our
> > > scenario?
> > >
> > > I really hope the switch from mainline--2.4 to mainline--2.5 works well.
> > >
> > > Many thanks again for your work!
> > >
> > > Bernhard J. M. Grün
> > >
> > > 2007/7/24, Anand Avati <avati at zresearch.com>:
> > > > Bernhard,
> > > >  Thanks for trying glusterfs! I have some questions/suggestions -
> > > >
> > > > 1. The read-ahead translator in glusterfs--mainline--2.4 used an 'always
> > > > aggressive' mode. Probably setting a lower page-count (2?) and a page-size
> > > > of 131072 can help (see the example snippet after point 4 below). If you
> > > > are using gigabit ethernet, glusterfs can peak at 1 Gbps even without
> > > > read-ahead, so you could in fact try without read-ahead as well.
> > > >
> > > > 2. I would suggest you try whether the latest TLA on glusterfs--mainline--2.5
> > > > works well for you, and if it does, use the io-cache translator on the
> > > > client side. For your scenario (serving a lot of small files read-only)
> > > > io-cache should do a lot of good. If you can set up a trial and see how
> > > > well io-cache helps you, we will be very much interested in knowing your
> > > > results (and if possible, some numbers).
> > > >
> > > > 3. Please try the patched fuse available at -
> > > > http://ftp.zresearch.com/pub/gluster/glusterfs/fuse/fuse-2.7.0-glfs1.tar.gz
> > > >     This patched fuse greatly improves read performance, and we expect it
> > > > to complement the io-cache feature very well.
> > > >
> > > > 4. About using multiple tcp connections: the load-balancer feature is on
> > > > our roadmap. It will let you load balance over two network interfaces, or
> > > > just exploit multiple tcp connections over the same network interface. You
> > > > will have to wait for the 1.4 release for this.
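> > > >
> > > > To make point 1 concrete, the read-ahead volume in the client spec could
> > > > be toned down roughly like this (an untested sketch, just plugging in the
> > > > numbers suggested above):
> > > >
> > > > volume bricks
> > > >   type performance/read-ahead
> > > >   option page-size 131072    # 128 KB per page
> > > >   option page-count 2        # much less aggressive than the current 16
> > > >   subvolumes writeback
> > > > end-volume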
> > > >
> > > > thanks,
> > > > avati
> > > >
> > > > 2007/7/24, Bernhard J. M. Grün < bernhard.gruen at googlemail.com>:
> > > > >
> > > > > Hello!
> > > > >
> > > > > We are experiencing some performance problems with our setup at the
> > > > > moment, and we would be happy if one of you could help us out.
> > > > > This is our setup:
> > > > > Two clients connect to two servers that share the same data via AFR.
> > > > > The two servers hold about 13,000,000 smaller image files that are
> > > > > sent out to the web via the two clients.
> > > > > First I'll show you the configuration of the servers:
> > > > > volume brick
> > > > >   type storage/posix                   # POSIX FS translator
> > > > >   option directory /media/storage       # Export this directory
> > > > > end-volume
> > > > >
> > > > > volume iothreads    #iothreads can give performance a boost
> > > > >    type performance/io-threads
> > > > >    option thread-count 16
> > > > >    subvolumes brick
> > > > > end-volume
> > > > >
> > > > > ### Add network serving capability to above brick.
> > > > > volume server
> > > > >   type protocol/server
> > > > >   option transport-type tcp/server     # For TCP/IP transport
> > > > >   option listen-port 6996              # Default is 6996
> > > > >   option client-volume-filename /opt/glusterfs/etc/glusterfs/client.vol
> > > > >   subvolumes iothreads
> > > > >   option auth.ip.iothreads.allow * # Allow access to "iothreads" volume
> > > > > end-volume
> > > > >
> > > > > Now the configuration of the clients:
> > > > > ### Add client feature and attach to remote subvolume
> > > > > volume client1
> > > > >   type protocol/client
> > > > >   option transport-type tcp/client     # for TCP/IP transport
> > > > >   option remote-host 10.1.1.13         # IP address of the remote brick
> > > > >   option remote-port 6996              # default server port is 6996
> > > > >   option remote-subvolume iothreads    # name of the remote volume
> > > > > end-volume
> > > > >
> > > > > ### Add client feature and attach to remote subvolume
> > > > > volume client2
> > > > >   type protocol/client
> > > > >   option transport-type tcp/client     # for TCP/IP transport
> > > > >   option remote-host 10.1.1.14         # IP address of the remote brick
> > > > >   option remote-port 6996              # default server port is 6996
> > > > >   option remote-subvolume iothreads    # name of the remote volume
> > > > > end-volume
> > > > >
> > > > > volume afrbricks
> > > > >   type cluster/afr
> > > > >   subvolumes client1 client2
> > > > >   option replicate *:2
> > > > > end-volume
> > > > >
> > > > > volume iothreads    #iothreads can give performance a boost
> > > > >    type performance/io-threads
> > > > >    option thread-count 8
> > > > >    subvolumes afrbricks
> > > > > end-volume
> > > > >
> > > > > ### Add writeback feature
> > > > > volume writeback
> > > > >   type performance/write-behind
> > > > >   option aggregate-size 0  # unit in bytes
> > > > >   subvolumes iothreads
> > > > > end-volume
> > > > >
> > > > > ### Add readahead feature
> > > > > volume bricks
> > > > >   type performance/read-ahead
> > > > >   option page-size 65536     # unit in bytes
> > > > >   option page-count 16       # cache per file = (page-count x page-size)
> > > > >   subvolumes writeback
> > > > > end-volume
> > > > >
> > > > > We use Lighttpd as the web server to handle the web traffic, and it
> > > > > seems that the image loading is quite slow. Also, the used bandwidth
> > > > > between one client and its corresponding AFR server is low - about
> > > > > 12 MBit/s over a 1 GBit line. So there must be a bottleneck in our
> > > > > configuration. Maybe you can help us.
> > > > > We are currently using 1.3.0 (mainline--2.4 patch-131). We can't
> > > > > easily switch to mainline--2.5 at the moment because the servers are
> > > > > under high load.
> > > > >
> > > > > We have also seen that each client uses only one connection to each
> > > > > server. In my opinion this means that the iothreads subvolume on the
> > > > > client is (nearly) useless. Wouldn't it be better to establish more
> > > > > than just one connection to each server?
> > > > >
> > > > > Many thanks in advance
> > > > >
> > > > > Bernhard J. M. Grün
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Anand V. Avati
> > >
> >
> >
> >
> > --
> > Anand V. Avati
>



-- 
Anand V. Avati


