[Gluster-devel] io-cache test results and a segmentation fault with version from July 30th

Anand Avati avati at zresearch.com
Tue Jul 31 15:15:03 UTC 2007


Bernhard,
 The options for io-cache are 'page-size' and 'cache-size' (not page-count).
You specify a total cache size (say 128MB), and page-size specifies the
break-up unit of that total cache-size; 128KB is usually optimal. If a file
is smaller than a page, only the actual file size is counted towards
cache-size, not the rounded-up page size. Apologies for the documentation
being out of sync; it will be updated as soon as possible.
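 For example, based on the options above, a client-side io-cache volume
would look roughly like this (the 'writeback' subvolume name and the priority
line are taken from your client spec; the sizes are only illustrative):

volume iocache
  type performance/io-cache
  option cache-size 128MB    # total size of the cache
  option page-size 128KB     # break-up unit within the cache-size
  option priority */imagescache/galerie/*.jpg:3,*3.jpg:2,*0.jpg:1
  subvolumes writeback
end-volume
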
 In future versions we plan to support per-file page-size (similar to
priority), where you can set a big page size for very big files, which makes
page search within an inode very fast (we are still discussing possible
pitfalls).
 About the purging algorithm: a few enhancements to the current
implementation are possible, and they are already on the way (they will be
committed within the hour). The next commit of io-cache should be much
lighter on CPU for you when a lot of files are cached.
  About the segfault you hit: it seems to occur during startup
(glusterfs -s <servername>) and not during runtime of the filesystem. I have
just committed a fix for it.
 Thanks for your valuable feedback! We're interested in making io-cache work
as well as possible for you :)

avati

2007/7/31, Bernhard J. M. Grün <bernhard.gruen at googlemail.com>:
>
> Hi!
>
> After some tests with the newest version of glusterfs, we have some
> results to report.
> The new version is a lot faster (even without io-cache): throughput is
> about 1.6 times as high (from 20MBit to 32MBit output) as with the old
> version we used before.
> We were not able to use io-cache, because the glusterfs process then
> uses more and more CPU power and the system therefore becomes slower
> than without io-cache. It seems that the purging algorithm in io-cache
> is not optimal; maybe it is called too often, or the underlying data
> structure for the cache is not optimal. We tried different settings
> for page-count and page-size, such as page-count 32 with page-size 64MB,
> or page-count 128 with page-size 1MB. After some time the cache seems to
> fill up, and from then on the purge algorithm appears to run without
> pause. Maybe something like an upper and a lower threshold could solve
> that problem. Squid does this: it starts purging the cache when it
> exceeds about 90% of its full size and keeps purging as long as the
> size is larger than 75% of the full size.
>
> Another problem we discovered is the following segfault after some
> time. I just can't explain the reason for it; it seems to be load
> dependent.
>
> #0  0x00002b5f8c363eb2 in pthread_spin_lock () from /lib/libpthread.so.0
> #1  0x00000000004094fb in fetch_cbk (frame=<value optimized out>,
>     cookie=0x60c3e0, this=0x60ff80, op_ret=0, op_errno=6341600,
>     spec_data=0x6107fc "### file: client-volume.spec.sample\n\n", '#'
> <repeats 46 times>, "\n###  GlusterFS Client Volume Specification
> ##\n", '#' <repeats 46 times>, "\n\n#### CONFIG FILE RULE"...) at
> ../../libglusterfs/src/stack.h:99
> #2  0x00002b5f8cad376b in client_getspec_cbk (frame=0x60fce0,
> args=0x60fda0)
>     at client-protocol.c:4058
> #3  0x00002b5f8cad77b8 in notify (this=0x60c300, event=<value optimized
> out>,
>     data=0x60d1e0) at client-protocol.c:4364
> #4  0x00002b5f8bd2f4f2 in sys_epoll_iteration (ctx=<value optimized out>)
>     at epoll.c:53
> #5  0x0000000000409400 in fetch_spec (ctx=0x7fff1efa4900,
>     remote_host=<value optimized out>, remote_port=0x60c3c0 "9999",
>     transport=<value optimized out>) at fetch-spec.c:131
> #6  0x000000000040329d in main (argc=5, argv=0x7fff1efa4a78) at
> glusterfs.c:128
>
> The config we used in that case is the following:
> for the server:
> volume brick
>   type storage/posix                   # POSIX FS translator
>   option directory /media/storage       # Export this directory
> end-volume
>
> volume iothreads    #iothreads can give performance a boost
>    type performance/io-threads
>    option thread-count 16
>    subvolumes brick
> end-volume
>
> ### Add network serving capability to above brick.
> volume server
>   type protocol/server
>   option transport-type tcp/server     # For TCP/IP transport
>   option listen-port 9999              # Default is 6996
>   option client-volume-filename
> /opt/glusterfs-1.3.0/etc/glusterfs/client.vol
>   subvolumes iothreads
>   option auth.ip.iothreads.allow * # Allow access to "iothreads" volume
> end-volume
>
> for the client:
> volume client1
>   type protocol/client
>   option transport-type tcp/client     # for TCP/IP transport
>   option remote-host 10.1.1.13     # IP address of the remote brick
>   option remote-port 9999              # default server port is 6996
>   option remote-subvolume iothreads        # name of the remote volume
> end-volume
>
> ### Add client feature and attach to remote subvolume
> volume client2
>   type protocol/client
>   option transport-type tcp/client     # for TCP/IP transport
>   option remote-host 10.1.1.14     # IP address of the remote brick
>   option remote-port 9999              # default server port is 6996
>   option remote-subvolume iothreads        # name of the remote volume
> end-volume
>
> volume afrbricks
>   type cluster/afr
>   subvolumes client1 client2
>   option replicate *:2
>   option self-heal off
> end-volume
>
> volume iothreads    #iothreads can give performance a boost
>    type performance/io-threads
>    option thread-count 16
>    subvolumes afrbricks
> end-volume
>
> ### Add writeback feature
> volume writeback
>   type performance/write-behind
>   option aggregate-size 0  # unit in bytes
>   subvolumes iothreads
> end-volume
>
> #volume bricks
> #  type performance/io-cache
> #  option page-count 128  #128 is default option
> #  option page-size 128KB  #128KB is default option
> #  option priority */imagescache/galerie/*.jpg:3,*3.jpg:2,*0.jpg:1
> #  subvolumes writeback
> #end-volume
>
> Again, many thanks for your help!
>
> Cheers,
>
> Bernhard J. M. Grün
>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at nongnu.org
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>



-- 
Anand V. Avati


