[Gluster-users] Performance translators - an overview.

alfred de sollize desollize at gmail.com
Thu Jul 12 08:21:16 UTC 2012


Hi,
    Is there a way to get good performance when an application does small
writes? Most of the applications using NetCDF write big files (up to 100 GB),
but they use small block-sized writes (block size less than 1 KB).

--------------------------------------------------------------------------------------------------
[root at cola5 scripts]# dd if=/dev/zero of=/h1/junk bs=512 count=1024000
1024000+0 records in
1024000+0 records out
524288000 bytes (524 MB) copied, 70.7706 seconds, 7.4 MB/s
[root at cola5 scripts]# dd if=/dev/zero of=/h1/junk bs=1k count=1024000
1024000+0 records in
1024000+0 records out
1048576000 bytes (1.0 GB) copied, 59.6961 seconds, 17.6 MB/s
[root at cola5 scripts]# dd if=/dev/zero of=/h1/junk bs=16k count=64000
64000+0 records in
64000+0 records out
1048576000 bytes (1.0 GB) copied, 4.42826 seconds, 237 MB/s
-----------------------------------------------------------------------------------------------
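
As a rough check of whether the 512-byte write size itself is the problem,
rather than the total amount of data written, dd can read small records but
aggregate them into larger writes before they reach the mount (ibs/obs are
standard dd options; output not shown):

dd if=/dev/zero of=/h1/junk ibs=512 obs=16k count=1024000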

For very small block-sized writes, write-behind does not seem to help. How
can caching of small writes be improved?
Al

On Mon, Jun 4, 2012 at 11:13 AM, Raghavendra G <raghavendra at gluster.com> wrote:

> Hi,
>
> The purpose of performance translators is to decrease system call latency
> of applications and increase responsiveness of glusterfs.
>
> The standard approach used within glusterfs to decrease system call
> latency is to make sure we avoid network round-trip time as part of fop
> processing. Based on which fop we are dealing with, we have different
> translators: read-ahead, io-cache, write-behind, quick-read and md-cache.
>
>    - Though read-ahead and io-cache both serve read calls, the difference
>    is that read-ahead can serve even the first read at an offset (since it
>    would have read ahead on a read at a lower offset), while io-cache can
>    serve only repeat reads of an offset from its cache. read-ahead can have
>    a negative performance impact on random reads, in the form of cache
>    maintenance. read-ahead maintains its cache per fd, while io-cache
>    maintains a per-inode cache. Ceilings for the cache sizes can be
>    configured (see the tuning sketch after this list).
>    - write-behind takes the responsibility of storing writes in its cache
>    and syncing them to disk in the background. Because of this, the return
>    value of a write may not tell the application the fate of that write;
>    write-behind reports errors to the application in the return value of a
>    later write or of the close call. Applications that need to know whether
>    previously issued writes succeeded should do an fsync. There is another
>    option, flush-behind, which when turned on makes the flush call sent as
>    part of close happen in the background. The consequence of doing flush
>    in the background is that posix locks on that fd might not be cleared as
>    soon as close returns.
>    - quick-read optimizes reads by storing small files in its cache. It
>    fetches the contents of the entire file as part of the lookup call done
>    during path-to-inode conversion. It assumes that all opens are done with
>    the intention of reading, and hence does not actually send open to the
>    translators below if the file is cached. However, it maintains the
>    abstraction by doing the open as part of other fd-based fops (like
>    fstat, fsync etc.). Because of this, read-intensive applications, like a
>    web server serving lots of small files, can save the network round trip
>    for two fops - open and read (it used to save the close round trip as
>    well, but with close implemented as a callback of fd-destroy, that round
>    trip is eliminated altogether).
>    - md-cache is a translator that caches metadata such as stat
>    information and certain extended attributes of files.
>
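> As a rough illustration of how these caches are tuned (option names as
> found in the glusterfs 3.x volume set interface; "myvol" is a placeholder
> volume name and the values are arbitrary examples - check "gluster volume
> set help" on your release before applying any of them):
>
>    gluster volume set myvol performance.cache-size 256MB             # io-cache size ceiling
>    gluster volume set myvol performance.read-ahead off               # disable read-ahead for random reads
>    gluster volume set myvol performance.write-behind-window-size 4MB # per-file write-behind window
>    gluster volume set myvol performance.flush-behind on              # background the flush done at close
>    gluster volume set myvol performance.quick-read on                # cache small files fetched at lookup
>    gluster volume set myvol performance.stat-prefetch on             # md-cache for stats/xattrs
>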
> One of the strategies to increase responsiveness is to introduce
> asynchrony into fop processing, so that we do not block on one operation
> completing before taking up another. Asynchrony can be achieved with a
> single thread or with multiple threads. The single-threaded approach is
> effective only when there are blocking components in the system, like I/O
> with the network or disk. Performance translators do not contribute to
> this aspect (the STACK_WIND and STACK_UNWIND macros, non-blocking sockets
> etc. help there). Where io-threads (a performance translator) comes into
> the picture is in introducing parallel processing as a call proceeds
> through the gluster translator graph. Apart from introducing parallelism,
> io-threads implements priority-based processing of fops, which helps to
> increase responsiveness. There are other threads within a glusterfs
> process that are not managed by io-threads, like the fuse reader, the
> posix janitor, the thread that polls network sockets, the threads
> processing send/receive completion queues for infiniband, threads
> introduced by syncops, the thread processing timer events, etc.
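>
> As a small illustration, the number of threads io-threads uses can be
> adjusted through the same volume set interface (option name from the 3.x
> releases; "myvol" is a placeholder and 32 is only an example value):
>
>    gluster volume set myvol performance.io-thread-count 32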
>
> regards,
> --
> Raghavendra G
>
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>
>