[Gluster-users] GlusterFS 3.0.2 small file read performance benchmark

Sat Feb 27 18:09:39 UTC 2010

After rereading my earlier mail, I'm under the impression that I didn't
make it clear enough: we don't have a purely read-only workload, but a
mostly read-only one. This is why we tried GlusterFS with AFR, so that
we can have a multi-master read/write filesystem with a persistent copy
on each node. If we didn't need write access every now and then, we
could have gone with plain copies of the data.

Now another idea is the following, based on the fact that the local ext4
filesystem + VFS cache is *much* faster:

> GlusterFS with populated IO-Cache:
> real    0m38.576s
> user    0m3.356s
> sys     0m6.076s

# Work directly on the back-end (this is read-only...)
$ cd /mnt/brick/test/glusterfs/data

# Ext4 without VFS Cache:
$ echo 3 > /proc/sys/vm/drop_caches
$ for ((i=0;i<100;i++)); do tar cf - . > /dev/null & done; time wait
real	0m1.598s
user	0m2.136s
sys	0m3.696s

# Ext4 with VFS Cache:
$ for ((i=0;i<100;i++)); do tar cf - . > /dev/null & done; time wait
real	0m1.312s
user	0m2.264s
sys	0m3.256s

So the idea now is to bind-mount the backend filesystem *read-only* and
use it for all read operations. For all write operations, use the
GlusterFS mountpoint which provides locking etc. (This implies some sort
of Read/Write splitting, but we can do that...)
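
To make the read/write splitting a bit more concrete, here is a minimal
sketch of what I have in mind. The two mountpoint names and the
resolve_path helper are made up for illustration, and I'm assuming the
bind-mounted brick directory has the same layout as the volume root:

```shell
#!/bin/sh
# Hypothetical locations, adjust to the local setup.
BACKEND_RO="${BACKEND_RO:-/mnt/brick-ro}"     # read-only bind mount of the brick (assumed name)
GLUSTER_MNT="${GLUSTER_MNT:-/mnt/glusterfs}"  # GlusterFS client mountpoint (assumed name)

# One-time setup as root (left as comments, nothing is mounted here):
#   mount --bind /mnt/brick/test/glusterfs/data "$BACKEND_RO"
#   mount -o remount,ro,bind "$BACKEND_RO"

# resolve_path MODE RELATIVE_PATH
# Prints the path to use: the read-only backend for reads,
# the GlusterFS mountpoint (with locking etc.) for writes.
resolve_path() {
    mode=$1
    rel=$2
    case "$mode" in
        read)  printf '%s/%s\n' "$BACKEND_RO" "$rel" ;;
        write) printf '%s/%s\n' "$GLUSTER_MNT" "$rel" ;;
        *)     echo "resolve_path: unknown mode '$mode'" >&2; return 1 ;;
    esac
}
```

A reader would then do something like cat "$(resolve_path read foo/bar)",
while a writer would go through "$(resolve_path write foo/bar)".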

The downside is that read operations on the backend won't trigger the
GlusterFS on-demand self-healing. But since 99% of our read-only files
are "write once, read many times", this could work out. After a node
failure, a simple "ls -lR" on the GlusterFS mountpoint should self-heal
everything, and the backend would then be fine too. So the chance of
reading a broken file should be very low, shouldn't it?
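
The post-failure walk could look roughly like the sketch below. It
assumes, as I do above, that a lookup through the client mountpoint is
what triggers self-heal; the heal_walk name is made up, and the mount
path in the usage line is only an example:

```shell
#!/bin/sh
# heal_walk MOUNTPOINT
# Stats every entry below MOUNTPOINT so that each one is looked up once.
# Must be run against the GlusterFS client mount, not the backend.
heal_walk() {
    mnt=$1
    if [ ! -d "$mnt" ]; then
        echo "heal_walk: $mnt is not a directory" >&2
        return 1
    fi
    # Batch the stat calls; the output itself is not interesting.
    find "$mnt" -exec stat {} + > /dev/null
}
```

After a node comes back, one would run e.g. heal_walk /mnt/glusterfs
once before sending reads to the backend on that node again.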

Any comments on this idea? Is there something else that could go wrong
by using the backend in a pure read-only fashion that I've missed?

Any ideas why the GlusterFS performance/io-cache translator with a
cache-timeout of 60 is still so slow? Is there any way to *really* cache
metadata and file data on GlusterFS _without_ hitting the network, and
thus avoid the very poor small-file performance caused by network
latency?

Are there any plans to implement support for FS-Cache [1] (CacheFS,
Cachefiles), which ships with recent Linux kernels? Or to improve
io-cache in a similar way?

[1] http://people.redhat.com/steved/fscache/docs/FS-Cache.pdf

Lots of questions... :)

Best regards,
John