[Gluster-devel] glusterfs client and page cache

Wed May 30 15:16:16 UTC 2012

Hi all,

I've been playing with a little hack recently to add a gluster mount
option to support FOPEN_KEEP_CACHE, and I wanted to solicit some thoughts
on whether there's value in finding an intelligent way to support this
functionality. To provide some context:

Our current behavior with regard to fuse is that page cache is utilized
by fuse, from what I can tell, just about in the same manner as a
typical local fs. The primary difference is that by default, the address
space mapping for an inode is completely invalidated on open. So for
example, if process A opens and reads a file in a loop, subsequent reads
are served from cache (bypassing fuse and gluster). If process B steps
in and opens the same file, the cache is flushed and the next reads from
either process are passed down through fuse. The FOPEN_KEEP_CACHE option
simply disables this cache flush on open behavior.
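
To make the open behavior concrete, here is a toy model of it in plain C.
This is only a sketch of the semantics described above, not kernel or
gluster code: struct toy_mapping, toy_finish_open() and toy_read_page()
are names made up for illustration. The two flag values match the
kernel's fuse.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Open-reply flag bits, same values as in the kernel's fuse.h. */
#define FOPEN_DIRECT_IO  (1u << 0)
#define FOPEN_KEEP_CACHE (1u << 1)

/* Hypothetical stand-in for an inode's page cache mapping. */
struct toy_mapping {
    size_t cached_pages;
};

/* Models what happens to the mapping when an open reply arrives. */
void toy_finish_open(struct toy_mapping *m, unsigned open_flags)
{
    assert(m != NULL);
    if (!(open_flags & FOPEN_KEEP_CACHE))
        m->cached_pages = 0;    /* default: drop everything on open */
}

/* Models reading one page; returns true if the read went down to fuse. */
bool toy_read_page(struct toy_mapping *m)
{
    assert(m != NULL);
    if (m->cached_pages > 0)
        return false;           /* served from the page cache */
    m->cached_pages = 1;        /* filled in by the reply from fuse */
    return true;
}
```

In this model, a second open without the flag forces the next read back
through fuse, while an open with FOPEN_KEEP_CACHE leaves the next read
cached. With the libfuse high-level API, the corresponding knob is the
keep_cache bit in struct fuse_file_info, set from the open handler.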

The following are some notes on my experimentation thus far:

- With FOPEN_KEEP_CACHE, fuse currently only invalidates on file size
changes. This is a problem in that I can rewrite some or all of a file
from another client and the cached client wouldn't notice. I've sent a
patch to fuse-devel to also invalidate on mtime changes (similar to
nfsv3 or cifs), so we'll see how well that is received. fuse also
supports a range based invalidation notification that we could take
advantage of if necessary.

- I can reproduce a measurable performance benefit in the local/cached
read situation. For example, a kernel compile against a source tree in a
gluster volume (build output to local storage) takes 6 minutes with the
cache kept and no other xlators, compared to just under 8 minutes with
the default graph, 9.5 minutes with only the client xlator, and 1:09
locally.

- Some of the specific differences from current io-cache caching:
	- io-cache supports time based invalidation and tunables such as cache
size and priority. The page cache has no such controls.
	- io-cache invalidates more frequently on various fops. It also looks
like we invalidate on writes and don't take advantage of the write data
most recently sent, whereas page cache writes are cached (errors
notwithstanding).
	- Page cache obviously has tighter integration with the system (e.g.,
drop_caches controls, more specific reporting, ability to drop cache
when memory is needed).
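
To illustrate the first note (size-only versus size-plus-mtime
invalidation), here is a minimal sketch of the attribute comparison. It
is not the actual fuse code or my patch; struct toy_attr and
toy_attr_changed() are hypothetical names, and the check_mtime switch
just stands in for "with the proposed change applied".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the attributes cached for an inode. */
struct toy_attr {
    uint64_t size;
    int64_t  mtime_sec;
    uint32_t mtime_nsec;
};

/*
 * Returns true if the cached pages should be invalidated given freshly
 * fetched attributes. With check_mtime == false this models the
 * size-only check; with check_mtime == true it also compares mtime.
 */
bool toy_attr_changed(const struct toy_attr *cached,
                      const struct toy_attr *fresh,
                      bool check_mtime)
{
    assert(cached != NULL && fresh != NULL);
    if (cached->size != fresh->size)
        return true;
    if (!check_mtime)
        return false;
    return cached->mtime_sec != fresh->mtime_sec ||
           cached->mtime_nsec != fresh->mtime_nsec;
}
```

A same-size rewrite from another client changes only mtime, so the
size-only variant reports no change and the stale pages stay in cache.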

All in all, I'm curious what people think about enabling the cache
behavior in gluster. We could support anything from the basic mount
option I'm currently using (i.e., similar to attribute/dentry caching)
to something integrated with io-cache (doing invalidations when
necessary), or maybe even something eventually along the lines of the
nfs weak cache consistency model where it validates the cache after
every fop based on file attributes.
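
For the last option, this is roughly what I have in mind, written as a
toy model loosely based on the NFSv3 pre-op/post-op attribute idea. The
names (struct wcc_attr_lite, struct wcc_cache, wcc_after_fop()) are
invented for the sketch, and it assumes each fop reply can carry both
pre-op and post-op attributes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified attribute pair used for validation (hypothetical). */
struct wcc_attr_lite {
    uint64_t size;
    int64_t  mtime;
};

/* Per-inode cache state: last known attributes plus a validity bit. */
struct wcc_cache {
    struct wcc_attr_lite attr;
    bool data_valid;
};

/*
 * Called after every fop. If the pre-op attributes match what we had
 * cached, the only change was our own operation and the cached data
 * stays valid. Otherwise someone else modified the file in between,
 * so the cached data is marked invalid. Either way we remember the
 * post-op attributes for the next comparison.
 */
void wcc_after_fop(struct wcc_cache *c,
                   const struct wcc_attr_lite *pre,
                   const struct wcc_attr_lite *post)
{
    assert(c != NULL && pre != NULL && post != NULL);
    if (pre->size != c->attr.size || pre->mtime != c->attr.mtime)
        c->data_valid = false;
    c->attr = *post;
}
```

The point of comparing against the pre-op attributes is that a client's
own writes do not have to throw away its own cached pages.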

In general, are there other big issues/questions that would need to be
explored before this is useful (e.g., the size invalidation issue)? Are
there other performance tests that should be run? Thoughts
appreciated. Thanks.

Brian