[Gluster-devel] glusterfs client and page cache

Wed May 30 19:32:50 UTC 2012

Brian,
  You are right, today we hardly leverage the page cache in the kernel.
When Gluster started and the performance translators were implemented, fuse
invalidation support did not exist, and since that support landed in
upstream fuse we have not made effective use of it. We could do a lot
more with the invalidation changes.

For the consistency concern where an open fd continues to refer to the
local page cache: if that is a problem, today you need to mount with
--enable-direct-io-mode to bypass the page cache altogether (this is very
different from O_DIRECT open() support). On the other hand, to use the
fuse invalidation APIs, promote use of the page cache, and still be
consistent, we need to gear up the glusterfs framework in three steps:
first, implement server-originated messaging support; second, build some
kind of opportunistic locking or leases to notify glusterfs clients about
modifications from a second client; and third, implement hooks in the
client-side listener to do things like send fuse invalidations, purge
pages in io-cache, or flush pending writes in write-behind. This needs to
happen, but we are short on resources to prioritize it sooner :-)
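
The three pieces described above (server-originated messages, a lease-like
record of who is caching what, and a client-side listener hook) can be
sketched as a toy model. This is only an illustration under simplifying
assumptions, not glusterfs code: ToyServer, ToyClient, fetch, store and
on_recall are hypothetical names, and a real client hook would send a fuse
invalidation, purge io-cache pages or flush write-behind rather than pop a
dict entry.

```python
class ToyServer:
    """Hypothetical server that remembers which clients cache each path."""

    def __init__(self, files):
        self.files = dict(files)   # path -> bytes
        self.leases = {}           # path -> set of clients caching that path

    def fetch(self, client, path):
        # Record the reader as a lease holder so it can be notified later.
        self.leases.setdefault(path, set()).add(client)
        return self.files[path]

    def store(self, writer, path, data):
        self.files[path] = data
        # Server-originated message: tell every other holder to drop its copy.
        for holder in self.leases.get(path, set()) - {writer}:
            holder.on_recall(path)
        self.leases[path] = {writer}


class ToyClient:
    """Hypothetical client with a local cache and a listener hook."""

    def __init__(self, name, server):
        self.name = name
        self.server = server
        self.cache = {}            # path -> bytes, stands in for the page cache
        self.recalls = []          # paths the server asked us to invalidate

    def read(self, path):
        if path not in self.cache:
            self.cache[path] = self.server.fetch(self, path)
        return self.cache[path]

    def write(self, path, data):
        self.server.store(self, path, data)
        self.cache[path] = data

    def on_recall(self, path):
        # Listener hook: drop the stale copy so the next read goes to the server.
        self.cache.pop(path, None)
        self.recalls.append(path)


def demo():
    server = ToyServer({"/f": b"old"})
    a = ToyClient("a", server)
    b = ToyClient("b", server)
    first = a.read("/f")       # cached on client a
    b.write("/f", b"new")      # server notifies client a
    second = a.read("/f")      # refetched after the recall
    return first, second, a.recalls
```

In this model the second read on client a only sees the new data because
the server pushed a recall; without the notification step, client a would
keep serving b"old" from its local cache.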

Avati

On Wed, May 30, 2012 at 8:16 AM, Brian Foster <bfoster at redhat.com> wrote:

> Hi all,
>
> I've been playing with a little hack recently to add a gluster mount
> option to support FOPEN_KEEP_CACHE and I wanted to solicit some thoughts
> on whether there's value in finding an intelligent way to support this
> functionality. To provide some context:
>
> Our current behavior with regard to fuse is that page cache is utilized
> by fuse, from what I can tell, just about in the same manner as a
> typical local fs. The primary difference is that by default, the address
> space mapping for an inode is completely invalidated on open. So for
> example, if process A opens and reads a file in a loop, subsequent reads
> are served from cache (bypassing fuse and gluster). If process B steps
> in and opens the same file, the cache is flushed and the next reads from
> either process are passed down through fuse. The FOPEN_KEEP_CACHE option
> simply disables this cache-flush-on-open behavior.
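
The process A / process B scenario above can be modelled in a few lines.
This is a rough sketch, not kernel or gluster code: ToyInode and simulate
are hypothetical names, and the keep_cache argument merely stands in for
the effect of FOPEN_KEEP_CACHE on open.

```python
class ToyInode:
    """Hypothetical stand-in for an inode with a per-inode page cache."""

    def __init__(self, backing: bytes):
        self.backing = backing     # what the filesystem daemon would return
        self.pages = None          # cached copy, None when the cache is empty
        self.daemon_reads = 0      # reads that had to be passed down via fuse

    def open(self, keep_cache: bool = False) -> None:
        # Default behaviour described above: drop cached pages on every open.
        if not keep_cache:
            self.pages = None

    def read(self) -> bytes:
        if self.pages is None:
            self.daemon_reads += 1
            self.pages = self.backing
        return self.pages


def simulate(keep_cache: bool) -> int:
    """Return how many reads reach the daemon in the A/B scenario."""
    inode = ToyInode(b"hello")
    inode.open(keep_cache)   # process A opens the file
    inode.read()             # first read misses and goes to the daemon
    inode.read()             # second read is served from cache
    inode.open(keep_cache)   # process B opens the same file
    inode.read()             # A reads again
    inode.read()             # B reads
    return inode.daemon_reads
```

With these assumptions, simulate(False) counts two reads that reach the
daemon (one after each open) and simulate(True) counts one.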
>
> The following are some notes on my experimentation thus far:
>
> - With FOPEN_KEEP_CACHE, fuse currently only invalidates on file size
> changes. This is a problem in that I can rewrite some or all of a file
> from another client and the cached client wouldn't notice. I've sent a
> patch to fuse-devel to also invalidate on mtime changes (similar to
> nfsv3 or cifs), so we'll see how well that is received. fuse also
> supports a range based invalidation notification that we could take
> advantage of if necessary.
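
The difference between size-only and size-plus-mtime validation can be
shown with a small comparison function. This is a minimal sketch of the
idea only; Attr and cache_still_valid are hypothetical names and do not
correspond to fuse or gluster structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attr:
    """Hypothetical subset of file attributes used for cache validation."""
    size: int
    mtime: float


def cache_still_valid(cached: Attr, fresh: Attr, check_mtime: bool) -> bool:
    """Decide whether cached pages may be kept given freshly fetched attrs."""
    if cached.size != fresh.size:
        return False
    if check_mtime and cached.mtime != fresh.mtime:
        return False
    return True


# Another client rewrites the file in place: same size, newer mtime.
before = Attr(size=4096, mtime=100.0)
after = Attr(size=4096, mtime=250.0)

size_only = cache_still_valid(before, after, check_mtime=False)
with_mtime = cache_still_valid(before, after, check_mtime=True)
```

Here size_only is True, so stale pages would be kept, while with_mtime is
False, so the cache would be dropped.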
>
> - I can reproduce a measurable performance benefit in the local/cached read
> situation. For example, running a kernel compile against a source tree
> in a gluster volume (no other xlators and build output to local storage)
> improves to 6 minutes from just under 8 minutes with the default graph
> (9.5 minutes with only the client xlator and 1:09 locally).
>
> - Some of the specific differences from current io-cache caching:
>        - io-cache supports time based invalidation and tunables such as
>          cache size and priority. The page cache has no such controls.
>        - io-cache invalidates more frequently on various fops. It also
>          looks like we invalidate on writes and don't take advantage of
>          the write data most recently sent, whereas page cache writes are
>          cached (errors notwithstanding).
>        - Page cache obviously has tighter integration with the system
>          (i.e., drop_caches controls, more specific reporting, ability to
>          drop cache when memory is needed).
>
> All in all, I'm curious what people think about enabling the cache
> behavior in gluster. We could support anything from the basic mount
> option I'm currently using (i.e., similar to attribute/dentry caching)
> to something integrated with io-cache (doing invalidations when
> necessary), or maybe even something eventually along the lines of the
> nfs weak cache consistency model where it validates the cache after
> every fop based on file attributes.
>
> In general, are there other big issues/questions that would need to be
> explored before this is useful (i.e., the size invalidation issue)? Are
> there other performance tests that should be explored? Thoughts
> appreciated. Thanks.
>
> Brian
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at nongnu.org
> https://lists.nongnu.org/mailman/listinfo/gluster-devel
>