[Gluster-devel] glusterfs client and page cache
Brian Foster
bfoster at redhat.com
Wed May 30 23:10:58 UTC 2012
On 05/30/2012 03:32 PM, Anand Avati wrote:
> Brian,
> You are right, today we hardly leverage the page cache in the kernel.
> When Gluster started and performance translators were implemented, the
> fuse invalidation support did not exist, and since that support was
> brought in upstream fuse we haven't leveraged that effectively. We can
> actually do a lot more smart things using the invalidation changes.
>
> For the consistency concerns where an open fd continues to refer to
> local page cache - if that is a problem, today you need to mount with
> --enable-direct-io-mode to bypass the page cache altogether (this is
> very different from O_DIRECT open() support). On the other hand, to
> utilize the fuse invalidation APIs and promote using the page cache and
> still be consistent, we need to gear up glusterfs framework by first
> implementing server originated messaging support, then build some kind
> of opportunistic locking or leases to notify glusterfs clients about
> modifications from a second client, and third implement hooks in the
> client side listener to do things like sending fuse invalidations or
> purge pages in io-cache or flush pending writes in write-behind etc.
> This needs to happen, but we're short on resources to prioritize this
> sooner :-)
>
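The three pieces described above (server-originated messages, some notification of a second client's modifications, and client-side hooks) could be sketched roughly as follows. This is not GlusterFS code: `InvalidationListener`, `recording_hook`, the (gfid, offset, length) message shape, and the order in which the hooks run are all assumptions made up for illustration, and the hooks only record what a real implementation would do.

```python
from typing import Callable, List, Tuple

# A hook receives (gfid, offset, length) for the modified range.
Hook = Callable[[str, int, int], None]


class InvalidationListener:
    """Hypothetical client-side listener for server-originated messages."""

    def __init__(self) -> None:
        self._hooks: List[Hook] = []

    def register(self, hook: Hook) -> None:
        self._hooks.append(hook)

    def on_server_message(self, gfid: str, offset: int, length: int) -> None:
        # Called when the server reports a modification by another client.
        # Hooks run in registration order.
        for hook in self._hooks:
            hook(gfid, offset, length)


def recording_hook(name: str, log: List[Tuple[str, str, int, int]]) -> Hook:
    """Build a stand-in hook that only records what it would have done."""
    def hook(gfid: str, offset: int, length: int) -> None:
        log.append((name, gfid, offset, length))
    return hook


def demo() -> List[Tuple[str, str, int, int]]:
    log: List[Tuple[str, str, int, int]] = []
    listener = InvalidationListener()
    # Assumed ordering: flush local writes first, then drop cached data.
    for name in ("write-behind: flush pending writes",
                 "io-cache: purge pages",
                 "fuse: send invalidation"):
        listener.register(recording_hook(name, log))
    listener.on_server_message("gfid-1234", 0, 4096)
    return log
```

Calling `demo()` returns one log entry per hook for the notified range, which is the fan-out the listener would have to perform for each message from the server.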
Thanks for the context, Avati. The fuse patch I sent led to a similar
thought process with regard to finer grained invalidation. So far it
seems well received, and as I understand it, we can also utilize that
mechanism to do full invalidations from gluster on older fuse modules
that wouldn't have that fix. I'll look into incorporating that into what
I have so far and making it available for review.
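As a rough illustration of the invalidation rules under discussion, the sketch below collapses them into a single decision function. It is a simplified Python model, not fuse or gluster code: `Attr`, `should_invalidate`, and the `check_mtime` switch are hypothetical names, and only the `FOPEN_KEEP_CACHE` bit value is taken from the kernel's fuse header. `check_mtime=False` models the current size-only behavior; `check_mtime=True` models the behavior proposed in the fuse patch.

```python
from dataclasses import dataclass
from typing import Optional

# Same bit value as FOPEN_KEEP_CACHE in the kernel's fuse.h.
FOPEN_KEEP_CACHE = 1 << 1


@dataclass(frozen=True)
class Attr:
    """Hypothetical holder for the attributes compared here."""
    size: int
    mtime: float


def should_invalidate(open_flags: int,
                      cached: Optional[Attr],
                      fresh: Attr,
                      check_mtime: bool = True) -> bool:
    """Return True if the cached pages for the inode should be dropped."""
    if not open_flags & FOPEN_KEEP_CACHE:
        return True   # default behavior: invalidate on every open
    if cached is None:
        return False  # no previously seen attributes to compare against
    if cached.size != fresh.size:
        return True   # size change invalidates even with keep-cache
    # Proposed addition: an mtime change also invalidates.
    return check_mtime and cached.mtime != fresh.mtime
```

With `FOPEN_KEEP_CACHE` set and `check_mtime=False`, a same-size rewrite from another client (new mtime, same size) returns False, which is the stale-cache case the mtime check is meant to close.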
Brian
> Avati
>
> On Wed, May 30, 2012 at 8:16 AM, Brian Foster <bfoster at redhat.com> wrote:
>
> Hi all,
>
> I've been playing with a little hack recently to add a gluster mount
> option to support FOPEN_KEEP_CACHE and I wanted to solicit some thoughts
> on whether there's value in finding an intelligent way to support this
> functionality. To provide some context:
>
> Our current behavior with regard to fuse is that page cache is utilized
> by fuse, from what I can tell, just about in the same manner as a
> typical local fs. The primary difference is that by default, the address
> space mapping for an inode is completely invalidated on open. So for
> example, if process A opens and reads a file in a loop, subsequent reads
> are served from cache (bypassing fuse and gluster). If process B steps
> in and opens the same file, the cache is flushed and the next reads from
> either process are passed down through fuse. The FOPEN_KEEP_CACHE option
> simply disables this cache flush-on-open behavior.
>
> The following are some notes on my experimentation thus far:
>
> - With FOPEN_KEEP_CACHE, fuse currently only invalidates on file size
> changes. This is a problem in that I can rewrite some or all of a file
> from another client and the cached client wouldn't notice. I've sent a
> patch to fuse-devel to also invalidate on mtime changes (similar to
> nfsv3 or cifs), so we'll see how well that is received. fuse also
> supports a range based invalidation notification that we could take
> advantage of if necessary.
>
> - I can reproduce a measurable performance benefit in the local/cached read
> situation. For example, running a kernel compile against a source tree
> in a gluster volume (no other xlators and build output to local storage)
> improves to 6 minutes from just under 8 minutes with the default graph
> (9.5 minutes with only the client xlator and 1:09 locally).
>
> - Some of the specific differences from current io-cache caching:
>   - io-cache supports time based invalidation and tunables such as cache
>     size and priority. The page cache has no such controls.
>   - io-cache invalidates more frequently on various fops. It also looks
>     like we invalidate on writes and don't take advantage of the write
>     data most recently sent, whereas page cache writes are cached (errors
>     notwithstanding).
>   - Page cache obviously has tighter integration with the system (i.e.,
>     drop_caches controls, more specific reporting, ability to drop cache
>     when memory is needed).
>
> All in all, I'm curious what people think about enabling the cache
> behavior in gluster. We could support anything from the basic mount
> option I'm currently using (i.e., similar to attribute/dentry caching)
> to something integrated with io-cache (doing invalidations when
> necessary), or maybe even something eventually along the lines of the
> nfs weak cache consistency model where it validates the cache after
> every fop based on file attributes.
>
> In general, are there other big issues/questions that would need to be
> explored before this is useful (i.e., the size invalidation issue)? Are
> there other performance tests that should be explored? Thoughts
> appreciated. Thanks.
>
> Brian
>