[Gluster-devel] GlusterFS cache architecture

Raghavendra Gowdappa rgowdapp at redhat.com
Tue Sep 1 03:53:46 UTC 2015



----- Original Message -----
> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> To: "Oleksandr Natalenko" <oleksandr at natalenko.name>
> Cc: gluster-devel at gluster.org
> Sent: Tuesday, September 1, 2015 9:20:04 AM
> Subject: Re: [Gluster-devel] GlusterFS cache architecture
> 
> 
> 
> ----- Original Message -----
> > From: "Oleksandr Natalenko" <oleksandr at natalenko.name>
> > To: gluster-devel at gluster.org
> > Sent: Monday, August 31, 2015 7:37:51 PM
> > Subject: [Gluster-devel] GlusterFS cache architecture
> > 
> > Hello.
> > 
> > I'm trying to investigate how GlusterFS manages its cache on both the
> > server and client side, but unfortunately I cannot find any exhaustive,
> > relevant, and up-to-date information.
> > 
> > The setup is that we have, say, 2 GlusterFS nodes (server_a and server_b)
> > with a replicated volume some_volume. We also have several clients (say,
> > client_1 and client_2) that mount some_volume and manipulate files on it
> > (let's assume some_volume contains web-related assets, and
> > client_1/client_2 are web servers). There is also client_3, which does
> > web-related deployments to some_volume (let's assume that client_3 is a
> > web developer).
> > 
> > We would like to use a multilayered cache scheme that involves the
> > filesystem cache (on both the client and server sides) as well as the
> > web server cache.
> > 
> > So, my questions are:
> > 
> > 1) do caching-related items (performance.cache-size,
> > performance.cache-min-file-size, performance.cache-max-file-size etc.)
> > affect the server side only?
> 
> Actually, caching is on the client side (this caching aims to keep network
> and disk latency from adding to our fop, i.e. file operation, latency).
> There is no server-side caching in glusterfs as of now (except for whatever
> caching the underlying OS/drivers provide on the backend).
> 
> > 2) are there any tunables that affect client side caching?
> 
> Yes. The basic tunables one needs to be aware of are the ones affecting
> cache sizes. There are also tunables that trade consistency against
> performance in glusterfs. These consistency-related tunables are mostly
> (but not only) in write-behind (like strict-ordering, flush-behind etc).
> There are various timeouts in each xlator that can be configured to tune
> cache coherency. "gluster volume set help" should give you a starting point.

If you don't find documentation anywhere, you can look into the source code of each xlator for the global definition of an array named "options", which is of type "struct volume_options" :). Each entry also carries a basic, few-line description of what the option is supposed to do.
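
To make that search less tedious, here is a rough sketch (not part of glusterfs) that walks a source checkout and lists the option keys found after each "struct volume_options" array declaration. The function name find_option_keys and both regular expressions are my own assumptions about how the declarations are laid out (one `.key = {"..."}` initializer per option); entries written differently would be missed.

```python
import os
import re

# Assumed layout: "struct volume_options <name>[] = { ... }" in a .c file.
DECL_RE = re.compile(r"struct\s+volume_options\s+\w+\s*\[\s*\]")
# Assumed layout of one entry: .key = {"option-name"}
KEY_RE = re.compile(r"\.key\s*=\s*\{\s*\"([^\"]+)\"")


def find_option_keys(root):
    """Map each .c file under root that declares an options array
    to the list of option keys found after that declaration."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".c"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                text = fh.read()
            match = DECL_RE.search(text)
            if match is None:
                continue  # this file declares no options array
            found[path] = KEY_RE.findall(text[match.end():])
    return found
```

Calling find_option_keys on the xlators directory of a checkout should return a mapping from file path to the option names declared there; the description of each option sits next to its key in the same initializer.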

> 
> > 3) how is client-side caching performed, if it is at all (we are talking
> > about read cache only; write cache is not interesting to us)?
> 
> Client-side read caching is done across multiple xlators:
> 
> 1. read-ahead: to boost performance during sequential reads. We read "ahead"
> of the application, so that data can be in our read cache by the time the
> application requests it.
> 
> 2. io-cache: to boost performance if the application "re-reads" the same
> region of a file. We cache after the application has requested some data,
> so that subsequent accesses are served from io-cache.
> 
> 3. quick-read (in conjunction with open-behind): to boost reads on small
> files. Quick-read caches the entire file during lookup. Any further opens
> are "faked" by open-behind, assuming that the application is doing the open
> solely to read the file (which is already cached anyway). If the
> application does a different fop, then an fd is opened and the fop is
> performed after a successful open. Quick-read aims to save the time spent
> on an open, multiple reads and a release over the network.
> 
> 4. md-cache (or stat-prefetch): caches metadata (such as iatt, the gluster
> equivalent of stat, and user xattrs).
> 
> 5. readdir-ahead: similar to read-ahead, but for directory entries during
> readdir. This helps to boost performance of readdir.
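
The difference between read-ahead and io-cache in the list above can be illustrated with a toy model. This is a deliberately simplified, synchronous sketch and not how the xlators are implemented: Backend, CachingReader, the page size and the window size are all hypothetical names and values. A sequential miss pulls a whole window of pages in one round trip (the read-ahead idea), and pages stay cached so that re-reads cost nothing (the io-cache idea).

```python
class Backend:
    """Stands in for the server; each fetch models one network round trip."""

    def __init__(self, data, page_size):
        self.data = data
        self.page_size = page_size
        self.fetches = 0

    def fetch_pages(self, first_page, count):
        self.fetches += 1
        pages = {}
        for page in range(first_page, first_page + count):
            start = page * self.page_size
            chunk = self.data[start:start + self.page_size]
            if chunk:  # skip pages past the end of the data
                pages[page] = chunk
        return pages


class CachingReader:
    """Toy client: window-sized fetch on sequential misses, cached re-reads."""

    def __init__(self, backend, window=4):
        self.backend = backend
        self.window = window
        self.pages = {}          # page number -> bytes
        self.next_expected = 0   # page a sequential reader would ask for next

    def read_page(self, page):
        if page not in self.pages:
            # Sequential access: read ahead of the application.
            # Random access: fetch only the requested page.
            count = self.window if page == self.next_expected else 1
            self.pages.update(self.backend.fetch_pages(page, count))
        self.next_expected = page + 1
        return self.pages.get(page, b"")
```

In this model, reading eight pages in order with a window of four costs two round trips instead of eight, and reading any of those pages again costs none.
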
> 
> 
> > 4) how and in what cases is the client cache discarded (and how does that
> > relate to the upcall framework)?
> 
> As of now, the read cache is discarded based on the availability of free
> space in the cache and on timeouts (the age of data in the cache). Currently
> upcall is not used to address cache-coherency issues, but it could be used
> for that in the future.
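
As an illustration of those two discard rules (free space and age), here is a minimal sketch. It is not the glusterfs implementation: the class name TimedLRUCache is hypothetical, and choosing the least recently used entry as the one to evict when space runs out is my assumption, since the text above only says that free space matters.

```python
import time
from collections import OrderedDict


class TimedLRUCache:
    """Toy read cache: entries expire by age and are evicted when full."""

    def __init__(self, max_bytes, timeout, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.timeout = timeout          # maximum age of an entry, in seconds
        self.clock = clock              # injectable so tests can control time
        self.entries = OrderedDict()    # key -> (data, stored_at), LRU first
        self.used = 0

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        data, stored_at = entry
        if self.clock() - stored_at > self.timeout:
            self._drop(key)             # too old: discard, caller must refetch
            return None
        self.entries.move_to_end(key)   # mark as most recently used
        return data

    def put(self, key, data):
        if key in self.entries:
            self._drop(key)
        if len(data) > self.max_bytes:
            return                      # would never fit; do not cache
        while self.used + len(data) > self.max_bytes:
            self._drop(next(iter(self.entries)))  # evict least recently used
        self.entries[key] = (data, self.clock())
        self.used += len(data)

    def _drop(self, key):
        data, _stored_at = self.entries.pop(key)
        self.used -= len(data)
```

Note that nothing here is told when the file changes on the server; stale data simply ages out after the timeout, which is the coherency gap that upcall could close.
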
> 
> > 
> > Ideally, there should be some documentation that covers the general
> > GlusterFS cache workflow.
> > 
> > Any info would be appreciated. Thanks.
> > 
> > --
> > Oleksandr post-factum Natalenko, MSc
> > pf-kernel community
> > https://natalenko.name/
> > _______________________________________________
> > Gluster-devel mailing list
> > Gluster-devel at gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel
> > 
> 

