[Gluster-devel] md-cache improvements

Wed Aug 10 17:05:58 UTC 2016

There have been recurring discussions within the gluster community about building on the existing support for md-cache and upcalls to improve performance for small-file workloads. In certain cases, "lookup amplification" dominates data transfers, i.e. the cumulative round-trip time of multiple LOOKUPs from the client negates the benefit of faster backend storage.

To tackle this problem, one suggestion is to use md-cache more aggressively than is currently done to cache inodes on the client. The inodes would stay cached until the server invalidates them.
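
To make the idea concrete, here is a minimal Python sketch of "cache until the server invalidates". It is not md-cache code: the class, the `fetch_from_server` callable, and the `on_upcall_invalidate` hook are hypothetical names used only for illustration, and keying entries by gfid is an assumption.

```python
class MetadataCache:
    """Toy client-side cache: an entry lives until the server invalidates it."""

    def __init__(self, fetch_from_server):
        # Hypothetical callable standing in for a LOOKUP round trip:
        # takes a gfid, returns a metadata dict.
        self._fetch = fetch_from_server
        self._entries = {}
        self.lookups_sent = 0

    def stat(self, gfid):
        # Only a cache miss costs a round trip to the server.
        if gfid not in self._entries:
            self.lookups_sent += 1
            self._entries[gfid] = self._fetch(gfid)
        return self._entries[gfid]

    def on_upcall_invalidate(self, gfid):
        # Called when the server reports that the inode changed.
        self._entries.pop(gfid, None)
```

Repeated `stat()` calls on the same gfid cost one round trip in total, but the cached answer stays stale until the invalidation arrives, which is why the reliability of upcalls matters so much here.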

Several gluster development engineers within the DHT, NFS, and Samba teams have been involved with related efforts, which have been underway for some time now. At this juncture, comments are requested from gluster developers. 

(1) .. help call out where additional upcalls would be needed to invalidate stale client cache entries (in particular, feedback is needed from the DHT/AFR areas),

(2) .. identify failure cases in which we cannot trust the contents of md-cache, e.g. when an upcall may have been dropped by the network,

(3) .. point out additional improvements that md-cache needs. For example, it cannot be allowed to grow unbounded.
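
On point (3), one common way to bound a cache is a fixed entry limit with least-recently-used eviction. The Python sketch below shows only that general technique; it is not a description of what md-cache does or will do, and `BoundedMetadataCache` and `max_entries` are hypothetical names.

```python
from collections import OrderedDict


class BoundedMetadataCache:
    """Toy LRU cache: holds at most max_entries items."""

    def __init__(self, max_entries):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._max = max_entries
        self._entries = OrderedDict()  # oldest entry first, newest last

    def get(self, gfid):
        if gfid not in self._entries:
            return None
        # A hit makes the entry the most recently used one.
        self._entries.move_to_end(gfid)
        return self._entries[gfid]

    def put(self, gfid, metadata):
        self._entries[gfid] = metadata
        self._entries.move_to_end(gfid)
        # Evict least recently used entries until we are within the limit.
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)

    def __len__(self):
        return len(self._entries)
```

An entry-count limit is the simplest choice; a limit on memory consumed would need per-entry size accounting, which this sketch leaves out.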

Dan

----- Original Message -----
> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> 
> List of areas where we need invalidation notification:
> 1. Any changes to xattrs used by xlators to store metadata (like dht layout
> xattr, afr xattrs etc).
> 2. Scenarios where an individual xlator decides it needs a lookup. For
> example, a failed directory creation on a non-hashed subvol in dht during
> mkdir. Though dht reports mkdir as successful, it would be better not to
> cache this inode, as a subsequent lookup will heal the directory.
> 3. Removal of files.
> 4. writev on brick (to invalidate read cache on client)
> 
> Other questions:
> 5. Does md-cache have cache management, like lru or an upper limit for cache?
> 6. Network disconnects and invalidating the cache. When a network
> disconnect happens, we need to invalidate the cache for inodes present on
> that brick, as we might have missed some notifications. The current
> approach of purging the cache of all inodes might not be optimal, as it
> might roll back the benefits of caching. Also, please note that network
> disconnects are not rare events.
> 
> regards,
> Raghavendra
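
Regarding point 6 in the quoted list, dropping only the entries that came from the disconnected brick can be sketched with a reverse index from brick to cached inodes. This is an illustrative Python sketch under a simplifying assumption, namely that each cached inode is attributed to exactly one brick; with AFR or DHT an inode may depend on several bricks. All names are hypothetical.

```python
from collections import defaultdict


class BrickAwareCache:
    """Toy cache that remembers which brick each entry came from."""

    def __init__(self):
        self._entries = {}                 # gfid -> (brick, metadata)
        self._by_brick = defaultdict(set)  # brick -> set of gfids

    def put(self, gfid, brick, metadata):
        old = self._entries.get(gfid)
        if old is not None:
            # The entry may have moved; drop it from the old brick's index.
            self._by_brick[old[0]].discard(gfid)
        self._entries[gfid] = (brick, metadata)
        self._by_brick[brick].add(gfid)

    def get(self, gfid):
        entry = self._entries.get(gfid)
        return None if entry is None else entry[1]

    def on_brick_disconnect(self, brick):
        # Notifications from this brick may have been missed, so distrust
        # only its entries and keep everything else cached.
        dropped = self._by_brick.pop(brick, set())
        for gfid in dropped:
            self._entries.pop(gfid, None)
        return len(dropped)
```

The cost is one extra index update per cache insert; in exchange, a disconnect from one brick leaves entries served by the other bricks intact.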