[Gluster-devel] md-cache changes and impact on tiering

Dan Lambright dlambrig at redhat.com
Tue Sep 6 20:31:59 UTC 2016



----- Original Message -----
> From: "Dan Lambright" <dlambrig at redhat.com>
> To: "Poornima Gurusiddaiah" <pgurusid at redhat.com>
> Cc: "Nithya Balachandran" <nbalacha at redhat.com>, "Gluster Devel" <gluster-devel at gluster.org>
> Sent: Sunday, August 28, 2016 10:01:36 AM
> Subject: Re: md-cache changes and impact on tiering
> 
> 
> 
> ----- Original Message -----
> > From: "Poornima Gurusiddaiah" <pgurusid at redhat.com>
> > To: "Dan Lambright" <dlambrig at redhat.com>, "Nithya Balachandran"
> > <nbalacha at redhat.com>
> > Cc: "Gluster Devel" <gluster-devel at gluster.org>
> > Sent: Tuesday, August 23, 2016 12:56:38 AM
> > Subject: md-cache changes and impact on tiering
> > 
> > Hi,
> > 
> > The basic patches for md-cache and its integration with cache-invalidation
> > are merged in master. You could try a master build and enable the
> > following settings, to see whether there is any impact on tiering
> > performance at all:
> > 
> > # gluster volume set <volname> performance.stat-prefetch on
> > # gluster volume set <volname> features.cache-invalidation on
> > # gluster volume set <volname> performance.cache-samba-metadata on
> > # gluster volume set <volname> performance.md-cache-timeout 600
> > # gluster volume set <volname> features.cache-invalidation-timeout 600
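> 
> As a quick sanity check (assuming a build that has the volume get CLI),
> the applied values can be read back per option, e.g.:
> 
> # gluster volume get <volname> performance.md-cache-timeout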
> 
> On the tests I run, this cut the number of LOOKUPs by about three orders
> of magnitude. Each saved LOOKUP eliminates a round trip over the network.
> 
> I'm running a "small file" performance test. It creates 16K 64-byte files
> in a seven-level directory tree, then reads each file twice.
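> 
> A minimal sketch of that kind of workload (paths and names are
> illustrative, not the actual test harness):
> 
> #!/bin/bash
> # Create 16K 64-byte files at the bottom of a 7-level directory tree,
> # then read each file back twice through the gluster mount.
> MNT=${1:-/mnt/tiervol}            # hypothetical mount point
> DIR=$MNT/d1/d2/d3/d4/d5/d6/d7     # seven directory levels
> mkdir -p "$DIR"
> for i in $(seq 1 16384); do
>     head -c 64 /dev/urandom > "$DIR/f$i"
> done
> for pass in 1 2; do
>     for i in $(seq 1 16384); do
>         cat "$DIR/f$i" > /dev/null
>     done
> done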
> 
> The configuration is HOT: 2 x 2 ramdisk, COLD: 2 x (8 + 4) disk; the
> network is 10000 Mb/s with a 9000 MTU. The number of LOOKUPs is a function
> of the number of directories and subvolumes: on each I/O the file is
> re-opened, and every directory in the path is laboriously rechecked for
> existence and permissions.
> 
> Without md-cache, these LOOKUPs used to be further propagated across every
> subvolume by DHT to obtain the entire layout, so it came to something on
> the order of 16K * 7 * 26 round trips across the network.
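> 
> (Taking 16K as 16,384 files, that is 16,384 * 7 * 26 = 2,981,888, i.e.
> roughly 3 million LOOKUP round trips for a single pass over the data set.)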
> 
> The counts are all visible with gluster volume profile.
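> 
> For example (a sketch; the exact output columns vary by version):
> 
> # gluster volume profile <volname> start
> (run the workload)
> # gluster volume profile <volname> info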

I'm going to have to retract the above comments. The optimization does not work well for me yet. 

If I follow the traces, something odd happens when the client sends a LOOKUP: the server sends an invalidation from the upcall translator's lookup fop callback, and from that point any future LOOKUPs for that entry are passed straight through to the server again. This defeats the purpose of using md-cache. Can you explain the reasoning behind it?

> 
> 
> > 
> > Note: the commands have to be executed in the order listed above.
> > 
> > Tracker bug: https://bugzilla.redhat.com/show_bug.cgi?id=1211863
> > Patches:
> > http://review.gluster.org/#/q/status:open+project:glusterfs+branch:master+topic:bug-1211863
> > 
> > Thanks,
> > Poornima
> > 
> 

