[Gluster-devel] lookup caching

Olivier Le Cam Olivier.LeCam at crdp.ac-versailles.fr
Sun Apr 4 08:53:19 UTC 2010


Raghavendra G wrote:
> 
> On Fri, Apr 2, 2010 at 3:32 PM, Olivier Le Cam 
> <Olivier.LeCam at crdp.ac-versailles.fr 
> <mailto:Olivier.LeCam at crdp.ac-versailles.fr>> wrote:
> 
>     Hi -
> 
>     I am evaluating glusterfs for a replacement of an NFS server which
>     acts as a backend storage for a webcluster, in order to take
>     advantage of its very interesting features in terms of
>     high-availability and scalability.
> 
>     That said, I'm experiencing (like everybody in the same situation)
>     performance issues due to the large number of small files a
>     webserver has to deal with.
> 
> 
>     The io-cache translator does not help so much in this situation
>     because (as far as I understood) the clients always have to check
>     the mtime of the target file before delivering it in order to know
>     if the cache is up-to-date. This intensive network traffic is quite
>     penalizing in terms of performance (especially on a Gb-E).
> 
> 
>     Following a recent discussion on the IRC channel, it came to my mind
>     that caching lookups could (in this particular situation) greatly
>     improve performance.
> 
> 
> If you are not very much concerned about a file being changed from other 
> clients while it is being cached, you can set the 'cache-timeout' value in 
> the io-cache configuration to some high value, thereby increasing the time 
> interval at which a stat call is sent to the server to check whether the 
> file has changed.
>  
> 
> 
>     I have studied the GlusterFS code carefully and TBH I haven't been
>     able to see how/where such a translator could be integrated.
> 
>     Would it be possible to get some help? Are other users/developers
>     already involved in such a development?
> 
> 
> If you are just interested in caching stats for the benefit of io-cache, 
> the same functionality can be achieved by tuning the cache-timeout value 
> in io-cache.

This is not the behaviour I would have expected according to the doc: "If the 
cached page for a file is greater than 'cache-timeout' seconds old, 
io-cache translator forces a re-validation of the page. However the 
cached page is verified against the mtime whenever possible and cache is 
refreshed. Default is 1 second."

AFAIU, mtime is always verified (whenever possible). As a non-native 
English speaker I might have misunderstood something, though!

Anyway, I have tested with a cache-timeout of 60 seconds on a client, 
cat'ing the same file several times while watching the debug traces and 
tcpdump. I have no idea which protocols are involved here, but there is 
always some traffic between the client and the servers, even when cat is 
issued within the cache interval.
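
For anyone who wants to repeat the test, here is a minimal sketch of the
read loop in Python. It only re-reads one file and times each pass; the
helper name time_reads and the example path are hypothetical, and the
timings alone do not show whether a stat went over the wire, so tcpdump
still has to run alongside.

```python
import time


def time_reads(path, count=5):
    """Read the whole file `count` times; return each read's duration in seconds."""
    durations = []
    for _ in range(count):
        start = time.perf_counter()
        with open(path, "rb") as handle:
            handle.read()
        durations.append(time.perf_counter() - start)
    return durations


# Example (hypothetical path on the glusterfs mount):
#   for seconds in time_reads("/mnt/glusterfs/index.html"):
#       print(f"{seconds * 1000:.3f} ms")
```

All reads in the loop should fall well inside a 60 second cache-timeout,
so any client/server traffic after the first read is what is in question
here.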

My guess is that the file is indeed cached by io-cache but that the client 
always stats the server before delivering the file (either from the 
cache or from glusterfs).
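
To make the two readings concrete, here is a toy model in Python. It is
not GlusterFS code: ToyServer, ToyCache and count_stats are hypothetical
names, and the model only counts how many stat calls would reach the
server under each interpretation of cache-timeout.

```python
class ToyServer:
    """Stands in for the remote server; counts the stat calls it receives."""

    def __init__(self, mtime=0):
        self.mtime = mtime
        self.stat_calls = 0

    def stat(self):
        self.stat_calls += 1
        return self.mtime


class ToyCache:
    """Client-side cache for one file, with two possible validation policies."""

    def __init__(self, server, cache_timeout, always_stat):
        self.server = server
        self.cache_timeout = cache_timeout
        self.always_stat = always_stat  # True: stat the server on every read
        self.validated_at = None        # time of the last validation, if any
        self.cached_mtime = None

    def read(self, now):
        fresh = (
            self.validated_at is not None
            and now - self.validated_at <= self.cache_timeout
        )
        if self.always_stat or not fresh:
            # Re-validate: ask the server for the mtime and record it.
            self.cached_mtime = self.server.stat()
            self.validated_at = now


def count_stats(always_stat, reads=10, cache_timeout=60):
    """Issue one read per second and return how many stats hit the server."""
    server = ToyServer()
    cache = ToyCache(server, cache_timeout, always_stat)
    for second in range(reads):
        cache.read(now=second)
    return server.stat_calls


print("stat only after timeout:", count_stats(always_stat=False))
print("stat on every read:     ", count_stats(always_stat=True))
```

With ten reads inside a 60 second window, the first policy sends a single
stat and the second sends ten. The traffic I see looks like the second
case, but the model says nothing about what io-cache really does.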

Debug log attached.

Kind regards,
-- 
Olivier

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: mnt.log
URL: <http://supercolony.gluster.org/pipermail/gluster-devel/attachments/20100404/a1ccd554/attachment-0003.ksh>

