[Gluster-users] Gluster 3.6.3 performance.cache-size not working as expected in some cases
Mathieu Chateau
mathieu.chateau at lotp.fr
Wed Sep 2 08:46:04 UTC 2015
Hello,
What I can say from my limited knowledge of Gluster:
- If each server mounts the volume from itself (its own name as the server in
fstab), it should only ask the other nodes for metadata and read the file
locally from its own brick. Use backupvolfile-server to provide an alternate
server in case that one is unavailable at boot (see the sketch after this list)
- Did you disable the atime flag? The file may get that attribute updated
on each read, which could invalidate the cache (just a guess)
- If you have a lot of small files and good server performance, you can
increase the number of I/O threads via performance.io-thread-count
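
A rough sketch of those points (server names, mount point and thread count are
only example values, adjust for your setup):

  # /etc/fstab on server1: mount the DOCROOT volume from itself, with a fallback
  # (noatime here is for the atime point above, assuming your mount honors it)
  server1:/DOCROOT  /mnt/docroot  glusterfs  defaults,_netdev,noatime,backupvolfile-server=server2  0 0

  # raise the I/O thread count (32 is just an example value)
  gluster volume set DOCROOT performance.io-thread-count 32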
You also have these settings that control which files do or do not get cached:
performance.cache-max-file-size 0
performance.cache-min-file-size 0
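
For example (the 2MB limit is only illustrative; 0 means no limit):

  gluster volume set DOCROOT performance.cache-max-file-size 2MB
  gluster volume set DOCROOT performance.cache-min-file-size 0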
just my 2cents
Cordialement,
Mathieu CHATEAU
http://www.lotp.fr
2015-09-01 21:15 GMT+02:00 Christian Rice <crice at pandora.com>:
> This is still an issue for me, I don’t need anyone to tear the code apart,
> but I’d be grateful if someone would even chime in and say “yeah, we’ve
> seen that too.”
>
> From: Christian Rice <crice at pandora.com>
> Date: Sunday, August 30, 2015 at 11:18 PM
> To: "gluster-users at gluster.org" <gluster-users at gluster.org>
> Subject: [Gluster-users] Gluster 3.6.3 performance.cache-size not working
> as expected in some cases
>
> I am confused about my caching problem. I’ll try to keep this as
> straightforward as possible and include the basic details...
>
> I have a sixteen node distributed volume, one brick per node, XFS
> isize=512, Debian 7/Wheezy, 32GB RAM minimally. Every brick node is also a
> gluster client, and also importantly an HTTP server. We use a back-end
> 1GbE network for gluster traffic (eth1). There are a couple dozen gluster
> client-only systems accessing this volume, as well.
>
> We had a really hot spot on one brick due to an oft-requested file, and
> every time any httpd process on any gluster client was asked to deliver the
> file, it was physically fetching it (we could see this traffic using, say,
> ‘iftop -i eth1’), so we thought to increase the volume cache timeout and
> cache size. We set the following values for testing:
>
> performance.cache-size: 16GB
> performance.cache-refresh-timeout: 30
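>
> For reference, these were applied with the standard volume-set commands,
> along these lines:
>
>   gluster volume set DOCROOT performance.cache-size 16GB
>   gluster volume set DOCROOT performance.cache-refresh-timeout 30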
>
> This test was run from a node that didn’t have the requested file on the
> local brick:
>
> while(true); do cat /path/to/file > /dev/null; done
>
> and what had been very high traffic on the gluster backend network,
> delivering the data repeatedly to my requesting node, dropped to nothing
> visible.
>
> I thought good, problem fixed. Caching works. My colleague had run a
> test early on to show this perf issue, so he ran it again to sign off.
>
> His testing used curl, because all the real front end traffic is HTTP, and
> all the gluster nodes are web servers, which are of course using the fuse
> mount to access the document root. Even with our performance tuning, the
> traffic on the gluster backend subnet was continuous and undiminished. I
> saw no evidence of caching (again using ‘iftop -i eth1’, which showed a
> steady 75+% of line rate on the 1GbE link).
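>
> His test was essentially the HTTP equivalent of the cat loop above, something
> like this (hostname and path are placeholders):
>
>   while true; do curl -s -o /dev/null http://gluster-node.example/path/to/file; done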
>
> Does that make sense at all? We had theorized that we wouldn’t get to use
> VFS/kernel page cache on any node except maybe the one which held the data
> in the local brick. That’s what drove us to set the gluster
> performance cache. But it doesn’t seem to come into play with HTTP access.
>
>
> Volume info:
> Volume Name: DOCROOT
> Type: Distribute
> Volume ID: 3aecd277-4d26-44cd-879d-cffbb1fec6ba
> Status: Started
> Number of Bricks: 16
> Transport-type: tcp
> Bricks:
> <snipped list of bricks>
> Options Reconfigured:
> performance.cache-refresh-timeout: 30
> performance.cache-size: 16GB
>
> The net result of being overwhelmed by a hot spot is that all the gluster
> client nodes lose access to the gluster volume: it becomes so busy it
> hangs. When the traffic goes away (load balancers see the failing health
> checks and redirect requests elsewhere), the volume eventually unfreezes
> and life goes on.
>
> I wish I could type ALL that into a google query and get a lucid answer :)
>
> Regards,
> Christian
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>