[Gluster-users] Gluster 3.6.3 performance.cache-size not working as expected in some cases
Mathieu Chateau
mathieu.chateau at lotp.fr
Wed Sep 2 08:46:04 UTC 2015
Hello,
What I can say from my limited knowledge of Gluster:
- If each server mounts the volume from itself (its own name as the server in
fstab), it should only ask the other nodes for metadata and read the file
locally from its own brick. Use backupvolfile-server to provide an alternate
server in case that one is unavailable at boot (see the sketch after this list)
- Did you disable the atime flag? The file may get that attribute updated
on each read, which could invalidate the cache (just a guess)
- If you have a lot of small files and good server performance, you can
increase the number of I/O threads via performance.io-thread-count
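
A rough sketch of those points (server names, mount point and thread count are
only example values, adjust for your setup):

  # /etc/fstab on server1: mount the DOCROOT volume from itself, with a fallback
  # (noatime here is for the atime point above, assuming your mount honors it)
  server1:/DOCROOT  /mnt/docroot  glusterfs  defaults,_netdev,noatime,backupvolfile-server=server2  0 0

  # raise the I/O thread count (32 is just an example value)
  gluster volume set DOCROOT performance.io-thread-count 32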
You also have these settings that control which files do or do not get cached:
performance.cache-max-file-size 0
performance.cache-min-file-size 0
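
For example (the 2MB limit is only illustrative; 0 means no limit):

  gluster volume set DOCROOT performance.cache-max-file-size 2MB
  gluster volume set DOCROOT performance.cache-min-file-size 0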
just my 2cents
Cordialement,
Mathieu CHATEAU
http://www.lotp.fr
2015-09-01 21:15 GMT+02:00 Christian Rice <crice at pandora.com>:
> This is still an issue for me, I don’t need anyone to tear the code apart,
> but I’d be grateful if someone would even chime in and say “yeah, we’ve
> seen that too.”
>
> From: Christian Rice <crice at pandora.com>
> Date: Sunday, August 30, 2015 at 11:18 PM
> To: "gluster-users at gluster.org" <gluster-users at gluster.org>
> Subject: [Gluster-users] Gluster 3.6.3 performance.cache-size not working
> as expected in some cases
>
> I am confused about my caching problem. I’ll try to keep this as
> straightforward as possible and include the basic details...
>
> I have a sixteen node distributed volume, one brick per node, XFS
> isize=512, Debian 7/Wheezy, 32GB RAM minimally. Every brick node is also a
> gluster client, and also importantly an HTTP server. We use a back-end
> 1GbE network for gluster traffic (eth1). There are a couple dozen gluster
> client-only systems accessing this volume, as well.
>
> We had a really hot spot on one brick due to an oft-requested file, and
> every time any httpd process on any gluster client was asked to deliver the
> file, it was physically fetching it (we could see this traffic using, say,
> ‘iftop -i eth1’), so we thought to increase the volume cache timeout and
> cache size. We set the following values for testing:
>
> performance.cache-size: 16GB
> performance.cache-refresh-timeout: 30
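>
> For reference, these were applied with the standard volume-set commands,
> along these lines:
>
>   gluster volume set DOCROOT performance.cache-size 16GB
>   gluster volume set DOCROOT performance.cache-refresh-timeout 30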
>
> This test was run from a node that didn’t have the requested file on the
> local brick:
>
> while(true); do cat /path/to/file > /dev/null; done
>
> and what had been very high traffic on the gluster backend network,
> delivering the data repeatedly to my requesting node, dropped to nothing
> visible.
>
> I thought good, problem fixed. Caching works. My colleague had run a
> test early on to show this perf issue, so he ran it again to sign off.
>
> His testing used curl, because all the real front end traffic is HTTP, and
> all the gluster nodes are web servers, which are of course using the fuse
> mount to access the document root. Even with our performance tuning, the
> traffic on the gluster backend subnet was continuous and undiminished. I
> saw no evidence of caching (again using ‘iftop -i eth1’, which showed a
> steady 75+% of line rate on the 1GbE link).
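>
> His test was essentially the HTTP equivalent of the cat loop above, something
> like this (hostname and path are placeholders):
>
>   while true; do curl -s -o /dev/null http://gluster-node.example/path/to/file; done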
>
> Does that make sense at all? We had theorized that we wouldn’t get to use
> VFS/kernel page cache on any node except maybe the one which held the data
> in the local brick. That’s what drove us to set the gluster
> performance cache. But it doesn’t seem to come into play with HTTP access.
>
>
> Volume info:
> Volume Name: DOCROOT
> Type: Distribute
> Volume ID: 3aecd277-4d26-44cd-879d-cffbb1fec6ba
> Status: Started
> Number of Bricks: 16
> Transport-type: tcp
> Bricks:
> <snipped list of bricks>
> Options Reconfigured:
> performance.cache-refresh-timeout: 30
> performance.cache-size: 16GB
>
> The net result of being overwhelmed by a hot spot is that all the gluster
> client nodes lose access to the gluster volume: it becomes so busy it
> hangs. When the traffic goes away (load balancers see the failing health
> checks and redirect requests elsewhere), the volume eventually unfreezes
> and life goes on.
>
> I wish I could type ALL that into a google query and get a lucid answer :)
>
> Regards,
> Christian
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>