[Gluster-users] Gluster 3.6.3 performance.cache-size not working as expected in some cases
Raghavendra Bhat
rabhat at redhat.com
Wed Sep 2 07:15:37 UTC 2015
Hi Christian,
I have been working on this for the last couple of days, but I have not
been able to recreate the issue. I will keep trying to recreate it and
get back to you in a day or two.
Regards,
Raghavendra Bhat
On 09/02/2015 12:45 AM, Christian Rice wrote:
> This is still an issue for me. I don't need anyone to tear the code
> apart, but I'd be grateful if someone would even chime in and say
> "yeah, we've seen that too."
>
> From: Christian Rice <crice at pandora.com>
> Date: Sunday, August 30, 2015 at 11:18 PM
> To: "gluster-users at gluster.org" <gluster-users at gluster.org>
> Subject: [Gluster-users] Gluster 3.6.3 performance.cache-size not
> working as expected in some cases
>
> I am confused about my caching problem. I'll try to keep this as
> straightforward as possible and include the basic details...
>
> I have a sixteen-node distributed volume, one brick per node, XFS
> (isize=512), Debian 7/Wheezy, at least 32GB RAM per node. Every brick
> node is also a gluster client and, importantly, an HTTP server. We use
> a back-end 1GbE network (eth1) for gluster traffic. There are also a
> couple dozen gluster client-only systems accessing this volume.
>
> We had a really hot spot on one brick due to an oft-requested file:
> every time an httpd process on any gluster client was asked to deliver
> the file, it physically fetched it over the wire (we could see this
> traffic with, say, 'iftop -i eth1'), so we thought to increase the
> volume cache timeout and cache size. We set the following values for
> testing:
>
> performance.cache-size: 16GB
> performance.cache-refresh-timeout: 30
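>
> (For reference, a sketch of how these were applied with the standard
> gluster CLI; DOCROOT is the volume name from the info further down:)
>
> # set io-cache size and refresh window on the volume
> gluster volume set DOCROOT performance.cache-size 16GB
> gluster volume set DOCROOT performance.cache-refresh-timeout 30
> # confirm the options took effect
> gluster volume info DOCROOT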
>
> This test was run from a node that didn't have the requested file on
> its local brick:
>
> while true; do cat /path/to/file > /dev/null; done
>
> and what had been very high traffic on the gluster backend network,
> delivering the data repeatedly to my requesting node, dropped to
> nothing visible.
>
> I thought good, problem fixed. Caching works. My colleague had run a
> test early on to show this perf issue, so he ran it again to sign off.
>
> His testing used curl, because all the real front-end traffic is HTTP,
> and all the gluster nodes are web servers, which of course use the
> fuse mount to access the document root. Even with our performance
> tuning, the traffic on the gluster backend subnet was continuous and
> undiminished. I saw no evidence of caching (again using 'iftop -i
> eth1', which showed a steady 75+% of line rate on the 1GbE link).
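>
> (His loop was essentially the HTTP analogue of the cat test; the
> hostname and path below are placeholders, not our real URLs:)
>
> # hypothetical curl equivalent of the cat loop, hitting the local httpd
> while true; do curl -s -o /dev/null http://web-node/path/to/file; done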
>
> Does that make sense at all? We had theorized that we wouldn't get to
> use the VFS/kernel page cache on any node except perhaps the one which
> held the data on its local brick. That's what drove us to set the
> gluster performance cache. But it doesn't seem to come into play with
> HTTP access.
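>
> (One way to probe that theory, sketched under the assumption that the
> client kernel is caching fuse reads: flush the kernel caches and watch
> eth1 while re-running the read loop. drop_caches empties the page
> cache and dentries/inodes, but not gluster's own io-cache:)
>
> sync
> echo 3 > /proc/sys/vm/drop_caches   # as root on the client
> # if the first read after this shows backend traffic and later reads
> # don't, the kernel cache (or the io-cache refresh window) is at work
> while true; do cat /path/to/file > /dev/null; done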
>
>
> Volume info:
> Volume Name: DOCROOT
> Type: Distribute
> Volume ID: 3aecd277-4d26-44cd-879d-cffbb1fec6ba
> Status: Started
> Number of Bricks: 16
> Transport-type: tcp
> Bricks:
> <snipped list of bricks>
> Options Reconfigured:
> performance.cache-refresh-timeout: 30
> performance.cache-size: 16GB
>
> The net result of being overwhelmed by a hot spot is that all the
> gluster client nodes lose access to the gluster volume: it becomes so
> busy it hangs. When the traffic goes away (failing health checks cause
> the load balancers to redirect requests elsewhere), the volume
> eventually unfreezes and life goes on.
>
> I wish I could type ALL that into a Google query and get a lucid answer :)
>
> Regards,
> Christian
>
>