[Gluster-users] client-side cpu usage, performance issue

Wed Dec 9 06:07:24 UTC 2009

> I hear you, I had followed this article specifically in an attempt to 
> improve performance.  I suppose I'm hoping for more specifics on how 
> these values correspond to an application, a number of cpu's, a brand of 
> network card, etc.  io-threads counts, for example, only seem to drive 
> load average higher, as they all sit there chewing up cpu anyway, so you 
> lower them and get lower overall system load but higher latency.  But 
> why would the glusterfs process need cpu time anyway?
>
> John
IMHO adding new translators here is the wrong way to solve the problem,
as it could be related to the high memory usage of the client, which
I reported earlier. My setup has no extra caches and no io-threads, just a
plain unify-over-replicate configuration. I also observed high CPU load and
high latency, but concentrated on the memory usage. The same behavior occurred
with plain striping, so it seems to be setup-independent.
I suspect aggressive caching of some data describing every file
touched by the glusterfs client. In fact, this data seems to be kept forever:
the memory is never freed, and no new allocations are made when the same
files are accessed a second time. It is enough to run ls -R or du
on a big directory tree and watch top to see the memory usage of the glusterfs
client climb steadily to hundreds of megabytes. John, do you see it too?
Of course, lookups in such a cache become expensive, hence the high CPU load.
I tried to find it in the code, but without success :(.
Dear developers, wouldn't it be better to forget everything about a file
once it has been closed? If you are overloaded with other work, just tell me
where to look in the sources.
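
For anyone who wants to check the memory growth described above on their own
mount, here is a rough sketch in Python. It assumes a Linux client, where
/proc/<pid>/status has a VmRSS line. The script name rss_walk.py, the function
names, and the choice of two passes are made up for illustration and are not
taken from glusterfs. It stats every entry under the mount point (roughly what
ls -R or du does) and prints the resident memory of the client process before
and after each pass. If the behavior is as described, the first pass should
show growth and the second pass little or none.

```python
import os
import sys


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     123456 kB".
            return int(line.split()[1])
    return None


def read_rss_kb(pid):
    """Read the resident set size of a process (Linux /proc only)."""
    with open("/proc/%d/status" % pid) as f:
        return parse_vmrss_kb(f.read())


def touch_tree(root):
    """lstat every entry below root and return how many were touched."""
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))
                count += 1
            except OSError:
                # Entries may vanish or be unreadable; skip them.
                pass
    return count


def measure(pid, root, passes=2):
    """Walk root several times, recording client RSS around each pass."""
    results = []
    for _ in range(passes):
        before = read_rss_kb(pid)
        entries = touch_tree(root)
        after = read_rss_kb(pid)
        results.append((entries, before, after))
    return results


if __name__ == "__main__":
    if len(sys.argv) == 3:
        pid, root = int(sys.argv[1]), sys.argv[2]
        for i, (n, before, after) in enumerate(measure(pid, root), 1):
            print("pass %d: %d entries, RSS %s kB -> %s kB"
                  % (i, n, before, after))
    else:
        print("usage: rss_walk.py <glusterfs-client-pid> <mountpoint>")
```

Run it with the pid of the glusterfs client process and the mount point as
arguments. It only reports numbers; it says nothing about which structure
inside the client is holding the memory.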

Krzysztof

BTW, the glusterfs version is 2.0.8.