Pranith Kumar Karampuri
Sat Nov 22 17:10:08 UTC 2014

On 11/22/2014 10:29 PM, Kyle Harris wrote:
> Hello,
> I have an issue with a 3-node replicated cluster.  The issue started 
> after a reboot a while back.  The top command would show the glusterfs 
> and glusterfsd processes eating up almost all the resources on all 
> three nodes of the cluster, so much so that it would not run the web 
> sites that are hosted on it.  The httpd processes would begin to hang.  
> I finally decided to tear down the cluster and rebuild it from the 
> ground up.  I did so and then copied all the data back which took all 
> night due to the amount of data.  All was well during that entire copy 
> process back to the cluster with no resource spikes.
> I should note that this cluster is home to many Apache/PHP based web 
> sites.  The problem starts again, however, the minute I point traffic 
> back to the sites on the cluster.  Before pointing traffic to it, all 
> is fine but as soon as the traffic begins to hit it, the utilization 
> again begins to spike.  Note that all the sites run just fine when 
> hosted from a standard EXT4 partition.  I noticed another thread 
> labeled "glusterfsd process thrashing CPU" where Pranith asks if the 
> user has directories with lots of files and I do.
> Here are some other details of my cluster:
> - OS:  CentOS 6.6 with all updates on all 3 nodes as of 11-22-2014
> - All 3 nodes have 8 cores with 16 GB of RAM
> - Nodes are all formatted with EXT4
> - All three nodes also have the file systems mounted on them for use 
> with Apache.  I have experimented with both NFS and Fuse mounts and it 
> doesn't seem to make a difference which I use for this particular 
> problem.  I am currently using Fuse.
> - Approximately 135 GB of data.  Some deep directories with many small 
> files.
> - No optimization or changes have been made to the cluster . . . it is 
> running with default options
> - Gluster version 3.6.1-1 installed from RPMs
> - Note the issue originally occurred on version 3.5.2 but I updated 
> before rebuilding it in hopes that would fix it (it didn't)
> Can anyone give me guidance on how to tackle this problem? I am hoping 
> perhaps Pranith can explain why he asked about directories with 
> many files and how to proceed given my situation.  I know others have 
> commented about having many small files with regard to performance but 
> when the processors are not spiked, performance has been acceptable.  
> Any help would be greatly appreciated.
       3.6.1 has a problem on EXT4 because of 64-bit offsets. The afr-v2 
implementation introduced this problem. We thought the following patch 
had been merged, but it was not :-( http://review.gluster.com/8201. Please don't 
use 3.6.1 with EXT4.

       Please merge http://review.gluster.com/8201
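
To check whether a brick is affected by the EXT4 advice above, something like the following could be used. It is only a rough sketch: the function name and the brick path /export/brick1 are made up for illustration, and it assumes the usual /proc/mounts layout (device, mount point, filesystem type, ...). It does not decode escaped characters such as \040 in mount points.

```python
import os

def fstype_for_path(path, mounts_text):
    """Return the filesystem type of the mount containing `path`.

    `mounts_text` is the content of /proc/mounts (or text in the same
    format). Returns None if no mount entry matches.
    """
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        mount_point, fstype = fields[1], fields[2]
        # A mount contains the path if it is the path itself or a parent of it.
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest mount point; on ties the later entry wins,
            # since later mounts shadow earlier ones.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fstype
    return best_type
```

On a node, reading /proc/mounts and calling fstype_for_path("/export/brick1", text) with the real brick directory would return "ext4" for a brick on EXT4; in that case 3.6.1 should be avoided until the patch is in.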

