[Gluster-users] glusterfs and glusterfsd process utilization extremely high

Prashanth Pai ppai at redhat.com
Fri Dec 19 04:24:58 UTC 2014


Reminded me of this, FYI:
https://github.com/jdarcy/negative-lookup
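
For readers unfamiliar with the term, the general idea of caching negative lookups can be sketched outside of Gluster as below. This is only an illustration of the concept, not the code from the repository above; the class name, the `ttl` default, and the injectable `exists`/`clock` parameters are all assumptions made for the example.

```python
import os
import time


class NegativeLookupCache:
    """Remember paths that were recently found missing, for `ttl` seconds.

    Hypothetical helper for illustration only; it is not part of Gluster.
    """

    def __init__(self, ttl=5.0, exists=os.path.exists, clock=time.monotonic):
        self.ttl = ttl
        self._exists = exists      # the (possibly expensive) real lookup
        self._clock = clock
        self._missing = {}         # path -> time at which the cached miss expires

    def exists(self, path):
        now = self._clock()
        expires = self._missing.get(path)
        if expires is not None and now < expires:
            # Answered from the cache: no lookup reaches the filesystem.
            return False
        if self._exists(path):
            self._missing.pop(path, None)
            return True
        # Record the miss so repeated requests within `ttl` stay cheap.
        self._missing[path] = now + self.ttl
        return False
```

The trade-off is staleness: a file created shortly after a miss was cached will still be reported as missing until the entry expires, so the TTL would need to be kept short.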


Regards,
 -Prashanth Pai

----- Original Message -----
From: "Kyle Harris" <kyle.harris98 at gmail.com>
To: "Krutika Dhananjay" <kdhananj at redhat.com>
Cc: gluster-users at gluster.org
Sent: Thursday, December 18, 2014 7:09:31 PM
Subject: Re: [Gluster-users] glusterfs and glusterfsd process utilization extremely high

Hi Krutika and thank you for the quick response. I think I found the problem and it was hiding in the logs the whole time. However, I'm still glad I started this thread as it might help someone else and furthermore I still have a question about it. 

I discovered many entries similar to the following in the gluster mount log:


12-18 02:41:23.557523] I [dht-common.c:1822:dht_lookup_cbk] 0-gv0-dht: Entry /html/some_site/some_folder/asdf.php missing on subvol gv0-replicate-0 




Because this log entry appeared to be purely informational, I didn't pay much attention to it. However, I began to notice many of them for one particular site that is hosted on this cluster. I finally decided to remove that site temporarily from the cluster and, much to my surprise AND delight, the problem went away!
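
For anyone wanting to check their own logs the same way, a minimal sketch of tallying these entries per site might look like the following. It assumes the message format matches the line quoted above; the function name and the `depth` parameter are made up for this example, and the log file path is whatever client mount log you are using.

```python
import re
import sys
from collections import Counter

# Matches the informational DHT message quoted above, e.g.
# "... Entry /html/some_site/some_folder/asdf.php missing on subvol gv0-replicate-0"
MISSING_RE = re.compile(r"Entry (\S+) missing on subvol (\S+)")


def count_missing_by_prefix(lines, depth=2):
    """Count 'missing on subvol' entries, grouped by the first `depth` path components."""
    counts = Counter()
    for line in lines:
        match = MISSING_RE.search(line)
        if match is None:
            continue  # not a missing-entry message
        parts = match.group(1).strip("/").split("/")
        counts["/" + "/".join(parts[:depth])] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path of the mount log as the first argument.
    with open(sys.argv[1], errors="replace") as log:
        for prefix, n in count_missing_by_prefix(log).most_common(10):
            print(n, prefix)
```

A single prefix dominating the counts would point to one site generating most of the failed lookups, which is what happened here.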




After much research, it appears that requesting a file that is not present on a gluster drive is an expensive operation in terms of resource utilization, and that was causing my problem. Obviously the solution is to have the developers fix the issues on the site, but it does bring up another question.




What happens when I have a site hosted on a gluster drive and a user or link points to an incorrect URL on that site, and thus to a file that doesn't exist? Obviously that would have to happen many times in order to be a problem, but on a busy site the potential exists for a denial of service.




So my new question is this. How can this be mitigated from gluster such that missing files do not cause such an issue? 




Thank you again for any assistance. 




Kyle 

On Wed, Dec 17, 2014 at 9:51 PM, Krutika Dhananjay < kdhananj at redhat.com > wrote: 







From: "Kyle Harris" < kyle.harris98 at gmail.com > 
To: gluster-users at gluster.org 
Sent: Thursday, December 18, 2014 4:47:35 AM 
Subject: [Gluster-users] glusterfs and glusterfsd process utilization extremely high 

This is a continuation of a problem that I posted about last month and that I am still experiencing. The original post with more detail can be found at http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019587.html . To sum up my problem, I have a freshly created 3-node replicated cluster. It contains roughly 135 GB of files, many of which are small. It is home to several web sites hosted with Apache. I am using Gluster version 3.6.1-1 installed from RPMs, mounted from the server via the FUSE client (I have tried NFS but it makes no difference).

When I last posted about the problem of extreme processor utilization, the solution I was given by Parnith was to use a file system other than EXT4 and to turn off cluster.entry-self-heal. I am now using XFS, cluster.entry-self-heal is turned off, and I even turned off cluster.self-heal-daemon, but it made absolutely no difference. All is fine during the entire time the cluster is loaded via rsync; however, the minute I point Apache traffic at the sites hosted on the cluster, glusterfs and glusterfsd begin to climb to levels so high that in a matter of minutes it is not even possible to log on to the system. No modifications have been made to any of the other Gluster settings.

Any additional help resolving this matter would be greatly appreciated.

Hello,

First of all, do the logs suggest anything useful? 

Could you perform the following steps while the I/O is going on (this is assuming the nodes are not thrashed to the extent that it is impossible to execute these commands): 

1) On the shell, on one of the nodes in the cluster, execute `gluster volume profile <volname> start` 
Wait for a minute or two, then execute `gluster volume profile <volname> info` and collect its output.
Wait for another minute or so, execute `gluster volume profile <volname> info` again, collect that output too, and share both.
You can stop the profiling once you are done using `gluster volume profile <volname> stop`. 
2) Assuming it is the brick processes (glusterfsd) that are showing high CPU utilisation, is it possible to get a core dump of the processes when this is happening?
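
The profiling sequence in step 1 could be scripted roughly as follows. This is a sketch only: it runs exactly the `gluster volume profile` commands listed above, while the function names, the 90-second default wait, and the injectable `run`/`sleep` parameters are assumptions made for the example. The volume name is whatever your volume is called (for instance `gv0`, judging by the log prefix elsewhere in this thread).

```python
import subprocess
import time


def profile_commands(volname):
    """Return the gluster CLI invocations from step 1, in order."""
    base = ["gluster", "volume", "profile", volname]
    return [base + ["start"], base + ["info"], base + ["info"], base + ["stop"]]


def collect_profile(volname, wait_seconds=90, run=subprocess.run, sleep=time.sleep):
    """Start profiling, take two `info` samples, then stop.

    Returns the text output of the two `info` calls.
    """
    start, info1, info2, stop = profile_commands(volname)
    run(start, check=True, capture_output=True, text=True)
    samples = []
    try:
        for cmd in (info1, info2):
            sleep(wait_seconds)  # let some I/O accumulate before sampling
            result = run(cmd, check=True, capture_output=True, text=True)
            samples.append(result.stdout)
    finally:
        # Stop profiling even if one of the info calls failed.
        run(stop, check=True, capture_output=True, text=True)
    return samples
```

Calling `collect_profile("gv0")` on one of the nodes while the load is running would return the two outputs to attach to a reply.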

-Krutika 




-- 
Regards, 

Kyle 


_______________________________________________ 
Gluster-users mailing list 
Gluster-users at gluster.org 
http://supercolony.gluster.org/mailman/listinfo/gluster-users 



-- 
Kyle A. Harris 
Kyle at TheHarrisHome.com 
615-364-6752 

