[Gluster-users] glusterfsd process thrashing CPU

Tue Nov 18 09:06:19 UTC 2014

On 11/18/2014 01:17 PM, Lindsay Mathieson wrote:
> On 18 November 2014 17:40, Pranith Kumar Karampuri <pkarampu at redhat.com> wrote:
>> Sorry didn't see this one. I think this is happening because of 'diff' based
>> self-heal which does full file checksums, that I believe is the root cause.
>> Could you execute 'gluster volume set <volname>
>> cluster.data-self-heal-algorithm full' to prevent this issue in future. But
>> this option will be effective for the new self-heals that will be triggered
>> after the execution of the command. The ongoing ones will still use the old
>> mode of self-heal.
> Thanks, makes sense.
>
> However given the files are tens of GB in size, won't it thrash my network?
Yes, you are right. I wonder why network thrashing has never been
reported until now.
+Joejulian, who also uses VMs on gluster (for 5 years now?). He uses this
full self-heal option (that's what I saw in his bug reports).
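
To make the CPU versus network trade-off concrete, here is a rough sketch of
the two strategies. It is not gluster's actual code: the block size, the
function names and the use of SHA-256 as the checksum are all assumptions
made for illustration. 'diff' checksums every block on both copies and sends
only the blocks that differ, while 'full' skips the checksums and sends the
whole file.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical; tiny so the example is easy to follow


def blocks(data, size=BLOCK_SIZE):
    """Split a bytes object into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def heal_diff(source, sink):
    """'diff'-style heal: checksum each block pair, send only mismatches.

    Returns (checksums_computed, bytes_sent).
    """
    sink_blocks = blocks(sink)
    checksums = 0
    sent = 0
    for i, src in enumerate(blocks(source)):
        dst = sink_blocks[i] if i < len(sink_blocks) else b""
        checksums += 2  # one checksum on each side: this is the CPU cost
        if hashlib.sha256(src).digest() != hashlib.sha256(dst).digest():
            sent += len(src)  # only differing blocks cross the network
    return checksums, sent


def heal_full(source, sink):
    """'full'-style heal: no checksums, the whole file crosses the network."""
    return 0, len(source)
```

With a 16-byte "image" in which one 4-byte block differs, heal_diff computes
8 checksums and sends 4 bytes, while heal_full computes none and sends all
16 bytes. Scale that up to tens of GB and the two options trade CPU load
for network load.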

I still need to think about how best to solve this problem.

Let me tell you a bit more about this issue.
There are two processes which heal the VM images:
1) Self-heal daemon. 2) Mount process.
The self-heal daemon heals one VM image at a time. But the mount process
triggers self-heals for all the open files (a VM image is nothing but an
open file from the filesystem's perspective) when a brick goes down and
comes back up. So we need to come up with a scheme to throttle self-heals
on the mount point to prevent this issue. I will update you as soon as I
come up with a fix. This should not be hard to do; I just need some time to
choose the best approach. Thanks a lot for bringing up this issue.
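
One possible shape for such throttling, as a minimal sketch only: cap the
number of heals that run at the same time with a counting semaphore. This is
not the fix itself and not gluster code. HealThrottle, heal_all,
MAX_CONCURRENT_HEALS and the heal_fn callback are hypothetical names, and
the limit of 2 is an arbitrary choice.

```python
import threading

MAX_CONCURRENT_HEALS = 2  # hypothetical limit, not a gluster option


class HealThrottle:
    """Allow at most `limit` heals to run at once; the rest wait their turn."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self.active = 0  # heals running right now
        self.peak = 0    # highest concurrency observed

    def heal(self, path, heal_fn):
        with self._slots:  # blocks while `limit` heals are already running
            with self._lock:
                self.active += 1
                self.peak = max(self.peak, self.active)
            try:
                heal_fn(path)
            finally:
                with self._lock:
                    self.active -= 1


def heal_all(paths, heal_fn, limit=MAX_CONCURRENT_HEALS):
    """Trigger a heal for every open file, as the mount process does,
    but through the throttle. Returns the peak concurrency reached."""
    throttle = HealThrottle(limit)
    threads = [
        threading.Thread(target=throttle.heal, args=(path, heal_fn))
        for path in paths
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return throttle.peak
```

With a limit of 1 this behaves like the self-heal daemon (one image at a
time). Without a limit it behaves like the mount process does today, where
every open VM image starts healing as soon as the brick comes back up.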

Pranith