[Bugs] [Bug 1166063] New: diff heal makes the system unusable
bugzilla at redhat.com
Thu Nov 20 10:58:53 UTC 2014
https://bugzilla.redhat.com/show_bug.cgi?id=1166063
Bug ID: 1166063
Summary: diff heal makes the system unusable
Product: GlusterFS
Version: mainline
Component: replicate
Assignee: bugs at gluster.org
Reporter: pkarampu at redhat.com
CC: bugs at gluster.org, gluster-bugs at redhat.com
Description of problem:
Here is the mail sent by Lindsay Mathieson:
2 Node replicate setup,
Everything has been stable for days until I had occasion to reboot
one of the nodes. Since then (past hour) glusterfsd has been pegging
the CPU(s), utilization ranging from 1% to 1000%!
On average it's around 500%.
This is a VM server, so there are only 27 VM images for a total of
800GB. It's an Intel E5-2620 (12 Cores) with 32GB ECC RAM.
- What does glusterfsd do?
- What can I do to fix this?
thanks,
------------------------
We found that the root cause is that the mount started self-heal on all the
VM images at once. These are diff self-heals, so computing the checksums
consumes a lot of CPU on the bricks, which led to the issue. We need a way to
throttle the number of parallel self-heals.
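
To illustrate why diff self-heal is CPU-bound, here is a minimal sketch of the general idea: checksum each block on both copies and copy only the blocks that differ. It is not the GlusterFS implementation. The names `block_checksums` and `diff_heal`, the 128 KiB block size, and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 128 * 1024  # assumed block size, for illustration only


def block_checksums(data, block_size=BLOCK_SIZE):
    """Checksum every block of the file; this is the CPU-heavy part."""
    return [
        hashlib.sha256(data[off:off + block_size]).digest()
        for off in range(0, len(data), block_size)
    ]


def diff_heal(source, sink, block_size=BLOCK_SIZE):
    """Bring `sink` in line with `source`, copying only mismatched blocks.

    Returns the healed contents and the number of blocks copied.
    """
    src_sums = block_checksums(source, block_size)
    sink_sums = block_checksums(sink, block_size)

    # Start from the stale copy, truncated or zero-padded to the source size.
    healed = bytearray(sink[:len(source)])
    healed.extend(b"\0" * (len(source) - len(healed)))

    copied = 0
    for i, src_sum in enumerate(src_sums):
        if i >= len(sink_sums) or sink_sums[i] != src_sum:
            off = i * block_size
            healed[off:off + block_size] = source[off:off + block_size]
            copied += 1
    return bytes(healed), copied
```

Note that every block is checksummed on both sides even when only a few blocks changed, so the CPU cost scales with the total image size rather than with the amount of data written while the brick was down.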
Version-Release number of selected component (if applicable):
How reproducible:
always
Steps to Reproduce:
1. Have a lot of VMs on a replicated volume.
2. Bring one brick down and do some write activity on all the VMs.
3. Bring the brick back up while the VM operations are in progress.
4. This leads to the mount self-healing all the VM images at once.
5. That causes high CPU usage on the bricks because of the checksums.
Expected results:
Bricks should not use so much CPU. There should be some kind of throttling
of parallel self-heals.
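
One possible shape for the requested throttling, sketched under assumptions: cap the number of heals that run concurrently with a bounded worker pool. The function `heal_all`, its `heal_one` callback, and the `max_parallel` parameter are hypothetical names, not GlusterFS options or APIs.

```python
from concurrent.futures import ThreadPoolExecutor


def heal_all(files, heal_one, max_parallel=2):
    """Heal every file, but never run more than `max_parallel` heals at once.

    `heal_one` is a caller-supplied function that heals a single file.
    Results are returned in the same order as `files`.
    """
    # The pool size is the throttle: extra heals wait in the queue
    # instead of all competing for CPU on the bricks at the same time.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(heal_one, files))
```

With 27 VM images and `max_parallel=2`, at most two checksum-heavy heals would be in flight at a time, and the rest would wait their turn.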
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.