[Gluster-users] Gluster using 400% CPU (2 cores, 2 hyperthreaded)

Gerald Brandt gbr at majentis.com
Wed Nov 28 12:50:39 UTC 2012


I've had a few incidents over the last week where the GlusterFS NFS server started using 400% CPU and the load on the Gluster server went to 29.  I couldn't figure out the cause, but a system reset fixed it.  This server has been in production since 3.3.0 came out.

Yesterday, I may have fixed it (only time will tell).  Gluster is serving an mdadm RAID-6 array formatted as XFS.  When the CPU spiked yesterday, I already had atop running, and it was showing 2 drives in the RAID-6 array (sdc and sdd) with 50-70 ms seek times.  The other drives in the array were at the regular 2-3 ms.

Removing only one drive from the RAID (sdc) brought the seek times of sdd back to normal, and Gluster recovered.
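
For anyone who wants to check for the same pattern without watching atop, here is a rough Python sketch of the comparison described above: work out an average time per I/O for each drive from two samples of /proc/diskstats, then flag drives that sit far above the median of the set.  The function names, the 3x-median threshold, and the sample numbers in the usage note are my own assumptions for illustration, not anything atop or Gluster provides, and the figure computed here (total read and write time divided by completed I/Os) is not necessarily the same number atop reports.

```python
import statistics
import time


def parse_diskstats(text):
    """Return {device: (ios_completed, ms_spent)} from /proc/diskstats text.

    Field positions follow the kernel's documented layout: after major,
    minor and name come reads completed, reads merged, sectors read,
    ms reading, writes completed, writes merged, sectors written, ms writing.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or truncated lines
        ios = int(fields[3]) + int(fields[7])    # reads + writes completed
        ms = int(fields[6]) + int(fields[10])    # ms reading + ms writing
        stats[fields[2]] = (ios, ms)
    return stats


def avg_latency_ms(before, after):
    """Average ms per completed I/O for each device between two samples."""
    result = {}
    for dev, (ios1, ms1) in after.items():
        if dev not in before:
            continue
        ios0, ms0 = before[dev]
        done = ios1 - ios0
        if done > 0:  # idle devices have no meaningful average
            result[dev] = (ms1 - ms0) / done
    return result


def flag_slow_drives(latency_ms, factor=3.0):
    """Return devices whose latency exceeds factor times the median.

    The threshold is an arbitrary assumption; tune it for your array.
    """
    if not latency_ms:
        return []
    median = statistics.median(latency_ms.values())
    if median <= 0:
        return []
    return sorted(d for d, v in latency_ms.items() if v > factor * median)


def sample_proc_diskstats(interval_s=5.0):
    """Take two samples of /proc/diskstats (Linux only) and compare them."""
    with open("/proc/diskstats") as fh:
        before = parse_diskstats(fh.read())
    time.sleep(interval_s)
    with open("/proc/diskstats") as fh:
        after = parse_diskstats(fh.read())
    return avg_latency_ms(before, after)
```

On the server you would call flag_slow_drives(sample_proc_diskstats()), ideally restricted to the members of the array.  With made-up numbers in the ranges I saw, such as {"sda": 2.5, "sdb": 3.0, "sdc": 60.0, "sdd": 55.0, "sde": 2.0, "sdf": 2.8}, it flags sdc and sdd.  It only tells you which drives are slow, not whether the drive or the controller is to blame.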

This is a little off topic for Gluster, but has anybody seen this situation before?  Am I looking at a single bad drive that brought down another drive on the same controller, or am I looking at a bad controller?  Or something else?


