[Gluster-users] Healing Delays
lindsay.mathieson at gmail.com
Sun Oct 2 00:19:55 UTC 2016
On 2/10/2016 12:48 AM, Lindsay Mathieson wrote:
> Only the heal count does not change, it just does not seem to start.
> It can take hours before it shifts, but once it does, it's quite rapid.
> Node 1 has restarted and the heal count has been static at 511 shards
> for 45 minutes now. Nodes 1 & 2 have low CPU load, node 3 has
> glusterfsd pegged at 800% CPU.
OK, I tried to reproduce it systematically this morning and was actually
unable to, which is quite weird. Testing was the same as last night: move
all the VMs off a server, reboot it, and wait for the healing to finish.
This time I tried it with various settings.
Shards / Min: 350 / 8
Shards / Min: 391 / 10
heal command issued
Shards / Min: 358 / 11
heal full command issued
Shards / Min: 358 / 27
Best results were with cluster.granular-entry-heal=yes and
cluster.locking-scheme=granular, but they were all quite good.
I don't know why it was so much worse last night: I/O load, CPU and
memory were the same. However, one thing that is different, and which I
can't easily reproduce, is that the cluster had been running for several
weeks, but last night I rebooted all nodes. Could gluster be developing
an issue after running for some time?
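
For anyone wanting to repeat the reboot-and-wait test, here is a minimal
sketch of how the heal count could be polled and timed. It is not the
script used for the numbers above. It assumes the gluster CLI is on the
PATH and that "gluster volume heal <VOLNAME> statistics heal-count"
prints one "Number of entries: N" line per brick; the function names
and the 60 second poll interval are made up for illustration.

```python
import re
import subprocess
import time

# Matches the per-brick count lines in the heal-count output (assumed format).
# Lines with a non-numeric count (e.g. "-" for an offline brick) are skipped.
ENTRY_RE = re.compile(r"^Number of entries:\s*(\d+)\s*$", re.MULTILINE)


def pending_entries(heal_count_output: str) -> int:
    """Sum the pending heal entries over all bricks in the CLI output."""
    return sum(int(n) for n in ENTRY_RE.findall(heal_count_output))


def heal_count(volume: str) -> int:
    """Run the gluster CLI and return the total pending heal count."""
    result = subprocess.run(
        ["gluster", "volume", "heal", volume, "statistics", "heal-count"],
        check=True,
        capture_output=True,
        text=True,
    )
    return pending_entries(result.stdout)


def wait_for_heal(volume: str, interval: float = 60.0) -> float:
    """Poll until the heal count reaches zero; return elapsed minutes."""
    start = time.monotonic()
    while True:
        count = heal_count(volume)
        minutes = (time.monotonic() - start) / 60.0
        print(f"{minutes:6.1f} min: {count} entries pending")
        if count == 0:
            return minutes
        time.sleep(interval)
```

Calling wait_for_heal("<VOLNAME>") right after the rebooted node comes
back would log the count once a minute, which also makes it easy to see
whether the count sits static for a long time before it starts to drop.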