[Gluster-users] Very poor heal behaviour in 3.7.9
lindsay.mathieson at gmail.com
Sat Mar 26 01:25:38 UTC 2016
On 26/03/2016 12:14 AM, Ravishankar N wrote:
> I think you need the exact no. of files and size of files that need
> healing to make any meaningful comparison of self-heal performance
> across versions.
> VM workloads with sharding might not be the ideal 'reproducer' since
> you really don't know how many shards get modified when a replica is
> down and I/O on the VMs happen. I suppose you could try testing the
> heal performance of a specific no. of files on a sharded volume and
> compare results.
Maybe my subject description was poor - while heal progress is not the
best, it's the I/O stalls that *really* concern me. If I reboot a node
(or it crashes etc.), any VM running on the cluster at that time
freezes on I/O access once the heal kicks in and stays frozen until the
heal finishes, which takes over an hour.
I see similar behaviour noted in the thread "GlusterFS cluster stalls if
one server from the cluster goes down and then comes back up".
I tried setting "cluster.data-self-heal" to off, as suggested in that
thread, and it seems to have improved things. I'm in the middle of
maintenance right now and will test it more later.
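
For anyone wanting to try the same change, here is a rough sketch of the
commands involved. The option name is the one from the thread; the volume
name "datastore1" is only a placeholder, and the GLUSTER variable is a
dry-run wrapper added for illustration, so by default the script just
prints the commands rather than running them.

```shell
#!/bin/sh
# VOLNAME is a placeholder (assumption); substitute the real volume name.
VOLNAME="${VOLNAME:-datastore1}"
# Dry run by default: each command is printed, not executed.
# Set GLUSTER=gluster in the environment to actually run them.
GLUSTER="${GLUSTER:-echo gluster}"

# Turn off the data self-heal option mentioned above.
$GLUSTER volume set "$VOLNAME" cluster.data-self-heal off
# Show the value now in effect for that option.
$GLUSTER volume get "$VOLNAME" cluster.data-self-heal
# List the entries still pending heal on each brick.
$GLUSTER volume heal "$VOLNAME" info
```

Watching the "heal info" output while a VM is doing I/O should show
whether the stalls line up with heal activity.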