[Gluster-users] Very poor heal behaviour in 3.7.9
Lindsay Mathieson
lindsay.mathieson at gmail.com
Sat Mar 26 01:25:38 UTC 2016
On 26/03/2016 12:14 AM, Ravishankar N wrote:
> I think you need the exact no. of files and size of files that need
> healing to make any meaningful comparison of self-heal performance
> across versions.
> VM workloads with sharding might not be the ideal 'reproducer' since
> you really don't know how many shards get modified when a replica is
> down and I/O on the VMs happen. I suppose you could try testing the
> heal performance of a specific no. of files on a sharded volume and
> compare results.
Maybe my subject line was poor. While heal progress is not the best,
it's the I/O stalls that *really* concern me. If I reboot a node (or it
crashes, etc.), any VM running on the cluster at the time freezes on
I/O access when the heal kicks in and stays frozen until the heal
finishes, which takes over an hour.
I see similar behaviour noted in the thread "GlusterFS cluster stalls
if one server from the cluster goes down and then comes back up".
I tried setting "cluster.data-self-heal" to off, as suggested in that
thread, and it seems to have improved things. I'm in the middle of
maintenance right now and will test it more later.
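
For anyone wanting to try the same change, a minimal sketch of the commands is below. VOLNAME is a placeholder for the volume name, not a value from this setup, and the commands are the standard gluster CLI forms rather than output copied from my nodes.

```shell
# VOLNAME is a placeholder (assumption): substitute your own volume name.

# Turn the data self-heal option off for the volume.
gluster volume set VOLNAME cluster.data-self-heal off

# Confirm the value that is now in effect.
gluster volume get VOLNAME cluster.data-self-heal

# List the entries still pending heal on each brick.
gluster volume heal VOLNAME info

# To revert the change later:
# gluster volume set VOLNAME cluster.data-self-heal on
```

Checking "heal info" before and after a node reboot gives a rough view of how much is queued for healing while the VMs are being watched for stalls.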
thanks,
--
Lindsay Mathieson