[Gluster-users] Very poor heal behaviour in 3.7.9

Lindsay Mathieson lindsay.mathieson at gmail.com
Sat Mar 26 01:25:38 UTC 2016

On 26/03/2016 12:14 AM, Ravishankar N wrote:
> I think you need the exact no. of files and size of files that need 
> healing to make any meaningful comparison of self-heal performance 
> across versions.
> VM workloads with sharding might not be the ideal 'reproducer' since 
> you really don't know how many shards get modified when a replica is 
> down and I/O on the VMs happen. I suppose you could try testing the 
> heal performance of a specific no. of files on a sharded volume and 
> compare results.

Maybe my subject line was poorly worded - while heal performance is not
the best, it's the I/O stalls that *really* concern me. If I reboot a
node (or it crashes, etc.), any VM running on the cluster at that point
freezes on I/O access once the heal kicks in and stays frozen until the
heal finishes, which takes over an hour.

I see similar behaviour noted in the thread "GlusterFS cluster stalls if
one server from the cluster goes down and then comes back up".

I tried setting "cluster.data-self-heal" to off, as suggested in that
thread, and it seems to have improved things. I'm in the middle of
maintenance right now and will test it more later.
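
For anyone wanting to try the same thing, this is roughly what I ran. The
volume name "datastore1" is just a placeholder, substitute your own. As
far as I understand it, this option only affects data heals triggered
from the client side; the self-heal daemon should still heal files in the
background, so treat it as a workaround to test rather than a fix.

```shell
# Placeholder volume name (an assumption, replace with your own volume)
VOLNAME=datastore1

# Turn off client-side data self-heal for the volume
gluster volume set "$VOLNAME" cluster.data-self-heal off

# Confirm the option now appears under "Options Reconfigured"
gluster volume info "$VOLNAME"

# List the entries still pending heal on each brick
gluster volume heal "$VOLNAME" info
```

To revert, set the option back to "on" with the same "volume set" command.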


Lindsay Mathieson
