[Gluster-users] self-heal stops some vms (virtual machines)

Fri Feb 7 12:13:59 UTC 2014

Hello all,

I have a replicated volume that holds KVM VMs (virtual machines).

I had to stop one Gluster server for maintenance. That part of the
operation went well: no VM problems after the shutdown.

The problems started after booting the Gluster server again. Self-healing
started as expected, but some VMs locked up with disk problems
(time-outs) as the self-heal went through them.
Some VMs did survive the self-healing, I suppose the ones with low I/O
activity or less sensitivity to disk problems.

Is there some specific Gluster configuration that lets running VMs
ride through a self-heal? (cluster.data-self-heal-algorithm is
already set to diff.)

Are there any recommended tweaks for VMs running on top of Gluster?

current config:

gluster:   3.3.0-1.el6.x86_64

--------------------- volume:
# gluster volume info VOL

Volume Name: VOL
Type: Distributed-Replicate
Volume ID: f44182d9-24eb-4953-9cdd-71464f9517e0
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: one-gluster01:/san02-v2
Brick2: one-gluster02:/san02-v2
Brick3: one-gluster01:/san03
Brick4: one-gluster02:/san04
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
nfs.disable: on
auth.allow: x
performance.flush-behind: off
cluster.self-heal-window-size: 1
performance.cache-size: 67108864
cluster.data-self-heal-algorithm: diff
performance.io-thread-count: 32
cluster.min-free-disk: 250GB
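
For anyone comparing setups, here is a minimal sketch of how the
self-heal related options above can be picked out of the volume info
and how pending heals can be listed. The function name self_heal_opts
is made up for this sketch; "gluster volume info" and
"gluster volume heal VOL info" are the stock CLI commands. It assumes
a shell on one of the Gluster servers, and the gluster calls are
skipped when the CLI is not installed.

```shell
#!/bin/sh
# Volume name taken from the output above; override with VOL=... if needed.
VOL="${VOL:-VOL}"

# Hypothetical helper: read "gluster volume info" output on stdin and
# keep only the cluster.* options whose name mentions self-heal.
self_heal_opts() {
    grep -E '^cluster\.[^:]*self-heal'
}

# Only talk to the cluster when the gluster CLI is actually available.
if command -v gluster >/dev/null 2>&1; then
    # Self-heal options currently reconfigured on the volume.
    gluster volume info "$VOL" | self_heal_opts
    # Entries on each brick that still need healing.
    gluster volume heal "$VOL" info
fi
```

Running the heal info command repeatedly while the rebooted server
catches up should show whether the VM lock-ups line up with their
files being healed.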

thanks,
best regards,
joao