[Gluster-users] replica recovery scenario

Wed Apr 2 13:20:20 UTC 2014

Dear all,
I am running QEMU 1.5.3 with libgfapi on glusterfs 3.4.2.
The glusterfs volume is a replica 2 volume.

Assume I have several running VMs and one of the replica bricks goes
down. Is it possible to fully recover gluster without ever interrupting
the running VMs? Let me elaborate on that:

1) As I understand the glusterfs client internals, when a VM drive backed
by gfapi connects to a glusterfs volume, it opens a connection to each of
the two replica bricks and sends I/O commands to both. Is that true?
2) When one of the replica bricks goes down, what happens to VM I/O? Does
it stall, wait for the failing replica to time out, and then resume by
sending I/O to the remaining brick only, or is there no significant lag
from the viewpoint of the VM?
3a) When the admin repairs the failed replica brick, will the VMs
automatically reconnect to it?
3b) Will self-heal from the gfapi client kick in automatically? (Or do I
have to mount the volume via FUSE and stat all the files?)
3c) Will self-heal stall the running VMs' I/O? (I have heard both yes and
no as an answer, and I have not found any definite answer in the gluster
docs.)
4) If I am mistaken in any of the above assumptions, what is the best
practice for healing and fully recovering a gluster replica volume? (By
"fully recovered" I mean that the volume keeps working even if the other
replica brick, i.e. the one that stayed up during the previous failure,
fails later.)

Thank you

Petr Sudoma