[Bugs] [Bug 1631247] New: Issue enabling cluster.use-compound-fops with libgfapi application running

bugzilla at redhat.com bugzilla at redhat.com
Thu Sep 20 09:58:46 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1631247

            Bug ID: 1631247
           Summary: Issue enabling cluster.use-compound-fops with libgfapi
                    application running
           Product: GlusterFS
           Version: 3.12
         Component: libgfapi
          Assignee: bugs at gluster.org
          Reporter: paolo.margara at gmail.com
        QA Contact: bugs at gluster.org
                CC: bugs at gluster.org



Description of problem:

I'm running oVirt with libgfapi enabled on Gluster 3.12.13. When I set
"cluster.use-compound-fops" to "on", every VM is paused due to a storage I/O
error, while the file system remains accessible through the FUSE client (only
the libgfapi application [qemu] stops working).


Version-Release number of selected component (if applicable): 
* gluster 3.12.13
* qemu 2.10.0-21.el7_5.4.1
* ovirt 4.2.6

How reproducible:
On an oVirt 4.2.6 hyperconverged (HC) installation configured with libgfapi
enabled and Gluster 3.12.13, run:

gluster volume set $vm_images_volume_name cluster.use-compound-fops on

When this command is executed, every VM is paused due to a storage I/O error,
while the file system remains accessible through the FUSE client (only the
libgfapi application stops working). In the qemu log file I see these
Gluster-related messages:

2018-09-14T11:49:37.020942Z qemu-kvm: terminating on signal 15 from pid
1513 (/usr/sbin/libvirtd)
2018-09-14T11:49:42.766431Z qemu-kvm: Failed to flush the L2 table
cache: Input/output error
2018-09-14T11:49:44.766853Z qemu-kvm: Failed to flush the refcount block
cache: Input/output error
[2018-09-14 11:49:44.869112] E [MSGID: 108006]
[afr-common.c:5118:__afr_handle_child_down_event]
0-vm-images-repo-demo-replicate-1: All subvolumes are down. Going
offline until atleast one of them comes back up.
[2018-09-14 11:49:44.869284] E [MSGID: 108006]
[afr-common.c:5118:__afr_handle_child_down_event]
0-vm-images-repo-demo-replicate-0: All subvolumes are down. Going
offline until atleast one of them comes back up.
[2018-09-14 11:49:44.869515] E [MSGID: 108006]
[afr-common.c:5118:__afr_handle_child_down_event]
0-vm-images-repo-demo-replicate-2: All subvolumes are down. Going
offline until atleast one of them comes back up.
[2018-09-14 11:49:44.869639] E [MSGID: 108006]
[afr-common.c:5118:__afr_handle_child_down_event]
0-vm-images-repo-demo-replicate-3: All subvolumes are down. Going
offline until atleast one of them comes back up.
[2018-09-14 11:49:44.869823] E [MSGID: 108006]
[afr-common.c:5118:__afr_handle_child_down_event]
0-vm-images-repo-demo-replicate-4: All subvolumes are down. Going
offline until atleast one of them comes back up.
2018-09-14 11:49:45.827+0000: shutting down, reason=destroyed
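
To summarize which replicate subvolumes went offline in a log like the one
above, a small script can count the MSGID 108006 events per subvolume. This is
only a sketch: the function name offline_replicas and the sample text are
illustrative assumptions, and the pattern is based solely on the message
layout visible in this excerpt, so it may need adjusting for other log formats.

```python
import re
from collections import Counter

# Matches the AFR "all subvolumes are down" error as it appears in the
# excerpt above. The layout is assumed from this excerpt only.
PATTERN = re.compile(
    r"\[MSGID: 108006\]\s*\[[^\]]*\]\s*(\S+-replicate-\d+): All subvolumes are down"
)


def offline_replicas(log_text):
    """Count MSGID 108006 events per replicate subvolume name."""
    # Collapse all whitespace first, because the mail hard-wraps log lines.
    joined = " ".join(log_text.split())
    return Counter(PATTERN.findall(joined))


# Two entries copied from the log excerpt above, wrapped as in the mail.
sample = """[2018-09-14 11:49:44.869112] E [MSGID: 108006]
[afr-common.c:5118:__afr_handle_child_down_event]
0-vm-images-repo-demo-replicate-1: All subvolumes are down. Going
offline until atleast one of them comes back up.
[2018-09-14 11:49:44.869284] E [MSGID: 108006]
[afr-common.c:5118:__afr_handle_child_down_event]
0-vm-images-repo-demo-replicate-0: All subvolumes are down. Going
offline until atleast one of them comes back up."""

for name, count in sorted(offline_replicas(sample).items()):
    print(name, count)
```

Run against the full qemu log, this would show whether every replicate
subvolume of the volume reported the event, as appears to be the case here.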


If I set "cluster.use-compound-fops" to "off", everything starts working
correctly again.




Steps to Reproduce:
1. Set "cluster.use-compound-fops" to "on" on a Gluster volume that hosts VM
images used by qemu with libgfapi.


Actual results:
If I set "cluster.use-compound-fops" to "on", every VM run by qemu with
libgfapi reports that all subvolumes are down.


Expected results:
If "cluster.use-compound-fops" is set to "on", every VM should continue to
work correctly.


Additional info:
Let me know if you need more information or log files to figure out the source
of the problem.


