[Gluster-users] Problems with qemu and disperse volumes (live merge)

Marco Fais evilmf at gmail.com
Mon Jun 29 22:59:36 UTC 2020


Hi,

I have recently been running into a problem with Gluster disperse
volumes and live merge on qemu-kvm.

I am using Gluster as the storage backend for an oVirt cluster; we are
planning to use VM snapshots as part of taking daily backups of the
VMs, and we are running into issues when the VMs are stored on a
distributed-disperse volume.

For reference, I am using Gluster 7.5, libvirt 6.0, qemu 4.2 and oVirt
4.4.0 on CentOS 8.1.

The sequence of events is the following:

1) On a running VM, create a new snapshot
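
For context, the snapshot is created through the oVirt UI; as far as I
can tell, under the hood this is a libvirt external, disk-only
snapshot, so a rough virsh equivalent (domain and snapshot names are
just placeholders) would be something like:

  virsh snapshot-create-as <vm-domain> backup-snap --disk-only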

The operation completes successfully; however, I can see the following
errors in the gluster logs:

[2020-06-29 21:54:18.942422] I [MSGID: 109066]
[dht-rename.c:1951:dht_rename] 0-SSD_Storage-dht: renaming
/58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/64c038a4-5fe4-4f57-8b1c-bab38ae5c5bb.meta.new
(a89f2ccb-be41-4ff7-bbaf-abb786e76bc7)
(hash=SSD_Storage-disperse-1/cache=SSD_Storage-disperse-1) =>
/58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/64c038a4-5fe4-4f57-8b1c-bab38ae5c5bb.meta
(f55c1f35-63fa-4d27-9aa9-09b60163e565)
(hash=SSD_Storage-disperse-2/cache=SSD_Storage-disperse-1)
[2020-06-29 21:54:18.947273] W [MSGID: 122019]
[ec-helpers.c:401:ec_loc_gfid_check] 0-SSD_Storage-disperse-2: Mismatching
GFID's in loc
[2020-06-29 21:54:18.947290] W [MSGID: 109002]
[dht-rename.c:1019:dht_rename_links_create_cbk] 0-SSD_Storage-dht:
link/file
/58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/64c038a4-5fe4-4f57-8b1c-bab38ae5c5bb.meta
on SSD_Storage-disperse-2 failed [Input/output error]
[2020-06-29 21:54:19.197482] I [MSGID: 109066]
[dht-rename.c:1951:dht_rename] 0-SSD_Storage-dht: renaming
/58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/a54793c1-c804-425d-894e-2dfe7a63af4b.meta.new
(b4888032-3758-4f62-a4ae-fb48902f83d2)
(hash=SSD_Storage-disperse-4/cache=SSD_Storage-disperse-4) =>
/58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/a54793c1-c804-425d-894e-2dfe7a63af4b.meta
((null)) (hash=SSD_Storage-disperse-4/cache=<nul>)
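
The pattern above is DHT renaming the volume's .meta.new file over the
existing .meta file; if it is useful I can try to trigger the
"Mismatching GFID's in loc" warning without qemu by doing the same kind
of rename directly on the FUSE mount, along these lines (paths are
placeholders):

  cp <image-dir>/<volume-uuid>.meta <image-dir>/<volume-uuid>.meta.new
  mv <image-dir>/<volume-uuid>.meta.new <image-dir>/<volume-uuid>.meta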

2) Once the snapshot has been created, try to delete it while the VM is
running
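
As far as I understand, oVirt performs the deletion as a live merge,
i.e. a libvirt block commit of the snapshot layer back into the base
image; a rough virsh equivalent (domain and disk names are
placeholders) would be:

  virsh blockcommit <vm-domain> vda --active --pivot --wait --verbose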

The deletion appears to run for a couple of seconds, and then the
qemu-kvm process suddenly crashes. In the qemu VM log I can see the
following:

Unexpected error in raw_check_lock_bytes() at block/file-posix.c:811:
2020-06-29T21:56:23.933603Z qemu-kvm: Failed to get shared "write" lock
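
The lock mentioned here appears to be qemu's own image locking in
block/file-posix.c; while reproducing, I can list the locks currently
held on the image file from the hypervisor with something like the
following (the uuid is a placeholder), in case that helps:

  lslocks | grep <image-volume-uuid>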

At the same time, the gluster logs report the following:

[2020-06-29 21:56:23.850417] I [MSGID: 109066]
[dht-rename.c:1951:dht_rename] 0-SSD_Storage-dht: renaming
/58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/64c038a4-5fe4-4f57-8b1c-bab38ae5c5bb.meta.new
(1999a713-a0ed-45fb-8ab7-7dbda6d02a78)
(hash=SSD_Storage-disperse-1/cache=SSD_Storage-disperse-1) =>
/58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/64c038a4-5fe4-4f57-8b1c-bab38ae5c5bb.meta
(a89f2ccb-be41-4ff7-bbaf-abb786e76bc7)
(hash=SSD_Storage-disperse-2/cache=SSD_Storage-disperse-1)
[2020-06-29 21:56:23.855027] W [MSGID: 122019]
[ec-helpers.c:401:ec_loc_gfid_check] 0-SSD_Storage-disperse-2: Mismatching
GFID's in loc
[2020-06-29 21:56:23.855045] W [MSGID: 109002]
[dht-rename.c:1019:dht_rename_links_create_cbk] 0-SSD_Storage-dht:
link/file
/58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/64c038a4-5fe4-4f57-8b1c-bab38ae5c5bb.meta
on SSD_Storage-disperse-2 failed [Input/output error]
[2020-06-29 21:56:23.922638] I [MSGID: 109066]
[dht-rename.c:1951:dht_rename] 0-SSD_Storage-dht: renaming
/58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/a54793c1-c804-425d-894e-2dfe7a63af4b.meta.new
(e5c578b3-b91a-4263-a7e3-40f9c7e3628b)
(hash=SSD_Storage-disperse-4/cache=SSD_Storage-disperse-4) =>
/58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/a54793c1-c804-425d-894e-2dfe7a63af4b.meta
(b4888032-3758-4f62-a4ae-fb48902f83d2)
(hash=SSD_Storage-disperse-4/cache=SSD_Storage-disperse-4)
[2020-06-29 21:56:26.017309] E [fuse-bridge.c:227:check_and_dump_fuse_W]
(--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x133)[0x7fd4fa4d6a53] (-->
/usr/lib64/glusterfs/7.5/xlator/mount/fuse.so(+0x8e82)[0x7fd4f64cee82] (-->
/usr/lib64/glusterfs/7.5/xlator/mount/fuse.so(+0xa072)[0x7fd4f64d0072] (-->
/lib64/libpthread.so.0(+0x82de)[0x7fd4f90582de] (-->
/lib64/libc.so.6(clone+0x43)[0x7fd4f88aa133] ))))) 0-glusterfs-fuse:
writing to fuse device failed: No such file or directory
[2020-06-29 21:56:26.017421] E [fuse-bridge.c:227:check_and_dump_fuse_W]
(--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x133)[0x7fd4fa4d6a53] (-->
/usr/lib64/glusterfs/7.5/xlator/mount/fuse.so(+0x8e82)[0x7fd4f64cee82] (-->
/usr/lib64/glusterfs/7.5/xlator/mount/fuse.so(+0xa072)[0x7fd4f64d0072] (-->
/lib64/libpthread.so.0(+0x82de)[0x7fd4f90582de] (-->
/lib64/libc.so.6(clone+0x43)[0x7fd4f88aa133] ))))) 0-glusterfs-fuse:
writing to fuse device failed: No such file or directory
[2020-06-29 21:56:26.017524] E [fuse-bridge.c:227:check_and_dump_fuse_W]
(--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x133)[0x7fd4fa4d6a53] (-->
/usr/lib64/glusterfs/7.5/xlator/mount/fuse.so(+0x8e82)[0x7fd4f64cee82] (-->
/usr/lib64/glusterfs/7.5/xlator/mount/fuse.so(+0xa072)[0x7fd4f64d0072] (-->
/lib64/libpthread.so.0(+0x82de)[0x7fd4f90582de] (-->
/lib64/libc.so.6(clone+0x43)[0x7fd4f88aa133] ))))) 0-glusterfs-fuse:
writing to fuse device failed: No such file or directory

Initially I thought this was a qemu-kvm issue; however, the same
procedure works perfectly on a distributed-replicate volume with
exactly the same hardware, software and gluster volume options.
The issue is also reproducible 100% of the time -- every time I try to
delete the snapshot, the qemu-kvm process crashes.

I am not sure what the best way to proceed is -- I have tried to file
a bug but unfortunately it didn't get any traction.
Gluster volume info below:

Volume Name: SSD_Storage
Type: Distributed-Disperse
Volume ID: 4e1bf45d-9ecd-44f2-acde-dd338e18379c
Status: Started
Snapshot Count: 0
Number of Bricks: 6 x (4 + 2) = 36
Transport-type: tcp
Bricks:
Brick1: cld-cnvirt-h01-storage:/bricks/vm_b1/brick
Brick2: cld-cnvirt-h02-storage:/bricks/vm_b1/brick
Brick3: cld-cnvirt-h03-storage:/bricks/vm_b1/brick
Brick4: cld-cnvirt-h04-storage:/bricks/vm_b1/brick
Brick5: cld-cnvirt-h05-storage:/bricks/vm_b1/brick
Brick6: cld-cnvirt-h06-storage:/bricks/vm_b1/brick
Brick7: cld-cnvirt-h01-storage:/bricks/vm_b2/brick
Brick8: cld-cnvirt-h02-storage:/bricks/vm_b2/brick
Brick9: cld-cnvirt-h03-storage:/bricks/vm_b2/brick
Brick10: cld-cnvirt-h04-storage:/bricks/vm_b2/brick
Brick11: cld-cnvirt-h05-storage:/bricks/vm_b2/brick
Brick12: cld-cnvirt-h06-storage:/bricks/vm_b2/brick
Brick13: cld-cnvirt-h01-storage:/bricks/vm_b3/brick
Brick14: cld-cnvirt-h02-storage:/bricks/vm_b3/brick
Brick15: cld-cnvirt-h03-storage:/bricks/vm_b3/brick
Brick16: cld-cnvirt-h04-storage:/bricks/vm_b3/brick
Brick17: cld-cnvirt-h05-storage:/bricks/vm_b3/brick
Brick18: cld-cnvirt-h06-storage:/bricks/vm_b3/brick
Brick19: cld-cnvirt-h01-storage:/bricks/vm_b4/brick
Brick20: cld-cnvirt-h02-storage:/bricks/vm_b4/brick
Brick21: cld-cnvirt-h03-storage:/bricks/vm_b4/brick
Brick22: cld-cnvirt-h04-storage:/bricks/vm_b4/brick
Brick23: cld-cnvirt-h05-storage:/bricks/vm_b4/brick
Brick24: cld-cnvirt-h06-storage:/bricks/vm_b4/brick
Brick25: cld-cnvirt-h01-storage:/bricks/vm_b5/brick
Brick26: cld-cnvirt-h02-storage:/bricks/vm_b5/brick
Brick27: cld-cnvirt-h03-storage:/bricks/vm_b5/brick
Brick28: cld-cnvirt-h04-storage:/bricks/vm_b5/brick
Brick29: cld-cnvirt-h05-storage:/bricks/vm_b5/brick
Brick30: cld-cnvirt-h06-storage:/bricks/vm_b5/brick
Brick31: cld-cnvirt-h01-storage:/bricks/vm_b6/brick
Brick32: cld-cnvirt-h02-storage:/bricks/vm_b6/brick
Brick33: cld-cnvirt-h03-storage:/bricks/vm_b6/brick
Brick34: cld-cnvirt-h04-storage:/bricks/vm_b6/brick
Brick35: cld-cnvirt-h05-storage:/bricks/vm_b6/brick
Brick36: cld-cnvirt-h06-storage:/bricks/vm_b6/brick
Options Reconfigured:
nfs.disable: on
storage.fips-mode-rchecksum: on
performance.strict-o-direct: on
network.remote-dio: off
storage.owner-uid: 36
storage.owner-gid: 36
network.ping-timeout: 30

I have tried many different volume options but unfortunately get the
same result. I see the same problem on three different clusters (same
versions).

Any suggestions?

Thanks,
Marco