<div dir="ltr"><div dir="ltr">i Strahil<div><br></div><div>thanks a million for your reply.</div><div><br></div><div>I mainly thought that disperse volume where not supported because of the complexity of managing them (due to the various possible combinations of number of hosts / bricks and redundancy); however I assumed that once implemented and managed separately they could be used as VM storage for oVirt -- given they are in general supported by RHGS.</div><div><br></div><div>When you say they will not be optimal are you referring mainly to performance considerations? We did plenty of testing, and in terms of performance didn't have issues even with I/O intensive workloads (using SSDs, I had issues with spinning disks).</div><div><br></div><div>Replica 3 with arbiter is the other possible options for us, but clearly is less efficient in terms of storage usage than the current disperse 4+2 volumes, and the main issue for us is that having two servers down (out of the three in each replica) will create a service outage -- while with a disperse 4+2 combination we can withstand two servers down out of six (e.g. one has been brought down in maintenance and at that time another server has an issue). That's the reason I am keen to have it working with disperse -- apart from the specific issue with snapshot deletion, everything seems to work very well.</div><div><br></div><div>In regards to the options -- apologies I had applied the group with the "<font face="monospace">gluster volume set SSD_Storage group virt</font><font face="arial, sans-serif">" command but for some reason it doesn't list the options in the "info". I have re-applied them individually and the results are the same. See below for the list of options I am using:</font></div><div><font face="arial, sans-serif"><br></font></div><div><font face="monospace">Options Reconfigured:<br>storage.owner-gid: 36<br>storage.owner-uid: 36<br>performance.client-io-threads: on<br>server.event-threads: 4<br>client.event-threads: 4<br>cluster.choose-local: off<br>user.cifs: off<br>features.shard: on<br>cluster.shd-wait-qlength: 10000<br>cluster.locking-scheme: granular<br>cluster.data-self-heal-algorithm: full<br>cluster.server-quorum-type: server<br>cluster.quorum-type: auto<br>cluster.eager-lock: enable<br>network.remote-dio: enable<br>performance.low-prio-threads: 32<br>performance.io-cache: off<br>performance.read-ahead: off<br>performance.quick-read: off<br>storage.fips-mode-rchecksum: on<br>nfs.disable: on</font><font face="arial, sans-serif"><br></font></div></div><div><br></div><div>Unfortunately we have the issue with all VMs -- doesn't seem to depend on the allocation of storage either (thin provisioned or pre allocated).</div><div><br></div><div>Thanks!</div><font color="#888888"><div>Marco</div></font><div class="gmail-yj6qo gmail-ajU" style="outline:none;padding:10px 0px;width:22px;margin:2px 0px 0px"><br class="gmail-Apple-interchange-newline"></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, 30 Jun 2020 at 05:12, Strahil Nikolov <<a href="mailto:hunter86_bg@yahoo.com">hunter86_bg@yahoo.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hey Marco,<br>

Have you wondered why non-replica volumes are not supported for oVirt (or the paid downstream products)? A disperse volume will also not be optimal for your needs.

Have you thought about replica 3 with an arbiter?
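For example, a replica 3 volume with one arbiter brick per replica set could be created along these lines (the hostnames are from your existing cluster; the volume name and brick paths are only placeholders):

  # two full data bricks plus a small arbiter brick on a third host
  gluster volume create VM_Replica3 replica 3 arbiter 1 \
      cld-cnvirt-h01-storage:/bricks/vm_r3/brick \
      cld-cnvirt-h02-storage:/bricks/vm_r3/brick \
      cld-cnvirt-h03-storage:/bricks/vm_r3/brick
  gluster volume set VM_Replica3 group virt
  gluster volume start VM_Replica3

The arbiter brick stores only metadata, so it keeps quorum without holding a third full copy of the data.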

Now on to the topic.
I don't see the "optimize for virt" options, which you also need to apply (they involve sharding, too). You can find them in gluster's group directory (it was something like /var/lib/glusterd/groups/virt).

With an unsupported volume type and without the options the oVirt community recommends, you can and most probably will run into bad situations.

Please set the virt group options and try again.
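For example (the exact contents of the group file depend on the gluster version installed):

  # show which options the virt group applies
  cat /var/lib/glusterd/groups/virt
  # apply the whole group to the volume
  gluster volume set SSD_Storage group virt
  # verify the effective values, including options not shown under "Options Reconfigured"
  gluster volume get SSD_Storage all | grep -E 'shard|remote-dio|eager-lock|quorum'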

Does the issue occur on another VM?


Best Regards,
Strahil Nikolov


On 30 June 2020, 1:59:36 GMT+03:00, Marco Fais <evilmf@gmail.com> wrote:
>Hi,
>
>I am having a problem recently with Gluster disperse volumes and live merge on qemu-kvm.
>
>I am using Gluster as a storage backend of an oVirt cluster; we are planning to use VM snapshots in the process of taking daily backups of the VMs, and we are encountering issues when the VMs are stored in a distributed-disperse volume.
>
>First of all, I am using gluster 7.5, libvirt 6.0, qemu 4.2 and oVirt 4.4.0 on CentOS 8.1.
>
>The sequence of events is the following:
>
>1) On a running VM, create a new snapshot
>
>The operation completes successfully, however I can observe the following errors in the gluster logs:
>
>[2020-06-29 21:54:18.942422] I [MSGID: 109066] [dht-rename.c:1951:dht_rename] 0-SSD_Storage-dht: renaming /58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/64c038a4-5fe4-4f57-8b1c-bab38ae5c5bb.meta.new (a89f2ccb-be41-4ff7-bbaf-abb786e76bc7) (hash=SSD_Storage-disperse-1/cache=SSD_Storage-disperse-1) => /58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/64c038a4-5fe4-4f57-8b1c-bab38ae5c5bb.meta (f55c1f35-63fa-4d27-9aa9-09b60163e565) (hash=SSD_Storage-disperse-2/cache=SSD_Storage-disperse-1)
>[2020-06-29 21:54:18.947273] W [MSGID: 122019] [ec-helpers.c:401:ec_loc_gfid_check] 0-SSD_Storage-disperse-2: Mismatching GFID's in loc
>[2020-06-29 21:54:18.947290] W [MSGID: 109002] [dht-rename.c:1019:dht_rename_links_create_cbk] 0-SSD_Storage-dht: link/file /58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/64c038a4-5fe4-4f57-8b1c-bab38ae5c5bb.meta on SSD_Storage-disperse-2 failed [Input/output error]
>[2020-06-29 21:54:19.197482] I [MSGID: 109066] [dht-rename.c:1951:dht_rename] 0-SSD_Storage-dht: renaming /58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/a54793c1-c804-425d-894e-2dfe7a63af4b.meta.new (b4888032-3758-4f62-a4ae-fb48902f83d2) (hash=SSD_Storage-disperse-4/cache=SSD_Storage-disperse-4) => /58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/a54793c1-c804-425d-894e-2dfe7a63af4b.meta ((null)) (hash=SSD_Storage-disperse-4/cache=<nul>)
>
>2) Once the snapshot has been created, try to delete it while the VM is running
>
>The above seems to run for a couple of seconds and then suddenly the qemu-kvm process crashes. In the qemu VM logs I can see the following:
>
>Unexpected error in raw_check_lock_bytes() at block/file-posix.c:811:
>2020-06-29T21:56:23.933603Z qemu-kvm: Failed to get shared "write" lock
>
>At the same time, the gluster logs report the following:
>
>[2020-06-29 21:56:23.850417] I [MSGID: 109066] [dht-rename.c:1951:dht_rename] 0-SSD_Storage-dht: renaming /58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/64c038a4-5fe4-4f57-8b1c-bab38ae5c5bb.meta.new (1999a713-a0ed-45fb-8ab7-7dbda6d02a78) (hash=SSD_Storage-disperse-1/cache=SSD_Storage-disperse-1) => /58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/64c038a4-5fe4-4f57-8b1c-bab38ae5c5bb.meta (a89f2ccb-be41-4ff7-bbaf-abb786e76bc7) (hash=SSD_Storage-disperse-2/cache=SSD_Storage-disperse-1)
>[2020-06-29 21:56:23.855027] W [MSGID: 122019] [ec-helpers.c:401:ec_loc_gfid_check] 0-SSD_Storage-disperse-2: Mismatching GFID's in loc
>[2020-06-29 21:56:23.855045] W [MSGID: 109002] [dht-rename.c:1019:dht_rename_links_create_cbk] 0-SSD_Storage-dht: link/file /58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/64c038a4-5fe4-4f57-8b1c-bab38ae5c5bb.meta on SSD_Storage-disperse-2 failed [Input/output error]
>[2020-06-29 21:56:23.922638] I [MSGID: 109066] [dht-rename.c:1951:dht_rename] 0-SSD_Storage-dht: renaming /58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/a54793c1-c804-425d-894e-2dfe7a63af4b.meta.new (e5c578b3-b91a-4263-a7e3-40f9c7e3628b) (hash=SSD_Storage-disperse-4/cache=SSD_Storage-disperse-4) => /58e8dff0-3dfd-4554-9999-b8eb7744ce1b/images/998f0b18-1904-47f3-8cfb-a73ad063ab83/a54793c1-c804-425d-894e-2dfe7a63af4b.meta (b4888032-3758-4f62-a4ae-fb48902f83d2) (hash=SSD_Storage-disperse-4/cache=SSD_Storage-disperse-4)
>[2020-06-29 21:56:26.017309] E [fuse-bridge.c:227:check_and_dump_fuse_W] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x133)[0x7fd4fa4d6a53] (--> /usr/lib64/glusterfs/7.5/xlator/mount/fuse.so(+0x8e82)[0x7fd4f64cee82] (--> /usr/lib64/glusterfs/7.5/xlator/mount/fuse.so(+0xa072)[0x7fd4f64d0072] (--> /lib64/libpthread.so.0(+0x82de)[0x7fd4f90582de] (--> /lib64/libc.so.6(clone+0x43)[0x7fd4f88aa133] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory
>[2020-06-29 21:56:26.017421] E [fuse-bridge.c:227:check_and_dump_fuse_W] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x133)[0x7fd4fa4d6a53] (--> /usr/lib64/glusterfs/7.5/xlator/mount/fuse.so(+0x8e82)[0x7fd4f64cee82] (--> /usr/lib64/glusterfs/7.5/xlator/mount/fuse.so(+0xa072)[0x7fd4f64d0072] (--> /lib64/libpthread.so.0(+0x82de)[0x7fd4f90582de] (--> /lib64/libc.so.6(clone+0x43)[0x7fd4f88aa133] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory
>[2020-06-29 21:56:26.017524] E [fuse-bridge.c:227:check_and_dump_fuse_W] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x133)[0x7fd4fa4d6a53] (--> /usr/lib64/glusterfs/7.5/xlator/mount/fuse.so(+0x8e82)[0x7fd4f64cee82] (--> /usr/lib64/glusterfs/7.5/xlator/mount/fuse.so(+0xa072)[0x7fd4f64d0072] (--> /lib64/libpthread.so.0(+0x82de)[0x7fd4f90582de] (--> /lib64/libc.so.6(clone+0x43)[0x7fd4f88aa133] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory
>
>Initially I thought this was a qemu-kvm issue; however, the above works perfectly on a distributed-replicated volume on exactly the same HW, software and gluster volume options.
>Also, the issue can be replicated 100% of the time -- every time I try to delete the snapshot the process crashes.
>
>Not sure what's the best way to proceed -- I have tried to file a bug but unfortunately didn't get any traction.
>Gluster volume info here:
>
>Volume Name: SSD_Storage
>Type: Distributed-Disperse
>Volume ID: 4e1bf45d-9ecd-44f2-acde-dd338e18379c
>Status: Started
>Snapshot Count: 0
>Number of Bricks: 6 x (4 + 2) = 36
>Transport-type: tcp
>Bricks:
>Brick1: cld-cnvirt-h01-storage:/bricks/vm_b1/brick
>Brick2: cld-cnvirt-h02-storage:/bricks/vm_b1/brick
>Brick3: cld-cnvirt-h03-storage:/bricks/vm_b1/brick
>Brick4: cld-cnvirt-h04-storage:/bricks/vm_b1/brick
>Brick5: cld-cnvirt-h05-storage:/bricks/vm_b1/brick
>Brick6: cld-cnvirt-h06-storage:/bricks/vm_b1/brick
>Brick7: cld-cnvirt-h01-storage:/bricks/vm_b2/brick
>Brick8: cld-cnvirt-h02-storage:/bricks/vm_b2/brick
>Brick9: cld-cnvirt-h03-storage:/bricks/vm_b2/brick
>Brick10: cld-cnvirt-h04-storage:/bricks/vm_b2/brick
>Brick11: cld-cnvirt-h05-storage:/bricks/vm_b2/brick
>Brick12: cld-cnvirt-h06-storage:/bricks/vm_b2/brick
>Brick13: cld-cnvirt-h01-storage:/bricks/vm_b3/brick
>Brick14: cld-cnvirt-h02-storage:/bricks/vm_b3/brick
>Brick15: cld-cnvirt-h03-storage:/bricks/vm_b3/brick
>Brick16: cld-cnvirt-h04-storage:/bricks/vm_b3/brick
>Brick17: cld-cnvirt-h05-storage:/bricks/vm_b3/brick
>Brick18: cld-cnvirt-h06-storage:/bricks/vm_b3/brick
>Brick19: cld-cnvirt-h01-storage:/bricks/vm_b4/brick
>Brick20: cld-cnvirt-h02-storage:/bricks/vm_b4/brick
>Brick21: cld-cnvirt-h03-storage:/bricks/vm_b4/brick
>Brick22: cld-cnvirt-h04-storage:/bricks/vm_b4/brick
>Brick23: cld-cnvirt-h05-storage:/bricks/vm_b4/brick
>Brick24: cld-cnvirt-h06-storage:/bricks/vm_b4/brick
>Brick25: cld-cnvirt-h01-storage:/bricks/vm_b5/brick
>Brick26: cld-cnvirt-h02-storage:/bricks/vm_b5/brick
>Brick27: cld-cnvirt-h03-storage:/bricks/vm_b5/brick
>Brick28: cld-cnvirt-h04-storage:/bricks/vm_b5/brick
>Brick29: cld-cnvirt-h05-storage:/bricks/vm_b5/brick
>Brick30: cld-cnvirt-h06-storage:/bricks/vm_b5/brick
>Brick31: cld-cnvirt-h01-storage:/bricks/vm_b6/brick
>Brick32: cld-cnvirt-h02-storage:/bricks/vm_b6/brick
>Brick33: cld-cnvirt-h03-storage:/bricks/vm_b6/brick
>Brick34: cld-cnvirt-h04-storage:/bricks/vm_b6/brick
>Brick35: cld-cnvirt-h05-storage:/bricks/vm_b6/brick
>Brick36: cld-cnvirt-h06-storage:/bricks/vm_b6/brick
>Options Reconfigured:
>nfs.disable: on
>storage.fips-mode-rchecksum: on
>performance.strict-o-direct: on
>network.remote-dio: off
>storage.owner-uid: 36
>storage.owner-gid: 36
>network.ping-timeout: 30
>
>I have tried many different options but unfortunately have the same results. I have the same problem in three different clusters (same versions).
>
>Any suggestions?
>
>Thanks,
>Marco