[Gluster-users] Problems with qemu and disperse volumes (live merge)

Thu Jul 2 14:45:38 UTC 2020

On 2 July 2020 16:33:51 GMT+03:00, Marco Fais <evilmf at gmail.com> wrote:
>Hi Strahil,
>
>> WARNING: As you enabled sharding - NEVER DISABLE SHARDING, EVER!
>>
>
>Thanks -- good to be reminded :)
>
>
>> >When you say they will not be optimal are you referring mainly to
>> >performance considerations? We did plenty of testing, and in terms of
>> >performance didn't have issues even with I/O intensive workloads (using
>> >SSDs, I had issues with spinning disks).
>>
>> Yes, the client side has to connect to 6 bricks (4+2) at a time and
>> calculate the data in order to obtain the necessary information. The
>> same is valid for writing.
>> If you need to conserve space, you can test VDO without compression
>> (or even with it).
>>
>
>Understood -- will explore VDO. Storage usage efficiency is less important
>than fault tolerance or performance for us -- disperse volumes seemed to
>tick all the boxes so we looked at them primarily.
>But clearly I had missed that they are not used as mainstream VM storage
>for oVirt (I did know they weren't supported, but as explained thought
>that was more on the management side).
>
>
>>
>> Also, with replica volumes you can use 'choose-local' (in case you
>> have storage that is faster than the network, like NVMe) and increase
>> the read speed. Of course, this feature is useful for a hyperconverged
>> setup (gluster + oVirt on the same node).
>>
>
>Will explore this option as well, thanks for the suggestion.
>
>
>> If you were using oVirt 4.3, I would recommend you to focus on
>> gluster. Yet you use oVirt 4.4, which is much newer and needs some
>> polishing.
>>
>
>oVirt 4.3.9 (using the older CentOS 7 qemu/libvirt) unfortunately had
>similar issues with the disperse volumes. Not sure if exactly the same, as
>I never looked deeper into it, but the results were similar.
>oVirt 4.4.0 has some issues with snapshot deletion that are independent
>from Gluster (I have raised the issue here,
>https://bugzilla.redhat.com/show_bug.cgi?id=1840414, should be sorted with
>4.4.2 I guess), so at the moment it only works with the "testing" AV
>repo.

In that case I recommend that you:
1. Ensure you have enough space on all bricks for the logs (/var/log/glusterfs). Several gigabytes should be OK.
2. Set all log levels to 'TRACE'. Red Hat's documentation on the topic is quite good:
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3/html/administration_guide/configuring_the_log_level
3. Reproduce the issue on a fresh VM (one that has never had a snapshot deleted).
4. Revert all log levels to INFO, as per the link in point 2.
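
Steps 2 and 4 could be sketched roughly like this. It is only a minimal sketch: "myvol" is a placeholder volume name (not taken from this thread), and it assumes the log levels are controlled through the diagnostics.brick-log-level and diagnostics.client-log-level volume options. The script only prints the commands so they can be reviewed before being run on a node.

```python
# Sketch: build the gluster CLI commands for steps 2 and 4.
# "myvol" is a placeholder volume name, not taken from this thread.

# Assumption: these two volume options cover the brick and client logs.
LOG_OPTIONS = (
    "diagnostics.brick-log-level",
    "diagnostics.client-log-level",
)


def log_level_commands(volume, level):
    """Return one 'gluster volume set' command per log-level option."""
    return [
        f"gluster volume set {volume} {option} {level}"
        for option in LOG_OPTIONS
    ]


if __name__ == "__main__":
    # Step 2: raise verbosity before reproducing the issue.
    for cmd in log_level_commands("myvol", "TRACE"):
        print(cmd)
    # Step 4: return to INFO once the issue has been reproduced.
    for cmd in log_level_commands("myvol", "INFO"):
        print(cmd)
```

Printing instead of executing is deliberate: TRACE logging is very verbose, so it is worth double-checking the volume name before applying anything.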

The logs will be spread among all nodes. If you have remote logging available, you can also use it for analysis of the logs.

Most probably the brick logs can provide useful information.
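
With TRACE enabled the brick logs get large, so a small filter can help to find the interesting lines first. The following is a rough sketch, not a finished tool: it assumes each log line starts with a bracketed timestamp followed by a single severity letter (for example W, E or C), and the path in the usage comment is an assumption about where the brick logs live on your nodes.

```python
import re
import sys

# Assumed line shape: "[2020-07-02 14:45:38.123456] E [MSGID: ...] ...".
# The single capital letter after the timestamp is taken to be the severity.
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<sev>[A-Z])\s+(?P<rest>.*)$")


def filter_by_severity(lines, severities=("W", "E", "C")):
    """Yield (timestamp, severity, rest) for lines with a wanted severity."""
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match and match.group("sev") in severities:
            yield match.group("ts"), match.group("sev"), match.group("rest")


if __name__ == "__main__":
    # Usage (path is an assumption):
    #   python filter_logs.py /var/log/glusterfs/bricks/*.log
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            for ts, sev, rest in filter_by_severity(handle):
                print(f"{path}: [{ts}] {sev} {rest}")
```

Lines that do not match the assumed shape are skipped silently, so if nothing is printed, check the format of your logs before concluding that there are no errors.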

>
>> Check the oVirt engine logs (on the HostedEngine VM or your standalone
>> engine), the vdsm logs on the host that was running the VM, and next
>> check the brick logs.
>>
>
>Will do.
>
>Thanks,
>Marco

About VDO: it might require some tuning, and even afterwards it won't be very performant, so it depends on your needs.

Best Regards,
Strahil Nikolov