[GEDI] [RFC v4 11/11] virtio-blk: use BDRV_REQ_REGISTERED_BUF optimization hint

David Hildenbrand david at redhat.com
Thu Aug 25 07:43:16 UTC 2022

On 23.08.22 21:22, Stefan Hajnoczi wrote:
> On Tue, Aug 23, 2022 at 10:01:59AM +0200, David Hildenbrand wrote:
>> On 23.08.22 00:24, Stefan Hajnoczi wrote:
>>> Register guest RAM using BlockRAMRegistrar and set the
>>> BDRV_REQ_REGISTERED_BUF flag so block drivers can optimize memory
>>> accesses in I/O requests.
>>> This is for vdpa-blk, vhost-user-blk, and other I/O interfaces that rely
>>> on DMA mapping/unmapping.
>> Can you explain why we're monitoring the BlockRAMRegistrar to hook into
>> "guest RAM" instead of going the usual path of using a MemoryListener?
> The requirements are similar to VFIO, which uses RAMBlockNotifier. We

Only VFIO NVME uses RAMBlockNotifier. Ordinary VFIO uses the MemoryListener.

Maybe the difference is that ordinary VFIO has to replicate the actual
guest physical memory layout, and VFIO NVME is only interested in
possible guest RAM inside guest physical memory.

> need to learn about all guest RAM because that's where I/O buffers are
> located.
> Do you think RAMBlockNotifier should be avoided?

I assume it depends on the use case. For indicating "this might be used
for I/O", it is probably good enough.

>> What will BDRV_REQ_REGISTERED_BUF actually do? Pin all guest memory in
>> the worst case, as io_uring fixed buffers would do? (I hope not.)
> BDRV_REQ_REGISTERED_BUF is a hint that no bounce buffer is necessary
> because the I/O buffer is located in memory that was previously
> registered with bdrv_register_buf().
> The RAMBlockNotifier calls bdrv_register_buf() to let the libblkio
> driver know about RAM. Some libblkio drivers ignore this hint, io_uring
> may use the fixed buffers feature, vhost-user sends the shared memory
> file descriptors to the vhost device server, and VFIO/vhost may pin
> pages.
> So the blkio block driver doesn't add anything new, it's the union of
> VFIO/vhost/vhost-user/etc memory requirements.

The issue is if the backend pins memory inside any of these regions.
Then you're instantly incompatible with anything that relies on sparse
RAMBlocks, such as memory ballooning or virtio-mem, and have to fence it
properly.

In that case, you'd have to call ram_block_discard_disable(true)
successfully before pinning. Who would do that here, conditionally, just
as e.g. VFIO does?

io_uring fixed buffers would be one such example that pins memory and is
problematic. VFIO (unless on s390x) is another example, as you point out.

This has to be treated with care. Another thing to consider is that
different backends might only support a limited number of such regions.
I assume there is a way for QEMU to query this limit upfront? Memory
hot(un)plug might need it to figure out how many memory slots we
actually have (for ordinary DIMMs; and if we ever want to make this
compatible with virtio-mem, it might be required as well when the
backend pins memory).


David / dhildenb
