
Investigate benefits of binding the mapped memory directly to other GPU #2

Open
felixdoerre opened this issue Sep 8, 2018 · 14 comments

Comments

@felixdoerre
Owner

felixdoerre commented Sep 8, 2018

To skip the memcpy, we could import mapped memory from the rendering GPU into the display GPU or the other way round, or allocate host memory and import it into both.
We need to check which of those alternatives is fastest and implementable, and extend the code to use this optimization where available.

However (at least on my machine) it seems that the general transfer of the image between the GPUs is the bottleneck. So maybe there is a better way than memory-mapping and copying?
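
For illustration, here is a minimal sketch of the "allocate host memory and import it into both devices" variant via VK_EXT_external_memory_host (which, as discussed further down, the NVIDIA Linux driver does not expose). The helper name, device handles and memory type indices are placeholders, and error handling is omitted:

    #include <stdlib.h>
    #include <vulkan/vulkan.h>

    /* Import an existing host allocation into a logical device as VkDeviceMemory.
     * host_ptr must be aligned to minImportedHostPointerAlignment and
     * memory_type_index must be taken from vkGetMemoryHostPointerPropertiesEXT. */
    static VkDeviceMemory import_host_alloc(VkDevice dev, void *host_ptr,
                                            VkDeviceSize size, uint32_t memory_type_index)
    {
        VkImportMemoryHostPointerInfoEXT import = {
            .sType        = VK_STRUCTURE_TYPE_IMPORT_MEMORY_HOST_POINTER_INFO_EXT,
            .handleType   = VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT,
            .pHostPointer = host_ptr,
        };
        VkMemoryAllocateInfo alloc = {
            .sType           = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
            .pNext           = &import,
            .allocationSize  = size,
            .memoryTypeIndex = memory_type_index,
        };
        VkDeviceMemory mem = VK_NULL_HANDLE;
        vkAllocateMemory(dev, &alloc, NULL, &mem);  /* check the VkResult in real code */
        return mem;
    }

    /* Usage idea: one aligned host buffer, imported into both logical devices:
     *   void *frame = aligned_alloc(min_host_ptr_alignment, frame_size);
     *   VkDeviceMemory render_mem  = import_host_alloc(render_dev,  frame, frame_size, render_type);
     *   VkDeviceMemory display_mem = import_host_alloc(display_dev, frame, frame_size, display_type);
     * Both devices would then bind the same backing storage, skipping the memcpy. */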

@jambonmcyeah

Maybe look into VK_EXT_external_memory_dma_buf

@jambonmcyeah

Never mind, NVIDIA doesn't implement this extension.

@felixdoerre
Owner Author

Thanks for the idea! Yes, this issue is exactly for collecting/finding ideas like that. I thought of VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_MAPPED_FOREIGN_MEMORY_BIT_EXT, but I wasn't able to get a prototype running with it. It seems to be implemented but still has problems.
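
For reference, a prototype with that handle type would look roughly like the sketch below: map the image memory on the rendering GPU and import the resulting pointer into the display GPU as host-mapped foreign memory. This is only a sketch under the assumption that the display driver accepts the pointer (alignment and memory-type constraints apply); names are placeholders and error handling is omitted:

    #include <vulkan/vulkan.h>

    /* Sketch: import memory mapped from the rendering GPU into the display GPU.
     * Assumes VK_EXT_external_memory_host on the display device and that the
     * mapped pointer satisfies minImportedHostPointerAlignment. */
    static VkDeviceMemory import_foreign_mapping(VkDevice render_dev, VkDeviceMemory render_mem,
                                                 VkDevice display_dev, VkDeviceSize size,
                                                 uint32_t display_memory_type)
    {
        void *mapped = NULL;
        vkMapMemory(render_dev, render_mem, 0, size, 0, &mapped);

        VkImportMemoryHostPointerInfoEXT import = {
            .sType        = VK_STRUCTURE_TYPE_IMPORT_MEMORY_HOST_POINTER_INFO_EXT,
            .handleType   = VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_MAPPED_FOREIGN_MEMORY_BIT_EXT,
            .pHostPointer = mapped,
        };
        VkMemoryAllocateInfo alloc = {
            .sType           = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
            .pNext           = &import,
            .allocationSize  = size,
            .memoryTypeIndex = display_memory_type,  /* from vkGetMemoryHostPointerPropertiesEXT */
        };
        VkDeviceMemory display_mem = VK_NULL_HANDLE;
        vkAllocateMemory(display_dev, &alloc, NULL, &display_mem);
        return display_mem;
    }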

@jambonmcyeah

jambonmcyeah commented Oct 24, 2018

Maybe VK_KHR_external_memory_fd would work. Intel and NVIDIA both seem to support it.
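
For completeness, the opaque-fd export/import flow would look roughly like the sketch below; as the next comment explains, such an fd can only be re-imported on the same physical device, so in practice it is mainly useful for sharing memory between processes on the same GPU. Names are placeholders and error handling is omitted:

    #include <vulkan/vulkan.h>

    /* Sketch: export a VkDeviceMemory as an opaque fd and re-import it.
     * exported_mem must have been allocated with VkExportMemoryAllocateInfo
     * naming VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT in handleTypes. */
    static VkDeviceMemory reimport_via_opaque_fd(VkDevice dev, VkDeviceMemory exported_mem,
                                                 VkDeviceSize size, uint32_t memory_type_index)
    {
        PFN_vkGetMemoryFdKHR pvkGetMemoryFdKHR =
            (PFN_vkGetMemoryFdKHR)vkGetDeviceProcAddr(dev, "vkGetMemoryFdKHR");
        VkMemoryGetFdInfoKHR get_fd = {
            .sType      = VK_STRUCTURE_TYPE_MEMORY_GET_FD_INFO_KHR,
            .memory     = exported_mem,
            .handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT,
        };
        int fd = -1;
        pvkGetMemoryFdKHR(dev, &get_fd, &fd);

        /* Import: on success the driver takes ownership of fd. */
        VkImportMemoryFdInfoKHR import_fd = {
            .sType      = VK_STRUCTURE_TYPE_IMPORT_MEMORY_FD_INFO_KHR,
            .handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT,
            .fd         = fd,
        };
        VkMemoryAllocateInfo alloc = {
            .sType           = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
            .pNext           = &import_fd,
            .allocationSize  = size,
            .memoryTypeIndex = memory_type_index,
        };
        VkDeviceMemory imported = VK_NULL_HANDLE;
        vkAllocateMemory(dev, &alloc, NULL, &imported);
        return imported;
    }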

@felixdoerre
Owner Author

I think VK_KHR_external_memory_fd is not of any use, as such a memory object can only be imported on exactly the same PhysicalDevice (see the table in https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#external-memory-handle-types-compatibility).
I have the following external_memory extensions supported:
Dedicated:

   VK_KHR_external_memory              : extension revision  1
   VK_KHR_external_memory_fd           : extension revision  1

Integrated:

   VK_KHR_external_memory              : extension revision  1
   VK_KHR_external_memory_fd           : extension revision  1
   VK_EXT_external_memory_dma_buf      : extension revision  1

So there seems to be no way at all to circumvent the memcpy.

@rechapit

I'm probably reading the Vulkan spec and/or the problem wrong, so feel free to call me out.
While it does not seem possible for the dedicated GPU to write the frame into the integrated GPU's memory, it does seem possible to go the other way: have the integrated GPU read from the dedicated GPU's memory.
The frame would still be written twice, but using DMA instead of memcpy should be a lot faster and less CPU-intensive.

@felixdoerre
Owner Author

felixdoerre commented Oct 29, 2018

Yes, transferring the data via DMA is certainly better if that is possible.

However, VK_EXT_external_memory_dma_buf requires:

a file descriptor for a Linux dma_buf

So I would need to acquire such a dma_buf file descriptor for the NVIDIA buffer object (or for the mem-mapped region of the image). Do you have any idea how that would be possible if I cannot acquire it from the NVIDIA driver?
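
If such a dma_buf file descriptor could be obtained, the import on the integrated GPU would look roughly like the following sketch (the acquisition of the fd is exactly the missing piece; names are placeholders and error handling is omitted):

    #include <vulkan/vulkan.h>

    /* Sketch: import an existing dma_buf fd into the integrated GPU via
     * VK_EXT_external_memory_dma_buf (which builds on VK_KHR_external_memory_fd).
     * The open question in this thread is how to obtain dmabuf_fd in the first place. */
    static VkDeviceMemory import_dma_buf(VkDevice igpu_dev, int dmabuf_fd, VkDeviceSize size)
    {
        /* Ask the driver which memory types can back this dma_buf. */
        VkMemoryFdPropertiesKHR fd_props = {
            .sType = VK_STRUCTURE_TYPE_MEMORY_FD_PROPERTIES_KHR,
        };
        PFN_vkGetMemoryFdPropertiesKHR pvkGetMemoryFdPropertiesKHR =
            (PFN_vkGetMemoryFdPropertiesKHR)vkGetDeviceProcAddr(igpu_dev, "vkGetMemoryFdPropertiesKHR");
        pvkGetMemoryFdPropertiesKHR(igpu_dev, VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT,
                                    dmabuf_fd, &fd_props);

        VkImportMemoryFdInfoKHR import_fd = {
            .sType      = VK_STRUCTURE_TYPE_IMPORT_MEMORY_FD_INFO_KHR,
            .handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT,
            .fd         = dmabuf_fd,  /* the driver takes ownership on success */
        };
        VkMemoryAllocateInfo alloc = {
            .sType           = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
            .pNext           = &import_fd,
            .allocationSize  = size,
            .memoryTypeIndex = (uint32_t)__builtin_ctz(fd_props.memoryTypeBits),  /* first compatible type */
        };
        VkDeviceMemory mem = VK_NULL_HANDLE;
        vkAllocateMemory(igpu_dev, &alloc, NULL, &mem);
        return mem;
    }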

@rechapit

Sadly no. I did not know about the dma_buf requirement.

@felixdoerre
Owner Author

Thanks for the resources, it's good to collect everything possibly relevant here.

As I understand int dma_buf_fd(struct dma_buf *dmabuf, int flags), it takes a struct dma_buf and turns it into a file descriptor. So with this kernel function, together with the integrated GPU's VK_EXT_external_memory_dma_buf extension, I could (probably) import a struct dma_buf * into the integrated GPU. However, I still don't see any way to get either a "dma fd" or a struct dma_buf * (that dma_buf_fd could turn into a "dma fd") in the first place.

E.g. for NVIDIA GPUs the supported extensions are listed here: https://developer.nvidia.com/vulkan-driver
Searching for external_memory yields:

  • VK_KHR_external_memory_fd (not cross PhysicalDevice, only for sharing resources between processes)
  • VK_KHR_external_memory_win32 (not usable on linux)
  • VK_EXT_external_memory_host (supported by Nvidia explicitly only on Windows)

@snoopcatt

Maybe ask NVidia about VK_EXT_external_memory_dma_buf?
It seems they are referenced as contributors here: https://github.com/KhronosGroup/Vulkan-Docs/blob/master/appendices/VK_EXT_external_memory_dma_buf.txt

@ArchangeGabriel
Contributor

Just for the record, dma-buf is definitely the way to go, as I think this is what PRIME does for OpenGL. And this is likely the lowest level you can do it at, and thus with the least overhead. I don't understand the specifics, but maybe asking @aaronp24 about the Vulkan extension you would require to do so could be a good idea.

@cubanismo

I don't have much to add here, but @aaronp24 asked me to chime in, so I'll just say you've correctly surmised we don't currently support any of the extensions necessary to make a zero-copy/dma-copy-only transfer between an NV GPU and a non-NV GPU in our Linux Vulkan driver. I don't have any roadmap to share for support of any of these extensions at the moment.
