
Investigate benefits of binding the mapped memory directly to other GPU #2

Open
felixdoerre opened this issue Sep 8, 2018 · 14 comments

Comments

@felixdoerre
Owner

felixdoerre commented Sep 8, 2018

To skip the memcpy, we could import mapped memory from the rendering GPU into the display GPU or the other way round, or allocate host memory and import it into both.
We need to check which of those alternatives is fastest and implementable, and extend the code to use this optimization where available.

However (at least on my machine) it seems that the general transfer of the image between the GPUs is the bottleneck. So maybe there is a better way than memory-mapping and copying?
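
For illustration, here is a minimal sketch of the "allocate host memory and import it into both devices" variant via VK_EXT_external_memory_host (which, as discussed further down, the NVIDIA Linux driver does not expose). The helper name, device handles and memory type indices are placeholders, and error handling is omitted:

    #include <stdlib.h>
    #include <vulkan/vulkan.h>

    /* Import an existing host allocation into a logical device as VkDeviceMemory.
     * host_ptr must be aligned to minImportedHostPointerAlignment and
     * memory_type_index must be taken from vkGetMemoryHostPointerPropertiesEXT. */
    static VkDeviceMemory import_host_alloc(VkDevice dev, void *host_ptr,
                                            VkDeviceSize size, uint32_t memory_type_index)
    {
        VkImportMemoryHostPointerInfoEXT import = {
            .sType        = VK_STRUCTURE_TYPE_IMPORT_MEMORY_HOST_POINTER_INFO_EXT,
            .handleType   = VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT,
            .pHostPointer = host_ptr,
        };
        VkMemoryAllocateInfo alloc = {
            .sType           = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
            .pNext           = &import,
            .allocationSize  = size,
            .memoryTypeIndex = memory_type_index,
        };
        VkDeviceMemory mem = VK_NULL_HANDLE;
        vkAllocateMemory(dev, &alloc, NULL, &mem);  /* check the VkResult in real code */
        return mem;
    }

    /* Usage idea: one aligned host buffer, imported into both logical devices:
     *   void *frame = aligned_alloc(min_host_ptr_alignment, frame_size);
     *   VkDeviceMemory render_mem  = import_host_alloc(render_dev,  frame, frame_size, render_type);
     *   VkDeviceMemory display_mem = import_host_alloc(display_dev, frame, frame_size, display_type);
     * Both devices would then bind the same backing storage, skipping the memcpy. */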

@jambonmcyeah

Maybe look into VK_EXT_external_memory_dma_buf

@jambonmcyeah

Never mind, NVIDIA doesn't implement this extension.

@felixdoerre
Owner Author

Thanks for the idea! Yes, this issue is exactly for collecting/finding ideas like that. I thought of VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_MAPPED_FOREIGN_MEMORY_BIT_EXT, but I wasn't able to get a prototype running with it. It seems to be implemented but still has problems.
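
For reference, a prototype with that handle type would look roughly like the sketch below: map the image memory on the rendering GPU and import the resulting pointer into the display GPU as host-mapped foreign memory. This is only a sketch under the assumption that the display driver accepts the pointer (alignment and memory-type constraints apply); names are placeholders and error handling is omitted:

    #include <vulkan/vulkan.h>

    /* Sketch: import memory mapped from the rendering GPU into the display GPU.
     * Assumes VK_EXT_external_memory_host on the display device and that the
     * mapped pointer satisfies minImportedHostPointerAlignment. */
    static VkDeviceMemory import_foreign_mapping(VkDevice render_dev, VkDeviceMemory render_mem,
                                                 VkDevice display_dev, VkDeviceSize size,
                                                 uint32_t display_memory_type)
    {
        void *mapped = NULL;
        vkMapMemory(render_dev, render_mem, 0, size, 0, &mapped);

        VkImportMemoryHostPointerInfoEXT import = {
            .sType        = VK_STRUCTURE_TYPE_IMPORT_MEMORY_HOST_POINTER_INFO_EXT,
            .handleType   = VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_MAPPED_FOREIGN_MEMORY_BIT_EXT,
            .pHostPointer = mapped,
        };
        VkMemoryAllocateInfo alloc = {
            .sType           = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
            .pNext           = &import,
            .allocationSize  = size,
            .memoryTypeIndex = display_memory_type,  /* from vkGetMemoryHostPointerPropertiesEXT */
        };
        VkDeviceMemory display_mem = VK_NULL_HANDLE;
        vkAllocateMemory(display_dev, &alloc, NULL, &display_mem);
        return display_mem;
    }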

@jambonmcyeah

jambonmcyeah commented Oct 24, 2018

Maybe VK_KHR_external_memory_fd would work. Intel and NVIDIA both seem to support it.
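
For completeness, the opaque-fd export/import flow would look roughly like the sketch below; as the next comment explains, such an fd can only be re-imported on the same physical device, so in practice it is mainly useful for sharing memory between processes on the same GPU. Names are placeholders and error handling is omitted:

    #include <vulkan/vulkan.h>

    /* Sketch: export a VkDeviceMemory as an opaque fd and re-import it.
     * exported_mem must have been allocated with VkExportMemoryAllocateInfo
     * naming VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT in handleTypes. */
    static VkDeviceMemory reimport_via_opaque_fd(VkDevice dev, VkDeviceMemory exported_mem,
                                                 VkDeviceSize size, uint32_t memory_type_index)
    {
        PFN_vkGetMemoryFdKHR pvkGetMemoryFdKHR =
            (PFN_vkGetMemoryFdKHR)vkGetDeviceProcAddr(dev, "vkGetMemoryFdKHR");
        VkMemoryGetFdInfoKHR get_fd = {
            .sType      = VK_STRUCTURE_TYPE_MEMORY_GET_FD_INFO_KHR,
            .memory     = exported_mem,
            .handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT,
        };
        int fd = -1;
        pvkGetMemoryFdKHR(dev, &get_fd, &fd);

        /* Import: on success the driver takes ownership of fd. */
        VkImportMemoryFdInfoKHR import_fd = {
            .sType      = VK_STRUCTURE_TYPE_IMPORT_MEMORY_FD_INFO_KHR,
            .handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT,
            .fd         = fd,
        };
        VkMemoryAllocateInfo alloc = {
            .sType           = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
            .pNext           = &import_fd,
            .allocationSize  = size,
            .memoryTypeIndex = memory_type_index,
        };
        VkDeviceMemory imported = VK_NULL_HANDLE;
        vkAllocateMemory(dev, &alloc, NULL, &imported);
        return imported;
    }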

@felixdoerre
Owner Author

I think VK_KHR_external_memory_fd is not of any use, as such a memory object can only be imported on exactly the same PhysicalDevice (see the table in https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#external-memory-handle-types-compatibility).
I have the following external_memory extensions supported:
Dedicated:

   VK_KHR_external_memory              : extension revision  1
   VK_KHR_external_memory_fd           : extension revision  1

Integrated:

   VK_KHR_external_memory              : extension revision  1
   VK_KHR_external_memory_fd           : extension revision  1
   VK_EXT_external_memory_dma_buf      : extension revision  1

So there seems to be no way at all to circumvent the memcpy.

@rechapit

I'm probably reading the Vulkan spec and/or the problem wrong, so feel free to call me out.
While it does not seem possible for the dedicated GPU to write the frame into the integrated GPU's memory, it does seem possible to go the other way: have the integrated GPU read from the dedicated GPU's memory.
The frame would still be written twice, but using DMA instead of memcpy should be a lot faster and less CPU-intensive.

@felixdoerre
Owner Author

felixdoerre commented Oct 29, 2018

Yes, transferring the data via DMA is certainly better if that is possible.

However, VK_EXT_external_memory_dma_buf requires:

a file descriptor for a Linux dma_buf

So I would need to acquire such a dma_buf file descriptor for the NVIDIA buffer object (or for the mem-mapped region of the image). Do you have any idea how that would be possible if I cannot acquire it from the NVIDIA driver?
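
If such a dma_buf file descriptor could be obtained, the import on the integrated GPU would look roughly like the following sketch (the acquisition of the fd is exactly the missing piece; names are placeholders and error handling is omitted):

    #include <vulkan/vulkan.h>

    /* Sketch: import an existing dma_buf fd into the integrated GPU via
     * VK_EXT_external_memory_dma_buf (which builds on VK_KHR_external_memory_fd).
     * The open question in this thread is how to obtain dmabuf_fd in the first place. */
    static VkDeviceMemory import_dma_buf(VkDevice igpu_dev, int dmabuf_fd, VkDeviceSize size)
    {
        /* Ask the driver which memory types can back this dma_buf. */
        VkMemoryFdPropertiesKHR fd_props = {
            .sType = VK_STRUCTURE_TYPE_MEMORY_FD_PROPERTIES_KHR,
        };
        PFN_vkGetMemoryFdPropertiesKHR pvkGetMemoryFdPropertiesKHR =
            (PFN_vkGetMemoryFdPropertiesKHR)vkGetDeviceProcAddr(igpu_dev, "vkGetMemoryFdPropertiesKHR");
        pvkGetMemoryFdPropertiesKHR(igpu_dev, VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT,
                                    dmabuf_fd, &fd_props);

        VkImportMemoryFdInfoKHR import_fd = {
            .sType      = VK_STRUCTURE_TYPE_IMPORT_MEMORY_FD_INFO_KHR,
            .handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT,
            .fd         = dmabuf_fd,  /* the driver takes ownership on success */
        };
        VkMemoryAllocateInfo alloc = {
            .sType           = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
            .pNext           = &import_fd,
            .allocationSize  = size,
            .memoryTypeIndex = (uint32_t)__builtin_ctz(fd_props.memoryTypeBits),  /* first compatible type */
        };
        VkDeviceMemory mem = VK_NULL_HANDLE;
        vkAllocateMemory(igpu_dev, &alloc, NULL, &mem);
        return mem;
    }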

@rechapit

Sadly no. I did not know about the dma_buf requirement.

@felixdoerre
Owner Author

Thanks for the resources, it's good to collect everything possibly relevant here.

As I understand int dma_buf_fd(struct dma_buf *dmabuf, int flags), it takes a struct dma_buf and turns it into a file descriptor. So with this kernel function, together with the integrated GPU's VK_EXT_external_memory_dma_buf extension, I could (probably) import a struct dma_buf * into the integrated GPU. However, I still don't see any way to get either a "dma fd" or a struct dma_buf * (that dma_buf_fd could turn into a "dma fd") in the first place.

E.g. for NVIDIA GPUs the supported extensions are listed here: https://developer.nvidia.com/vulkan-driver
Searching for external_memory yields:

  • VK_KHR_external_memory_fd (not cross PhysicalDevice, only for sharing resources between processes)
  • VK_KHR_external_memory_win32 (not usable on linux)
  • VK_EXT_external_memory_host (supported by Nvidia explicitly only on Windows)

@snoopcatt

Maybe ask NVidia about VK_EXT_external_memory_dma_buf?
It seems they are referenced as contributors here: https://github.com/KhronosGroup/Vulkan-Docs/blob/master/appendices/VK_EXT_external_memory_dma_buf.txt

@ArchangeGabriel
Contributor

Just for the record, dma-buf is definitely the way to go, as I think this is what PRIME does for OpenGL. And this is likely the lowest level you can do it at, and thus with the least overhead. I don't understand the specifics, but maybe asking @aaronp24 about the Vulkan extension you would require to do so could be a good idea.

@cubanismo

I don't have much to add here, but @aaronp24 asked me to chime in, so I'll just say you've correctly surmised we don't currently support any of the extensions necessary to make a zero-copy/dma-copy-only transfer between an NV GPU and a non-NV GPU in our Linux Vulkan driver. I don't have any roadmap to share for support of any of these extensions at the moment.
