Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove explicit usage of linear modifier #240

Closed
wants to merge 2 commits into from
Closed

Conversation

mahkoh
Copy link
Owner

@mahkoh mahkoh commented Jul 28, 2024

No description provided.

@krakow10
Copy link

krakow10 commented Jul 28, 2024

running
cargo run --release --features it -- run-tests

The error from before is changed to this:
Compositor could not take a screenshot: Could not import dmabuf: Cloud not create a gbm buffer: Invalid argument (os error 22)

Nvidia driver just doesn't wanna I guess.

Edit:
Note that I am testing with the Nvidia beta driver 560.28.03

@mahkoh
Copy link
Owner Author

mahkoh commented Jul 28, 2024

Try again.

@krakow10
Copy link

Compositor could not take a screenshot: Could not map dmabuf: Could not map bo: No such file or directory (os error 2)

@mahkoh mahkoh force-pushed the jorth/linear-modifier branch from 277a3f1 to 4eecb1a Compare July 28, 2024 19:08
@mahkoh
Copy link
Owner Author

mahkoh commented Jul 28, 2024

Try again.

@krakow10
Copy link

All tests seem to run to completion now

@mahkoh
Copy link
Owner Author

mahkoh commented Jul 28, 2024

What are the contents of /dev/dri?

@krakow10
Copy link

image
by-path card0 renderD128

@mahkoh
Copy link
Owner Author

mahkoh commented Jul 28, 2024

@cubanismo in the scenario above we

Open /dev/dri/renderD128 and then try to create a buffer via gbm_bo_create_with_modifiers2(800, 600, XRGB888, &LINEAR_MODIFIER, 1, GBM_BO_USE_RENDERING). This fails.

If we change this to gbm_bo_create_with_modifiers2(800, 600, XRGB888, NULL, 0, GBM_BO_USE_RENDERING), it succeeds. But importing such a dmabuf via gbm_bo_import(GBM_BO_IMPORT_FD_MODIFIER) with the invalid modifier fails. Changing this to the explicit modifier chosen by your driver succeeds.

However, even then, trying to gbm_bo_map(GBM_BO_TRANSFER_READ_WRITE) this buffer fails. Changing the original usage to GBM_BO_USE_WRITE makes it succeed.

Are these additional restrictions of your driver that are documented somewhere? All of these issues seem like bugs or at least things where you significantly diverge from mesa.

@mahkoh mahkoh force-pushed the jorth/linear-modifier branch from 4eecb1a to 1520232 Compare July 28, 2024 19:32
@mahkoh
Copy link
Owner Author

mahkoh commented Jul 28, 2024

@krakow10 I've reverted the two workarounds that I mentioned in the comment above. They were just hacks that will not work with mesa or will cause significant problems with mesa drivers.

Still, I'm wondering if jay screenshot works with the current version from this MR. Unlike the tests, this should not make use of the problematic operations mentioned above. When trying this, remember to use the same binary for jay screenshot.

@krakow10
Copy link

krakow10 commented Jul 28, 2024

During my testing in #239 (comment) I was able to test jay screenshot with both versions.

linear-modifier base branch jay screenshot:

[2024-07-28T22:27:16.664Z ERROR jay::cli::screenshot] Could not map dmabuf: Could not map bo: Permission denied (os error 13)

linear-modifier base branch sudo jay screenshot:

[2024-07-28T22:29:56.017Z ERROR jay::tools::tool_client] Could not create a tool client: XDG_RUNTIME_DIR is not set

linear-modifier with test patch 4eecb1a:

[2024-07-28T22:04:37.814Z ERROR jay::cli::screenshot] Could not take a screenshot: Render context supports neither linear nor invalid modifier for XRGB8888 rendering

@mahkoh
Copy link
Owner Author

mahkoh commented Jul 29, 2024

linear-modifier with test patch 4eecb1a:

This does not sound correct. That error message no longer exists in that branch.

@krakow10
Copy link

I revived the commit from my local git to see what would happen, but maybe I made a mistake and it just built the master branch

@cubanismo
Copy link

Open /dev/dri/renderD128 and then try to create a buffer via gbm_bo_create_with_modifiers2(800, 600, XRGB888, &LINEAR_MODIFIER, 1, GBM_BO_USE_RENDERING). This fails.

Expected. Our driver can't render to linear buffers in general. GBM isn't expressive enough to say exactly what sort of rendering will be performed, so we have to just assume one of the incompatible methods will be attempted.

If we change this to gbm_bo_create_with_modifiers2(800, 600, XRGB888, NULL, 0, GBM_BO_USE_RENDERING), it succeeds.

Again, expected. Since no modifier was requested, we'll choose on of our tiled layouts internally given the rendering usage.

But importing such a dmabuf via gbm_bo_import(GBM_BO_IMPORT_FD_MODIFIER) with the invalid modifier fails. Changing this to the explicit modifier chosen by your driver succeeds.

Of course. We have no way to know what modifier was used, so the layout is undefined and hence unsupported. If you figure out the modify and specify it, the layout is deterministic and can be supported.

However, even then, trying to gbm_bo_map(GBM_BO_TRANSFER_READ_WRITE) this buffer fails. Changing the original usage to GBM_BO_USE_WRITE makes it succeed.

Why would you expect to be able to perform an operation you didn't ask to be supported? Note we don't perform any tile/untile swizzling during CPU map operations, so you won't actually be able to use the mapping for much if it's a renderable buffer. This has been widely discussed, and is a purposeful divergence from the behavior of most, but not all, Mesa drivers, since it is an unreasonable burden to support.

Are these additional restrictions of your driver that are documented somewhere?

Is there a place you'd expect to find documentation describing how a driver behaves in undefined or loosely defined situations? GBM has no specification that I'm aware of, so there's no documented behavior to document divergence from. The implementation is the spec, and ours behaves how it behaves.

All of these issues seem like bugs or at least things where you significantly diverge from mesa.

None of these behaviors are likely to change. If you want more well-defined access to a GBM buffer from all vendors, import it into a well-defined API (OpenGL, Vulkan), and access it using that API. You'll also gain access to a superset of the operations available directly in GBM.

@mahkoh
Copy link
Owner Author

mahkoh commented Jul 29, 2024

Expected. Our driver can't render to linear buffers in general. GBM isn't expressive enough to say exactly what sort of rendering will be performed, so we have to just assume one of the incompatible methods will be attempted.

What's the correct way to allocate buffers for cursor planes with your driver? Should I just omit all usages? I cannot use GBM_BO_USAGE_WRITE because that deoptimizes on mesa.

I would prefer it if you could allow RENDERING in GBM though. AFAIK, compositors call into GBM with the intersection of whatever their OpenGL/Vulkan API reports as renderable and what the DRM planes report as IN_FORMATs. And even if someone passed in LINEAR with RENDERING and you returned a buffer, the graphics API would still not allow them to import it for rendering.

Another perspective: What if I wanted to use VK_NV_linear_color_attachment to render to linear buffers? Is it not possible to allocate such buffers with your GBM backend?

Again, expected. Since no modifier was requested, we'll choose on of our tiled layouts internally given the rendering usage.

Of course. We have no way to know what modifier was used, so the layout is undefined and hence unsupported. If you figure out the modify and specify it, the layout is deterministic and can be supported.

The problem is that some mesa drivers also return explicit modifiers in this case just like nvidia. However, the very same drivers don't allow that modifier to be imported via their OpenGL/Vulkan APIs. Therefore I know of at least 3 compositors that explicitly overwrite such modifiers with the INVALID modifier which then allows successful import into OpenGL and Vulkan with these drivers (!).

This seems to be a fundamental incompatibility between the two implementations.

Why would you expect to be able to perform an operation you didn't ask to be supported?

I believe the GBM API only requires the WRITE usage for the gbm_bo_write. map OTOH does not talk about any usage requirements. I believe map and write are fundamentally different since map requires an explicit unmap step. Therefore it should be similar to transfering the texture to a staging buffer in ram at map time and transfering it back at unmap time. I assume that's how mesa implements it but I haven't checked.

To be clear: What are the exact usage requirements for gbm_bo_map in your driver?

Note we don't perform any tile/untile swizzling during CPU map operations, so you won't actually be able to use the mapping for much if it's a renderable buffer.

Isn't this just a violation of the GBM api? It says

 * The mapping exposes a linear view of the buffer object even if the buffer
 * has a non-linear modifier.

If this does not apply to all backends, then the documentation should definitely be adjusted.

Is there a place you'd expect to find documentation describing how a driver behaves in undefined or loosely defined situations? GBM has no specification that I'm aware of, so there's no documented behavior to document divergence from. The implementation is the spec, and ours behaves how it behaves.

Some kind of document describing how your driver diverges from mesa behavior (insofar such divergences are known) would be nice. Having to guess what EINVAL means without being able to check the source is burdensome.

If you want more well-defined access to a GBM buffer from all vendors, import it into a well-defined API (OpenGL, Vulkan).

Unfortunately in my integration tests I use software rendering because I encountered slight differences in blending behavior across different drivers. Enough to cause tests that compared screenshots to fail.

At the same time I have to use dmabufs because their usage is hardcoded all over the place.

@cubanismo
Copy link

What's the correct way to allocate buffers for cursor planes with your driver? Should I just omit all usages? I cannot use GBM_BO_USAGE_WRITE because that deoptimizes on mesa.

GBM_BO_USAGE_CURSOR. Note this will also get you a linear buffer, since our hardware doesn't support tiled images for cursor planes either.

I would prefer it if you could allow RENDERING in GBM though.

For Linear? The hardware doesn't support rendering to linear in various situations, so we don't advertise it. That would cause more problems than it would solve.

What if I wanted to use VK_NV_linear_color_attachment to render to linear buffers? Is it not possible to allocate such buffers with your GBM backend?

If you have to do it in GBM, just don't specify RENDERING usage. As I say, our implementation assumes that non-specific term implies rendering beyond the capabilities VK_NV_linear_color_attachment exposes. There's no way to fix that inconsistency without revving the GBM API.

The problem is...

I'm not saying the GBM API doesn't have problems. The whole GBM usage flag concept just doesn't work that well.

However, I looked at the implementation, and it should allow importing without specifying a modifier as long as the import operation uses the same parameters (width/height/format/usage) as the allocation, since it will derive the same modifier in each case. Is that what you're trying? If so, it should work, so there may be a bug.

I believe the GBM API only requires the WRITE usage for the gbm_bo_write. map OTOH does not talk about any usage requirements.

I misunderstood. All the same, our driver fails mapping of any non-linear buffer. GBM_BO_USE_WRITE usage is essentially a synonym for linear in our implementation.

Isn't this just a violation of the GBM api? It says...

As I say, this is deliberate. It's not going to change. Don't use gbm_bo_map(), and if you have to, use it on linear buffers.

@mahkoh
Copy link
Owner Author

mahkoh commented Jul 30, 2024

GBM_BO_USAGE_CURSOR. Note this will also get you a linear buffer, since our hardware doesn't support tiled images for cursor planes either.

I remember something about USAGE_CURSOR being deprecated or at least no longer needed with modifiers.

However, I looked at the implementation, and it should allow importing without specifying a modifier as long as the import operation uses the same parameters (width/height/format/usage) as the allocation, since it will derive the same modifier in each case. Is that what you're trying? If so, it should work, so there may be a bug.

On the import side I am not specifying any usage since I'm only using it for mapping.

Anyway, I'll try to see if I can make the necessary changes. Thanks for your input.

@mahkoh
Copy link
Owner Author

mahkoh commented Jul 30, 2024

@krakow10 a proper solution for this will probably involve writing a new allocator based on /dev/udmabuf for tests. I'll be out of the office for the next 4 weeks but I'll look into it afterwards. The screenshot case should already be solved in this branch. I have something locally for using non-linear modifiers for screensharing.

@cubanismo
Copy link

I remember something about USAGE_CURSOR being deprecated or at least no longer needed with modifiers.

All flags besides USEAGE_SCANOUT are invalid with modifiers. This suggestion would be for if you weren't using modifiers.

Unfortunately in my integration tests I use software rendering because I encountered slight differences in blending behavior across different drivers. Enough to cause tests that compared screenshots to fail.

Thinking about this, that shouldn't matter. All I was suggesting was using e.g., Vulkan's transfer functions to implement a read-modify-write operation, where your software rendering would be the "modify" portion of that operation in this case, instead of using GBM to map a tiled image for software rendering. If you needed more functionality, Vulkan has that too, but it sounds like that's not relevant in this case.

@mahkoh
Copy link
Owner Author

mahkoh commented Jul 31, 2024

All flags besides USEAGE_SCANOUT are invalid with modifiers.

What do you mean by this? Your driver does not seem to ignore other flags, because it still rejects RENDERING with the linear modifier. But is also doesn't outright reject RENDERING + valid modifier because I always use the RENDERING flag for framebuffers.

let mut usage = GBM_BO_USE_RENDERING | GBM_BO_USE_SCANOUT;

I believe that wlroots does the same.

It is possible, I suppose, that even in mesa these flags don't affect the allocated buffer and are only used for validation.

All I was suggesting was using e.g., Vulkan's transfer functions to implement a read-modify-write operation, where your software rendering would be the "modify" portion of that operation in this case

That's an interesting idea that I had not considered. That would have worked.

@mahkoh
Copy link
Owner Author

mahkoh commented Sep 4, 2024

Superseded by several PRs that have been merged into master.

  • Added a udmabuf allocator for tests
  • Added a vulkan allocator that allows mapping non-linear buffers on nvidia hardware
  • Removed the requirement that screenshots and screencasts use linear buffers

@mahkoh mahkoh closed this Sep 4, 2024
@cubanismo
Copy link

All flags besides USEAGE_SCANOUT are invalid with modifiers.

What do you mean by this? Your driver does not seem to ignore other flags, because it still rejects RENDERING with the linear modifier. But is also doesn't outright reject RENDERING + valid modifier because I always use the RENDERING flag for framebuffers.

For the record, I mispoke/remembered the design incorrectly here. Only the linear flag is explicitly invalid when mixed with modifiers. However, as you note we'll also reject any flags that imply linear on our architecture if combined with a non-linear modifier.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants