Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][Bindless] Fix incorrect mangling of bindless images builtin functions #16135

Open
wants to merge 1 commit into
base: sycl
Choose a base branch
from

Conversation

DBDuncan
Copy link
Contributor

@DBDuncan DBDuncan commented Nov 20, 2024

The mipmap builtin functions were incorrectly defined in the CUDA backend as having dX and dY argument of 4 elements when 3 is expected. This flew under the radar until a recent pull from upstream LLVM. In addition, an incorrect 'fetch_vec_size' type had been fixed when generating thunk mipmap builtins. 'mipmap_read_3D.cpp' test re-enable.

Fixes #15727

…nctions

The mipmap builtin functions were incorrectly defined in the CUDA backend as having dX and dY argument of 4 elements instead of 3. This flew under the radar until a recent pull from upsteam LLVM. In addition, an incorrect 'fetch_vec_size' type had been fixed when generating thunk mipmap builtins. Renable 'mipmap_read_3D.cpp' test as well.
@frasercrmck
Copy link
Contributor

Could you explain the problem in more detail? I assume that it's a problem with the __spirv_ImageSampleExplicitLod* builtin declarations? I was checking these against the SPIR-V specification, but I found that OpImageSampleExplicitLod says that the coordinate operands can be of a larger vector type:

Coordinate must be a scalar or vector of floating-point typeor integer type.
It contains (u[, v] …​ [, array layer]) as needed by the definition of Sampled Image.
Unless the Kernel capability is declared, it must be floating point. It may be a vector
larger than needed, but all unused components appear after all used components.

So it doesn't sound like invalid SPIR-V, per se. So it's more that it's a breach of contract between two parts of the code base - libclc and something else?

The PR description should also say that it fixes #15727.

@DBDuncan
Copy link
Contributor Author

DBDuncan commented Nov 20, 2024

Hey @frasercrmck . There was a change quite a while ago to update the type used to pass coords for 3d images from a vec of size 4 to a vec of size 3. (I can't remember why it was 4 in the first place. The last element was never used.) Problem is that the templating in sample_mipmap itself uses CoordT for dX and dY. Which is fine but the mipmap builtins expected dx and dy to be vecs of 4 elements and not 3 for 3D mipmap images.

This mismatch did not cause any problems until a pull down from upstream llvm a while ago. So this updates the builtin mangling to fix this. Also, the changes here only relate the CUDA backend. Intel devices which do use SPIR-V and L0 do not support mipmaps currently as they are not supported in L0.

There is a funny dichotomy going on with libspirv being used to generate ptx. So I think it would be strange for the SPIR-V spec to matter here? I am not sure about how exactly the ptx instruction pipeline works here.

Oh and good shout to link to the issue this fixes!

@frasercrmck
Copy link
Contributor

Thanks for explaining the issue.

There is a funny dichotomy going on with libspirv being used to generate ptx. So I think it would be strange for the SPIR-V spec to matter here? I am not sure about how exactly the ptx instruction pipeline works here.

I believe the idea is that the SYCL runtime calls functions that match the SPIR-V friendly IR used by the LLVM -> SPIR-V translator. That works for targets that use SPIR-V, obviously. But then targets that don't, like CUDA and HIP, implement those same builtin interfaces in LLVM IR.

So I think using a 4-element coordinate vector is "fine" in terms of SPIR-V and the SPIR-V targets, just as long as everyone agrees on the ABI. I suppose libclc could provide both 3- and 4-element vector coordinate versions of all these builtins, but it's hard to argue for extra bloat if we're not using those functions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CoordT type deduction issue in __invoke__ImageReadSampler
3 participants