Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make storage class of builtins to be Input #2456

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

asudarsa
Copy link
Contributor

@asudarsa asudarsa commented Mar 22, 2024

In case of SPIRV builtin global variables, the storage class function should be StorageClassInput, irrespective of the incoming addrspace.
Ref: https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Env.html
Sec 2.9. Built-in variables.

Forcing the storage class like this results in illegal casts between SPIRV builtin global variables and their users. To avoid this, the global variables are copied over to local memory during the SPIRVRegularizeLLVM pass.
SPIRV Reader has also been modified to react to these changes appropriately.

Thanks
Sincerely

@asudarsa
Copy link
Contributor Author

All the test fails in the CI report this error:
error: line XX: Expected input to have storage class Workgroup, CrossWorkgroup or Function: PtrCastToGeneric
%37 = OpPtrCastToGeneric %_ptr_Generic_uint %__spirv_BuiltInSubgroupMaxSize

This translation is caused by the LLVM IR in the test:
%2 = addrspacecast ptr addrspace(1) @__spirv_BuiltInSubgroupMaxSize to ptr addrspace(4)

This rule contradicts the rule in OpenCL 3.0 SPIR-V env spec shown in the description.
I see that there is a SPIR-V Registry issue that might be related to this.
KhronosGroup/SPIRV-Registry#73

Is it possible to allow Input storage class for OpPtrCastToGeneric?

Thanks
Sincerely

@karolherbst
Copy link
Contributor

All the test fails in the CI report this error: error: line XX: Expected input to have storage class Workgroup, CrossWorkgroup or Function: PtrCastToGeneric %37 = OpPtrCastToGeneric %_ptr_Generic_uint %__spirv_BuiltInSubgroupMaxSize

This translation is caused by the LLVM IR in the test: %2 = addrspacecast ptr addrspace(1) @__spirv_BuiltInSubgroupMaxSize to ptr addrspace(4)

This rule contradicts the rule in OpenCL 3.0 SPIR-V env spec shown in the description. I see that there is a SPIR-V Registry issue that might be related to this. KhronosGroup/SPIRV-Registry#73

Is it possible to allow Input storage class for OpPtrCastToGeneric?

I think it would be better to handle this case on the llvm IR level and copy the value to function memory first. Adding fundamental new address spaces to generic is a problem for some hardware, e.g. Nvidia and AMD where they have native support for generic pointers, but only if it's CL global, local, private memory as they can map their local and private memory into the global address space.

If generic pointers get fundamental new address spaces it requires different kind of lowering and more runtime overhead.

Input is also more of a "virtual" address space anyway and it's very hardware specific if the value resides in some accessible memory (e.g. a constant buffer) or if the hardware pulls it via "system value" instructions. In any case, adding Input to generic would make it very painful for runtime compilers to handle in an efficient manner.

@asudarsa
Copy link
Contributor Author

All the test fails in the CI report this error: error: line XX: Expected input to have storage class Workgroup, CrossWorkgroup or Function: PtrCastToGeneric %37 = OpPtrCastToGeneric %_ptr_Generic_uint %__spirv_BuiltInSubgroupMaxSize
This translation is caused by the LLVM IR in the test: %2 = addrspacecast ptr addrspace(1) @__spirv_BuiltInSubgroupMaxSize to ptr addrspace(4)
This rule contradicts the rule in OpenCL 3.0 SPIR-V env spec shown in the description. I see that there is a SPIR-V Registry issue that might be related to this. KhronosGroup/SPIRV-Registry#73
Is it possible to allow Input storage class for OpPtrCastToGeneric?

I think it would be better to handle this case on the llvm IR level and copy the value to function memory first. Adding fundamental new address spaces to generic is a problem for some hardware, e.g. Nvidia and AMD where they have native support for generic pointers, but only if it's CL global, local, private memory as they can map their local and private memory into the global address space.

If generic pointers get fundamental new address spaces it requires different kind of lowering and more runtime overhead.

Input is also more of a "virtual" address space anyway and it's very hardware specific if the value resides in some accessible memory (e.g. a constant buffer) or if the hardware pulls it via "system value" instructions. In any case, adding Input to generic would make it very painful for runtime compilers to handle in an efficient manner.

Thanks @karolherbst
I will add a change to copy this over to function memory.

Comment on lines 375 to 376
// 1. Value associated with the type is a builtin variable.
// 2. Value associated with the type is a GEPInst with address pointer as a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are both these cases hit by testing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After recent changes, we do not need to worry about the second case. The first one is covered.

Thanks

@asudarsa asudarsa marked this pull request as draft April 1, 2024 21:23
@asudarsa
Copy link
Contributor Author

asudarsa commented Apr 1, 2024

I am marking this as draft this I resolve the issue with replacing uses of global variable.

Thanks

@asudarsa asudarsa marked this pull request as ready for review April 2, 2024 19:05
PassMgr.addPass(SPIRVLowerConstExprPass());
PassMgr.addPass(SPIRVRegularizeLLVMPass());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In SPIRVRegularizeLLVMPass, we have some logic where we try to replace a SPIRV Builtin global constant with a variable in local memory. It helps to have the LowerConstExpr pass run before the regularize pass .

Thanks

@asudarsa
Copy link
Contributor Author

asudarsa commented Apr 4, 2024

Hi @karolherbst, @MrSidims and @LU-JOHN

Can you please take a look at this change when convenient?

Thanks

; RUN: llvm-spirv %t.bc -spirv-text -o - | FileCheck --check-prefix=CHECK-SPIRV %s
; RUN: llvm-spirv %t.bc -o %t.spv
; RUN: spirv-val %t.spv

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also check reverse translation to exercise the changes in SPIRVReader.cpp?

to react to changes made in forward translation.

Signed-off-by: Sudarsanam, Arvind <[email protected]>
Signed-off-by: Sudarsanam, Arvind <[email protected]>
Signed-off-by: Sudarsanam, Arvind <[email protected]>
Signed-off-by: Sudarsanam, Arvind <[email protected]>
Signed-off-by: Sudarsanam, Arvind <[email protected]>
auto *StorePtr = SI->getPointerOperand();
if (StorePtr->hasName() &&
StorePtr->getName().starts_with("__local_SPIRV_Builtin")) {
if (auto *LI = dyn_cast<LoadInst>(SI->getValueOperand())) {
Copy link
Contributor

@LU-JOHN LU-JOHN Apr 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these if's (at line 1765 and below) necessary? Is it possible that some __local_SPIRV_Builtin's will not be removed in reverse translation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. They can be turned to 'cast' statements and then asserts to catch any unexpected behavior.

Thanks

/// // Replace uses of __spirv_BuiltInWorkgroupId with %__local_SPIRV_Builtin*
/// ret void
/// }
void SPIRVRegularizeLLVMBase::copyBuiltinGVToLocalVar(Module *M) {
Copy link
Contributor

@MrSidims MrSidims Apr 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are also cases, when get_..._id() builtin is represented via function call (both OpenCL and SPIR-V friendly). Are these cases being handled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. I ran into a LIT test fail due to this very same issue and I resolved it.

// (passed via V here). In this case, we also need a AddrSpaceCast to cast
// between the new mapped value and the old.
if (Loc->second->hasName() &&
Loc->second->getName().starts_with("__local_SPIRV_Builtin")) {
Copy link
Contributor

@MrSidims MrSidims Apr 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If reader counterpart is not done, are functional regressions expected?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so...I can try it out in a separate PR.

/// addrspace(1) constant <3 x i64>, align 32
/// define weak_odr dso_local spir_kernel void @foo() {
/// ...
/// %1 = load <3 x i64>, ptr addrspace(1) @__spirv_BuiltInWorkgroupId, align 32
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not tested

/// %1 = load <3 x i64>, ptr addrspace(1) @__spirv_BuiltInWorkgroupId, align 32
/// %__local_SPIRV_Builtin* = alloca <3 x i64> addrspace(0), align 32
/// store <3 x i64> %1, ptr addrspace(0) %__local_SPIRV_Builtin*, align 32
/// // Replace uses of __spirv_BuiltInWorkgroupId with %__local_SPIRV_Builtin*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are expected users of __spirv_BuiltInWorkgroupId? Do we need to create alloca and why we can't just use result of the load?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to replace one pointer with another.

/// ...
/// %1 = load <3 x i64>, ptr addrspace(1) @__spirv_BuiltInWorkgroupId, align 32
/// %__local_SPIRV_Builtin* = alloca <3 x i64> addrspace(0), align 32
/// store <3 x i64> %1, ptr addrspace(0) %__local_SPIRV_Builtin*, align 32
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be generic address space?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the plan here was to copy it to a local variable.

/// ret void
/// }
void SPIRVRegularizeLLVMBase::copyBuiltinGVToLocalVar(Module *M) {
for (auto I = M->begin(), E = M->end(); I != E;) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This functionality is not tested

// https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Env.html
// Sec 2.9. Built-in variables.
// Do not store the translated type into PointeeTypeMap for IsBuiltin = true.
SPIRVType *LLVMToSPIRVBase::transPointerType(Type *ET, unsigned AddrSpc,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding extra parameter to the API, can we figure out something like:
IF translate builtin GV
THEN don't call transType, but create Type
?

/// define weak_odr dso_local spir_kernel void @foo() {
/// ...
/// %1 = load <3 x i64>, ptr addrspace(1) @__spirv_BuiltInWorkgroupId, align 32
/// %__local_SPIRV_Builtin* = alloca <3 x i64> addrspace(0), align 32
Copy link
Contributor

@MrSidims MrSidims Apr 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, while there were some recent improvements in instruction ordering in the generated module, still, can we ensure, that OpVariable is placed in the very beginning of the first basic block? Otherwise this validation rule is being violated:
All OpVariable instructions in a function must be in the first block in the function. These instructions, together with any intermixed OpLine and OpNoLine instructions, must be the first instructions in that block. (Note the validation rules prevent OpPhi instructions in the first block of a function.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not explicitly control where the OpVariable defined. in all my testing, I do see it occurring at the top.

auto *LoadedVal = Builder.CreateLoad(LoadTy, GV);
if (GV->getAlignment())
LoadedVal->setAlignment(llvm::Align(GV->getAlignment()));
auto *NewVar = Builder.CreateAlloca(GV->getType(), nullptr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might be having several builtins accessed in the kernel (global id, local id, size etc), if I read the code correctly it would result in several __local_SPIRV_Builtin declarations thus SSA violation. Regardless how it's addressed, please reflect it in the test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will create them with suffixes.

@@ -670,7 +681,8 @@ SPIRVType *LLVMToSPIRVBase::transPointerType(Type *ET, unsigned AddrSpc) {

auto SaveType = [&](SPIRVType *MappedTy) {
OpaqueStructMap[Key] = MappedTy;
PointeeTypeMap[TypeKey] = MappedTy;
if (!IsBuiltin)
PointeeTypeMap[TypeKey] = MappedTy;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it mean, that for every builtin GV translated we won't visit the map, but instead would translate a type from scratch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately ... yes.

@asudarsa asudarsa marked this pull request as draft May 6, 2024 14:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants