Lower polygeist.subindex through memref.reinterpret_cast #147

mortbopet · 2022-01-07T10:21:28Z

This should be a (hopefully) foolproof method of performing indexing into a memref. A reintrepret_cast is inserted with a dynamic index calculated from the subindex index operand + the product of the sizes of the target type.

This has been added as a separate conversion pass instead of through the canonicalization drivers. When added as a canonicalization, the conversion may preemptively apply, resulting in sub-par IR. Nevertheless, i think it has its merits to have a polygeist op lowering pass which can be used as a fallback to convert the dialect operations, if canonicalization fails.

For now, just added support for statically shaped memrefs (enough to fix the regression on my side) but should be possible for dynamically shaped as well.

Sort of fixes #141

wsmoses · 2022-01-07T17:56:41Z

test/polygeist-opt/lower_polygeist_ops.mlir

+// CHECK:           %[[VAL_1:.*]] = memref.alloca() : memref<30x30xi32>
+// CHECK:           %[[VAL_2:.*]] = arith.constant 30 : index
+// CHECK:           %[[VAL_3:.*]] = arith.muli %[[VAL_0]], %[[VAL_2]] : index
+// CHECK:           %[[VAL_4:.*]] = memref.reinterpret_cast %[[VAL_1]] to offset: {{\[}}%[[VAL_3]]], sizes: [30], strides: [1] : memref<30x30xi32> to memref<30xi32>


I think you may need to extract the previous stride and then add this new value. Lest this not work if you have two such operations.

This should have been done in the updated commit (see the new test for reference). Let me know what you think!

Perhaps I'm not understanding, or I'm still not seeing it.

Can you add a subindex of subindex test (where both subindices are just offsets, rather than rank reducing?

Additionally, could you have the memref argument be passed in as an argument?

In other words I would've expected %[[VAL_3:.*]] to set the new offset = oldoffset + index * dimsize, whereas it is currently just index * dimsize

Can you add a subindex of subindex test (where both subindices are just offsets, rather than rank reducing?

I've added an extra test for the case of a non-rank reducing subindex operation. Again, this initial PR only covers the case of statically sized memrefs (which takes care of the most pressing issues on my sides). subindex to subindex of statically sized memories should therefore hold transitively.

Additionally, could you have the memref argument be passed in as an argument?
In other words I would've expected %[[VAL_3:.*]] to set the new offset = oldoffset + index * dimsize, whereas it is currently just index * dimsize

Not sure i understand what you mean by this. Could you show a snippet of the IR that you expect here?

I guess my worry is that you don't appear to be adding to the offset (e.g. you are setting the new offset). Suppose you have two subindex's in a row, the total offset should include terms from both subindex operations. Is that the case presently?

Unless i am misunderstanding the reinterpret_cast operation, the offset is included in the value that the reinterpret_cast returns. So when two subindex operations are in a row, the first one will be converted to a reinterpret cast that has the same semantics as the initial subindex operation. Lowering the second is therefore independent of this.

But again an IR example of expected behaviour here might clarify any confusion.

module attributes {llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128", llvm.target_triple = "x86_64-unknown-linux-gnu"} { llvm.func @printf(!llvm.ptr<i8>, ...) -> i32 llvm.mlir.global internal constant @str0("data[%d][%d]=%d\0A\00") func @set(%arg0: memref<?xi32>) attributes {llvm.linkage = #llvm.linkage<external>} { %c3_i32 = arith.constant 3 : i32 affine.store %c3_i32, %arg0[0] : memref<?xi32> return } func @main() -> i32 attributes {llvm.linkage = #llvm.linkage<external>} { %c3_i32 = arith.constant 3 : i32 %c2 = arith.constant 2 : index %c1 = arith.constant 1 : index %c0_i32 = arith.constant 0 : i32 %c0 = arith.constant 0 : index %0 = llvm.mlir.undef : i32 %1 = memref.alloca() : memref<2x3xi32> affine.store %c0_i32, %1[0, 0] : memref<2x3xi32> affine.store %c0_i32, %1[0, 1] : memref<2x3xi32> affine.store %c0_i32, %1[0, 2] : memref<2x3xi32> affine.store %c0_i32, %1[1, 0] : memref<2x3xi32> affine.store %c0_i32, %1[1, 1] : memref<2x3xi32> affine.store %c0_i32, %1[1, 2] : memref<2x3xi32> %2 = "polygeist.subindex"(%1, %c1) : (memref<2x3xi32>, index) -> memref<?xi32> %3 = "polygeist.subindex"(%2, %c2) : (memref<?xi32>, index) -> memref<?xi32> affine.store %c3_i32, %3[0] : memref<?xi32> scf.for %arg0 = %c0 to %c2 step %c1 { %4 = arith.index_cast %arg0 : index to i32 scf.for %arg1 = %c0 to %c2 step %c1 { %5 = arith.index_cast %arg1 : index to i32 %6 = llvm.mlir.addressof @str0 : !llvm.ptr<array<17 x i8>> %7 = llvm.getelementptr %6[%c0_i32, %c0_i32] : (!llvm.ptr<array<17 x i8>>, i32, i32) -> !llvm.ptr<i8> %8 = memref.load %1[%arg0, %arg1] : memref<2x3xi32> %9 = llvm.call @printf(%7, %4, %5, %8) : (!llvm.ptr<i8>, i32, i32, i32) -> i32 } } return %0 : i32 } }

Such cases are not covered by the PR due to the dynamically sized memrefs.

Again, this PR intends to lower polygeist.subindex operations with static memrefs. By having this, even if it is only a subset of all possible polygeist.subindex operations that are converted, we still expand the set of C programs that Polygeist can emit using upstream MLIR dialect operations. It is vital that the Polygeist dialect operations are converted for any downstream tools to be able to consume the output IR.
I am not excluding that I'll in the future look more closely into how polygeist.subindex operations with dynamically shaped memrefs can be lowered into something meaningful, but for now, our usecase requires statically shaped memrefs and as such that is the most pressing issue to have resolved in Polygeist.

Oh apologies, let me make that a statically sized subindex in that case (also regardless running the pass on the above crashes, which shuoldn't happen...)

Regardless, my concern here is (perhaps because of unfamiliarity with reinterpret_cast) that one of the offsets will be lost.

module attributes {llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128", llvm.target_triple = "x86_64-unknown-linux-gnu"} { llvm.func @printf(!llvm.ptr<i8>, ...) -> i32 llvm.mlir.global internal constant @str0("data[%d][%d]=%d\0A\00") func @set(%arg0: memref<?xi32>) attributes {llvm.linkage = #llvm.linkage<external>} { %c3_i32 = arith.constant 3 : i32 affine.store %c3_i32, %arg0[0] : memref<?xi32> return } func @main() -> i32 attributes {llvm.linkage = #llvm.linkage<external>} { %c3_i32 = arith.constant 3 : i32 %c2 = arith.constant 2 : index %c1 = arith.constant 1 : index %c0_i32 = arith.constant 0 : i32 %c0 = arith.constant 0 : index %0 = llvm.mlir.undef : i32 %1 = memref.alloca() : memref<2x3xi32> affine.store %c0_i32, %1[0, 0] : memref<2x3xi32> affine.store %c0_i32, %1[0, 1] : memref<2x3xi32> affine.store %c0_i32, %1[0, 2] : memref<2x3xi32> affine.store %c0_i32, %1[1, 0] : memref<2x3xi32> affine.store %c0_i32, %1[1, 1] : memref<2x3xi32> affine.store %c0_i32, %1[1, 2] : memref<2x3xi32> %2 = "polygeist.subindex"(%1, %c1) : (memref<2x3xi32>, index) -> memref<3xi32> %3 = "polygeist.subindex"(%2, %c2) : (memref<3xi32>, index) -> memref<?xi32> affine.store %c3_i32, %3[0] : memref<?xi32> scf.for %arg0 = %c0 to %c2 step %c1 { %4 = arith.index_cast %arg0 : index to i32 scf.for %arg1 = %c0 to %c2 step %c1 { %5 = arith.index_cast %arg1 : index to i32 %6 = llvm.mlir.addressof @str0 : !llvm.ptr<array<17 x i8>> %7 = llvm.getelementptr %6[%c0_i32, %c0_i32] : (!llvm.ptr<array<17 x i8>>, i32, i32) -> !llvm.ptr<i8> %8 = memref.load %1[%arg0, %arg1] : memref<2x3xi32> %9 = llvm.call @printf(%7, %4, %5, %8) : (!llvm.ptr<i8>, i32, i32, i32) -> i32 } } return %0 : i32 } }

mortbopet · 2022-02-21T12:09:13Z

@wsmoses mind taking another look at this?

lib/polygeist/Passes/LowerPolygeistOps.cpp

This should be a (hopefully) foolproof method of performing indexing into a memref. A reintrepret_cast is inserted with a dynamic index calculated from the subindex index operand + the product of the sizes of the target type. This has been added as a separate conversion pass instead of through the canonicalization drivers. When added as a canonicalization, the conversion may preemptively apply, resulting in sub-par IR. Nevertheless, i think it has its merits to have a polygeist op lowering pass which can be used as a fallback to convert the dialect operations, if canonicalization fails. For now, just added support for statically shaped memrefs (enough to fix the regression on my side) but should be possible for dynamically shaped as well.

mortbopet marked this pull request as draft January 7, 2022 10:27

mortbopet force-pushed the subindex_lowering2 branch from 092c1ad to 3e0cdb7 Compare January 7, 2022 10:48

wsmoses reviewed Jan 7, 2022

View reviewed changes

mortbopet force-pushed the subindex_lowering2 branch from 3e0cdb7 to 120bc76 Compare January 18, 2022 18:02

mortbopet force-pushed the subindex_lowering2 branch 2 times, most recently from afc0b95 to ecbd27b Compare January 29, 2022 08:54

mortbopet marked this pull request as ready for review January 29, 2022 08:54

mortbopet requested a review from wsmoses January 29, 2022 08:54

mortbopet force-pushed the subindex_lowering2 branch from ecbd27b to dab65e8 Compare February 21, 2022 12:15

wsmoses reviewed Feb 22, 2022

View reviewed changes

lib/polygeist/Passes/LowerPolygeistOps.cpp Outdated Show resolved Hide resolved

mortbopet added 6 commits May 24, 2022 11:01

Extend polygeist subindex lowering to multidim memrefs

0769f00

rebase

8c565b2

Rename pass

c070c5e

Also handle non-rank reducing offset operations

52f6117

rebase to polygeist upstream

524ccc8

mortbopet force-pushed the subindex_lowering2 branch from ff96f5c to 524ccc8 Compare May 24, 2022 15:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Lower polygeist.subindex through memref.reinterpret_cast #147

Lower polygeist.subindex through memref.reinterpret_cast #147

mortbopet commented Jan 7, 2022 •

edited

Loading

wsmoses Jan 7, 2022

mortbopet Feb 16, 2022

wsmoses Feb 22, 2022

wsmoses Feb 22, 2022

mortbopet Feb 23, 2022

wsmoses Feb 23, 2022

mortbopet Feb 23, 2022

wsmoses Feb 24, 2022

mortbopet Feb 24, 2022

wsmoses Feb 24, 2022

mortbopet commented Feb 21, 2022

Lower polygeist.subindex through memref.reinterpret_cast #147

Are you sure you want to change the base?

Lower polygeist.subindex through memref.reinterpret_cast #147

Conversation

mortbopet commented Jan 7, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mortbopet commented Feb 21, 2022

mortbopet commented Jan 7, 2022 •

edited

Loading