[GPU] Add pattern to fuse tensor.extract_slice into forall producer #19296

Open: Max191 wants to merge 3 commits into main


Max191 (Contributor) commented Nov 25, 2024

This PR adds a pattern that fuses a consumer tensor.extract_slice into a producer scf.forall op. The pattern is added to FuseAndHoistParallelLoops, where it helps fuse tensor.unpack ops with extract_slice semantics into producer loops. This is needed when targeting MFMA intrinsics for unaligned shapes, and when generating code for unset encoding ops on GPU. This is a follow-up to #19295, which adds the complementary pattern for collapse_shape.
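To make the intent of the rewrite concrete, here is a toy 1-D model in Python rather than MLIR. It is a sketch of the idea only, not the pattern implemented in this PR. It assumes the slice starts at offset 0 with unit stride, and that the fused loop clamps each tile to the slice bounds; both are illustrative assumptions. All function names (`produce_tile`, `forall_then_slice`, `fused_forall`) are hypothetical.

```python
# Toy model: lists stand in for tensors, a plain for-loop stands in for scf.forall.
# Everything here is hypothetical and only illustrates the shape of the rewrite.

def produce_tile(offset, size):
    """Stand-in for the loop body: computes `size` values starting at `offset`."""
    return [10 * (offset + k) for k in range(size)]


def forall_then_slice(num_tiles, tile_size, slice_size):
    """Unfused form: the loop fills a padded result, then a consumer slices it."""
    result = [0] * (num_tiles * tile_size)
    for i in range(num_tiles):  # iterations are independent, as in scf.forall
        offset = i * tile_size
        # Models tensor.parallel_insert_slice of a full tile.
        result[offset:offset + tile_size] = produce_tile(offset, tile_size)
    # Models the consumer tensor.extract_slice (offset 0, unit stride assumed).
    return result[:slice_size]


def fused_forall(num_tiles, tile_size, slice_size):
    """Fused form: the loop writes directly into the already-sliced result."""
    result = [0] * slice_size
    for i in range(num_tiles):
        offset = i * tile_size
        # Assumption: clamp each tile to the part that lies inside the slice.
        size = max(0, min(tile_size, slice_size - offset))
        tile = produce_tile(offset, tile_size)[:size]
        result[offset:offset + size] = tile
    return result


if __name__ == "__main__":
    # 4 tiles of 4 elements cover a padded size of 16; the consumer keeps 13.
    print(forall_then_slice(4, 4, 13) == fused_forall(4, 4, 13))
```

In this model the fused loop never materializes the padded 16-element result, which is the kind of saving that matters when an unpack with extract_slice semantics follows a tiled producer.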

The PR also adds a transform op so that the long lit tests can be kept separate from the FuseAndHoistParallelLoops tests.

Depends on #19295

Max191 force-pushed the extract-slice-forall-fusion branch 2 times, most recently from 2f5056e to 8f9be22, November 26, 2024 18:30
Max191 marked this pull request as ready for review November 26, 2024 18:31
Max191 (Contributor, Author) commented Nov 26, 2024

This is based on #19295. Please only review the last commit.
