[GPU] Add pattern to fuse tensor.collapse_shape into forall producer #19295

Open
wants to merge 2 commits into
base: main

Conversation

Max191 (Contributor) commented Nov 25, 2024

This PR adds a pattern that fuses a consumer tensor.collapse_shape into its producer scf.forall op. The pattern is added to FuseAndHoistParallelLoops, where it helps fuse tensor.unpack ops with extract_slice semantics into producer loops. This is needed when targeting MFMA intrinsics for unaligned shapes, and also when generating code for unset encoding ops on GPU.
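As a rough illustration of the fusion described above, here is a hand-written before/after sketch. It is not taken from the PR's lit tests: the shapes, the SSA names, and the placeholder op "some.compute" are all assumptions, and the exact preconditions and the IR the pattern actually emits are defined by the PR itself.

```mlir
// Before: the collapse_shape consumes the result of the forall.
// All shapes and names here are hypothetical.
%0 = scf.forall (%i) in (8) shared_outs(%out = %init) -> (tensor<8x16xf32>) {
  // Placeholder for the per-iteration computation (assumed op).
  %tile = "some.compute"(%i) : (index) -> tensor<1x16xf32>
  scf.forall.in_parallel {
    tensor.parallel_insert_slice %tile into %out[%i, 0] [1, 16] [1, 1]
        : tensor<1x16xf32> into tensor<8x16xf32>
  }
}
%1 = tensor.collapse_shape %0 [[0, 1]] : tensor<8x16xf32> into tensor<128xf32>

// After (conceptually): the forall yields the collapsed tensor directly.
// The init is collapsed outside the loop, the tile is collapsed inside it,
// and the insertion offset is linearized (row index * row length).
%init_c = tensor.collapse_shape %init [[0, 1]]
    : tensor<8x16xf32> into tensor<128xf32>
%2 = scf.forall (%i) in (8) shared_outs(%out = %init_c) -> (tensor<128xf32>) {
  %tile = "some.compute"(%i) : (index) -> tensor<1x16xf32>
  %tile_c = tensor.collapse_shape %tile [[0, 1]]
      : tensor<1x16xf32> into tensor<16xf32>
  %off = affine.apply affine_map<(d0) -> (d0 * 16)>(%i)
  scf.forall.in_parallel {
    tensor.parallel_insert_slice %tile_c into %out[%off] [16] [1]
        : tensor<16xf32> into tensor<128xf32>
  }
}
```

In this sketch each iteration writes a full row, so the collapsed slice stays contiguous and the new offset is a single multiply. Slices that do not cover the inner dimension completely would not linearize this simply.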

The PR also adds a transform op so that the long lit tests can be kept separate from the FuseAndHoistParallelLoops tests.

Max191 (Contributor, Author) commented Nov 25, 2024

PR is mostly ready, but I need to add more lit tests and improve the docs.

EDIT: Ready now
