
Move StMatrix and TMA Store swizzle schedule functions to mma_utils #3552

Merged
merged 8 commits into main from hopper_matmul_stsm_refactor
Dec 10, 2024

Conversation

rdspring1
Collaborator

@rdspring1 rdspring1 commented Dec 10, 2024

This PR is stacked on #3553.

Changes

  1. Moved analyzeSwizzleSharedMemory, tmaSwizzleSharedMemory, scheduleStMatrixForMmaOutput, and scheduleTMAStoreForMmaOutput to mma_utils.
  2. Deleted swizzleSharedMemory from the Hopper matmul scheduler.
  3. Updated tests to use the functions now in mma_utils.

@rdspring1 rdspring1 force-pushed the hopper_matmul_stsm_refactor branch from f34b697 to 20a4b5e December 10, 2024 05:25
@rdspring1 rdspring1 changed the base branch from main to hopper_matmul_stsm_refactor_pt1 December 10, 2024 05:25
@rdspring1 rdspring1 changed the title from "[Tracking] Move Hopper Matmul Scheduler to mma_utils" to "Move StMatrix helper functions to mma_utils" Dec 10, 2024
@rdspring1 rdspring1 changed the title from "Move StMatrix helper functions to mma_utils" to "Move StMatrix and TMA Store helper functions to mma_utils" Dec 10, 2024
@rdspring1 rdspring1 changed the title from "Move StMatrix and TMA Store helper functions to mma_utils" to "Move StMatrix and TMA Store swizzle schedule functions to mma_utils" Dec 10, 2024
@rdspring1 rdspring1 marked this pull request as ready for review December 10, 2024 05:29
rdspring1 added a commit that referenced this pull request Dec 10, 2024
This PR contains the actual code changes from #3552.

1. Fix `hardCodedIndexGenerationForStMatrixSwizzle` for stmatrix (16, 8). The lane column is always `lane_id / 16`, not `lane_id / stsm_n_tile`.
2. Fix `tmaSwizzleSharedMemory` by ensuring the box inner dimension is evenly divisible by the swizzle size.
3. Clean up `scheduleTMAStoreForMmaOutput` and `scheduleStMatrixForMmaOutput` by removing unnecessary scheduling and correcting `num_ids_to_skip`.
4. Replace `mma_macro_to_str_map` with `macroToString` function.
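
Items 1 and 2 above come down to simple integer arithmetic, which the following minimal sketch illustrates. It is not the code from this commit: the helper names `laneColumn`, `laneColumnOld`, and `boxFitsSwizzle` are hypothetical, and it assumes a 32-lane warp, an `stsm_n_tile` of 8 for the (16, 8) case, and that the box inner dimension and swizzle size are measured in the same unit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the PR): lane column with the fixed
// divisor 16 that item 1 describes.
constexpr int64_t laneColumn(int64_t lane_id) {
  return lane_id / 16;
}

// Hypothetical helper: the pre-fix form, dividing by the tile's N extent.
constexpr int64_t laneColumnOld(int64_t lane_id, int64_t stsm_n_tile) {
  return lane_id / stsm_n_tile;
}

// Hypothetical helper: the divisibility requirement from item 2.
// Both arguments are assumed to use the same unit.
constexpr bool boxFitsSwizzle(int64_t box_inner, int64_t swizzle_size) {
  return swizzle_size > 0 && box_inner % swizzle_size == 0;
}

// With a fixed divisor of 16, a 32-lane warp maps to columns 0 and 1 only.
static_assert(laneColumn(15) == 0, "lanes 0..15 map to column 0");
static_assert(laneColumn(31) == 1, "lanes 16..31 map to column 1");

// Dividing by an assumed stsm_n_tile of 8 yields columns 0..3 instead,
// so the two forms disagree for the (16, 8) tile.
static_assert(laneColumnOld(31, 8) == 3, "old form gives a different column");

// A box inner dimension of 256 is a multiple of 128; 96 is not a multiple of 64.
static_assert(boxFitsSwizzle(256, 128), "evenly divisible");
static_assert(!boxFitsSwizzle(96, 64), "not evenly divisible");
```

The two lane-column forms coincide only when `stsm_n_tile` is 16, which is consistent with the fix being specific to the (16, 8) stmatrix case.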
Base automatically changed from hopper_matmul_stsm_refactor_pt1 to main December 10, 2024 16:08
@rdspring1
Collaborator Author

!test

@rdspring1 rdspring1 merged commit 456b319 into main Dec 10, 2024
48 checks passed
@rdspring1 rdspring1 deleted the hopper_matmul_stsm_refactor branch December 10, 2024 21:07
3 participants