Add MatmulParams::cluster_dims parameter #3574

jacobhinkle · 2024-12-11T16:12:06Z

Following #3557 we can specify the cluster size for our fusions. Currently we don't do anything explicitly with CGAs, but this can help guarantee that tiles are scheduled onto GPCs in pairs. Each GPC has a number of TPCs, each of which holds 2 SMs, so this lets us take advantage of caching at the TPC and GPC level for operand loads, in addition to L2.

This PR enables this with a default size of {2, 1, 1} for the Hopper scheduler. The parameter is ignored in the Ampere scheduler.

It is not yet plumbed into the heuristic plugin API. I thought maybe we should wait until we have more parameters related to CGAs to do that.
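For readers unfamiliar with cluster dimensions, here is a minimal sketch of what a `{2, 1, 1}` cluster shape implies for a launch grid. It is not the code in this PR: `SketchMatmulParams`, `Dim3`, `clusterSize`, `gridIsCompatible`, and `roundUpToMultiple` are hypothetical names used only for illustration. The one rule it relies on is the CUDA requirement that each grid dimension be a multiple of the corresponding cluster dimension.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical alias for an (x, y, z) shape measured in CTAs.
using Dim3 = std::tuple<int64_t, int64_t, int64_t>;

// Hypothetical stand-in for the scheduler parameters; NOT nvFuser's MatmulParams.
struct SketchMatmulParams {
  // {2, 1, 1} mirrors the Hopper default described in this PR.
  Dim3 cluster_dims{2, 1, 1};
};

// Number of CTAs that make up one cluster.
inline int64_t clusterSize(const Dim3& dims) {
  return std::get<0>(dims) * std::get<1>(dims) * std::get<2>(dims);
}

// True when every grid dimension is a multiple of the matching cluster
// dimension, which CUDA requires for a cluster launch.
inline bool gridIsCompatible(const Dim3& grid, const Dim3& cluster) {
  const bool positive = std::get<0>(cluster) > 0 && std::get<1>(cluster) > 0 &&
      std::get<2>(cluster) > 0;
  return positive && std::get<0>(grid) % std::get<0>(cluster) == 0 &&
      std::get<1>(grid) % std::get<1>(cluster) == 0 &&
      std::get<2>(grid) % std::get<2>(cluster) == 0;
}

// Round value up to the next multiple, e.g. to pad a tile count so the
// grid can be divided evenly into clusters.
inline int64_t roundUpToMultiple(int64_t value, int64_t multiple) {
  assert(multiple > 0);
  return (value + multiple - 1) / multiple * multiple;
}
```

With the default shape, a grid of 7 tiles along x would not be launchable as clusters, while padding it to `roundUpToMultiple(7, 2)` tiles would be. How the actual scheduler handles this case is not shown in this thread.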

jacobhinkle · 2024-12-11T16:12:53Z

!test

jacobhinkle · 2024-12-11T17:34:36Z

!test

rdspring1 · 2024-12-11T17:38:48Z

tests/cpp/test_matmul.cpp

@@ -3663,7 +3663,7 @@ TEST_F(HopperMatmulTest, HSH_NT_128BSwizzle) {
  const int64_t cta_m = 2 * getM(macro);
  const int64_t cta_n = 1 * getN(macro);

-  constexpr std::tuple<int64_t, int64_t, int64_t> cluster_dims{2, 1, 1};
+  constexpr std::tuple<int, int, int> cluster_dims{2, 1, 1};


super nitpick: can we stick with int64_t for consistency?

Suggested change

-  constexpr std::tuple<int, int, int> cluster_dims{2, 1, 1};
+  constexpr std::tuple<int64_t, int64_t, int64_t> cluster_dims{2, 1, 1};

Yeah, I was mostly doing that because the MatmulParams entries are int, but we should probably just change MatmulParams instead (in another PR).

Done. Much smaller PR now...

jacobhinkle · 2024-12-11T17:57:36Z

!build

jacobhinkle added 2 commits December 11, 2024 10:54

Move setCGADims to MultiMatmulScheduler, enable on Ampere

e3f611c

jacobhinkle requested review from rdspring1 and zasdfgbnm December 11, 2024 16:13

Revert "Move setCGADims to MultiMatmulScheduler, enable on Ampere"

193979a

This reverts commit e3f611c.

rdspring1 approved these changes Dec 11, 2024

jacobhinkle added the Matmuls label Dec 11, 2024

Switch from int->int64_t

6876bce

jacobhinkle marked this pull request as ready for review December 11, 2024 18:04

jacobhinkle merged commit 4382f28 into main Dec 11, 2024
17 checks passed

jacobhinkle deleted the cga_param branch December 11, 2024 20:14
