cutlass v3.1.0 #9
Conversation
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR.
@conda-forge-admin, please rerender
Ah:
Not sure this is going to work with CUDA 11.
force-pushed from c6ecb04 to c10c495
…nda-forge-pinning 2023.07.30.16.32.56
force-pushed from df14740 to 62477f1
@jakirkham @leofang @ngam @hmaarrfk
though it could of course also be related to the version bump. Would it make sense to trim this list a bit? (how?) If that's not possible - would someone have time/resources to build this locally?
I’d say trimming the arches is okay, but I am not sure if people/developers use this package in more specialized ways. Having said that, this is already a pretty restrictive arch list… last time I thought about this, I settled on 60, 70, 75, 80, 86, 89, 90, 90a for CUDA 12 (at least that’s what I will propose for conda-forge to use with jaxlib, tensorflow, and pytorch). By the way, the build progress with these cuda-enabled packages can be misleading (usually not in our favor). I will look into this soon if no response from more involved developers/maintainers 😺
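For concreteness, here is a minimal, hypothetical sketch of how such an arch list could be expressed in a feedstock build script, assuming the recipe drives CUTLASS's CMake build; CUTLASS_NVCC_ARCHS and CUTLASS_ENABLE_TESTS are CUTLASS's own CMake options, but the exact invocation below is illustrative, not what this recipe currently does:

```sh
# Hypothetical excerpt from build.sh; the arch list mirrors the one proposed above.
# CMAKE_ARGS, SRC_DIR and CPU_COUNT are the usual conda-build environment variables.
cmake ${CMAKE_ARGS} -S "${SRC_DIR}" -B build \
  -DCUTLASS_NVCC_ARCHS="60;70;75;80;86;89;90;90a" \
  -DCUTLASS_ENABLE_TESTS=OFF \
  -DCMAKE_BUILD_TYPE=Release
cmake --build build --parallel "${CPU_COUNT}" --target install
```

Every extra entry in that list multiplies the device-code compilation time, which is exactly the CI-cost trade-off being discussed here.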
I was personally thinking of reviving a 1060 GPU of mine as a benchmarking computer. I guess that is a bad idea. We kinda went through the architecture trimming before, and found that we would have to trim way too much for it to be acceptable. Maybe we should start building packages on a per-GPU-generation level?
Yeah, but AFAIK we have no metadata to select the right generation through virtual packages or similar. We only have the driver version, and that doesn't map 1:1 to architectures.
I mean, I feel like as more packages start to take longer and longer on CI, we have to find another solution. I'm also not a fan of the fat binaries; they are just slow to download....
I agree with you, but I'm asking how you imagine that would work? Perhaps conda needs a virtual package for the CUDA arch of the machine?
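As a rough illustration of what such a virtual package would need to surface (this is not an existing conda feature): the compute capability is already queryable from the driver, assuming an nvidia-smi recent enough to support the compute_cap query field, while conda today only exposes the driver's supported CUDA version via the __cuda virtual package:

```sh
# The raw datum a hypothetical per-arch virtual package would have to expose.
# Prints one line per GPU, e.g. "8.6" for an RTX 30-series card.
nvidia-smi --query-gpu=compute_cap --format=csv,noheader

# What conda exposes today: only the maximum CUDA version the driver supports,
# e.g. "virtual packages : __cuda=12.2=0".
conda info | grep __cuda
```

Note the first command printing multiple lines on a multi-GPU box is precisely the "more than one kind of GPU" ambiguity raised below.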
Yeah, but you would have to make the assumption that there is only one kind of GPU. I feel like this is a "fair assumption"; mixing GPUs on one machine seems like a bad idea....
In that same vein, I'm not sure where we stand on targeting newer x86-64 CPU architectures. I feel like we target quite old instruction sets and might benefit from a bump there too. Again, this may blow up the build matrix.
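For reference, a small sketch of what such a bump could mean in terms of the psABI x86-64 microarchitecture levels, which GCC 11+ and Clang 12+ accept as -march targets; the file name is just a placeholder and none of this reflects current conda-forge compiler settings:

```sh
# Illustrative only: each level is a strict superset of the previous one.
gcc -O2 -march=x86-64    -c kernel.c   # baseline: SSE2 only
gcc -O2 -march=x86-64-v2 -c kernel.c   # adds SSE4.2, POPCNT, CMPXCHG16B
gcc -O2 -march=x86-64-v3 -c kernel.c   # adds AVX2, FMA, BMI1/2, MOVBE
gcc -O2 -march=x86-64-v4 -c kernel.c   # adds AVX-512 F/BW/CD/DQ/VL
```

The same matrix-explosion concern applies: each level that is packaged separately multiplies the number of builds.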
Re trimming… Yep, it was me instigating a fight on that front a while back, and we essentially discovered we likely needed to go down the single-arch route. I can see us potentially supporting ensembles of these arches. The problem is, we likely need more insight from practitioners for this to be practical and useful. When I was heavily involved in this a year ago, I privately started building for a single CPU (say an EPYC xyz with all its recommended flags) and targeted only A100 GPUs (with all their recommended flags). I soon discovered that the HPCs I have access to had a stupid setup… compute nodes had A100s, so-called viz nodes had V100s, and some miscellaneous nodes still had K80s. You get the idea of how this can be problematic, at least for one (small-ish) slice of use cases. The other thing is, when compiling and training models, we have to be careful about interchangeability. Maybe the new keras paradigm can help with that, but likely not…
There might be a way to optimize the compilation across arches (caching and reusing stuff, etc.), but I don't know enough.
Anyway, for this particular one, let's figure out an appropriate arch list and have someone build it. It's only one binary, so not the worst… I guess I really should get my stupid singularity PR submitted again to get this going as we get more packages done for 12.
Yeah, this has been long overdue, but has stalled for a long time. Though there has been movement recently: conda/ceps#59
Yeah, with this feedstock it's really just the wait for the compilation (🤞)
Thanks all! 🙏
Would it make sense to convert this into a Conda issue for further discussion?
I feel like I don't have a strong ask yet to keep the discussion focused.
Think that is ok. There's value in tracking the general need. Plus we can refine the ask into actionable steps through discussion.
Thanks a lot @hmaarrfk!
So a Threadripper 2950 isn't the best processor, but it still shows about 14 hours of CPU time....
Yeah, it seems that cutlass is a big baby...
It is very likely that the current package version for this feedstock is out of date.
Checklist before merging this PR:
- Updated license if changed and license_file is packaged

Information about this PR:
- If you want these PRs to be merged automatically, make an issue with "@conda-forge-admin, please add bot automerge" in the title and merge the resulting PR. This command will add our bot automerge feature to your feedstock.
- If this PR was opened in error or needs to be updated, please add the bot-rerun label to this PR. The bot will close this PR and schedule another one. If you do not have permissions to add this label, you can use the phrase "@conda-forge-admin, please rerun bot" in a PR comment to have the conda-forge-admin add it for you.

Closes: #6
Closes: #7
Closes: #10
Closes: #11
Dependency Analysis
We couldn't run dependency analysis due to an internal error in the bot. :/ Help is very welcome!
This PR was created by the regro-cf-autotick-bot. The regro-cf-autotick-bot is a service to automatically track the dependency graph, migrate packages, and propose package version updates for conda-forge. Feel free to drop us a line if there are any issues! This PR was generated by https://github.com/regro/cf-scripts/actions/runs/5073530612, please use this URL for debugging.