Task Fusion, Constant Conversion Optimization, and 27pt stencil benchmark #150

Open · wants to merge 55 commits into base: branch-24.03
Conversation

@shivsundram commented Dec 13, 2021

PR for implementing Legate Task Fusion.
This is the cuNumeric companion PR to the Core's Task Fusion PR nv-legate/legate#113
This PR contains 4 primary changes:

  1. Constant conversion optimization for binary ops, implemented in cunumeric/array.py. This removes the need to issue expensive "convert" ops for scalar constants embedded in the code.
  2. Changes to the Black-Scholes benchmark so NaN checking is only done once all benchmarking runs have finished.
  3. New 27-point stencil benchmark.
  4. Implementation of Task Fusion: currently this exists as a cuNumeric task, but the intention is to eventually move it to the core.
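For context on item 3: a 27-point stencil sweeps the full 3×3×3 neighborhood of each interior cell. A minimal NumPy sketch of one such sweep (hypothetical weighting of 1 per neighbor; the actual benchmark's kernel and coefficients may differ):

```python
import numpy as np

def stencil_27pt(grid):
    # Sum all 27 neighbors (including the center) for every interior cell.
    # Boundary cells are left at zero for simplicity.
    out = np.zeros_like(grid)
    interior = out[1:-1, 1:-1, 1:-1]
    nz, ny, nx = grid.shape
    for dz in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                interior += grid[1 + dz:nz - 1 + dz,
                                 1 + dy:ny - 1 + dy,
                                 1 + dx:nx - 1 + dx]
    return out
```

Each of the 27 shifted slice additions would map to a separate elementwise task in a deferred runtime, which is exactly the kind of workload task fusion targets.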

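The idea behind item 4 can be illustrated with plain Python (a conceptual sketch only, not this PR's actual implementation, which operates on Legate task launches): without fusion, each operator launches its own task and materializes an intermediate array; a fused task evaluates the whole expression in one pass.

```python
import numpy as np

def unfused(a, b, c):
    # Each operator would launch a separate task; t is a real
    # intermediate buffer that must be allocated and written.
    t = a + b
    return t * c

def fused(a, b, c):
    # A fused task computes the whole expression element by element
    # in a single pass, with no intermediate buffer and a single launch.
    out = np.empty_like(a)
    for i in range(a.size):
        out.flat[i] = (a.flat[i] + b.flat[i]) * c.flat[i]
    return out
```

The two produce identical results; the payoff of fusion is avoiding the per-task launch overhead and the temporary allocation.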
@magnatelee (Contributor)

Thanks for the PR. I'll probably split this into a couple of independent PRs instead of merging it as-is, and want to move all fusion related code to the core.

temp._thunk.convert(
    two._thunk, stacklevel=(stacklevel + 1)
)
@shivsundram (Author)

The above code is a scalar constant optimization that avoids dispatching CONVERT operations for a scalar constant, since the constant's value is embedded in the code and thus already known.
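The optimization described here can be sketched as follows (hypothetical helper name; the real change lives in cunumeric/array.py and operates on thunks): because the scalar's value is known at trace time, it can be cast to the array's dtype on the host instead of dispatching a runtime convert task.

```python
import numpy as np

def binary_op_with_constant(lhs, scalar, op):
    # Hypothetical helper, not the actual cuNumeric code path.
    # The scalar constant's value is embedded in the code, so its
    # representation in lhs's dtype is computable up front: cast it
    # host-side rather than launching a "convert" task at runtime.
    converted = lhs.dtype.type(scalar)
    return op(lhs, converted)
```

For example, adding a float32 literal to a float64 array needs no runtime fp32→fp64 conversion task; the cast happens once on the host.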

@marcinz (Collaborator) commented Jan 26, 2023

@magnatelee @shivsundram What is the status of this PR?

@magnatelee (Contributor)

@marcinz same here. part of this PR should really be in the core, so I'll do the porting in the near future.

@shivsundram (Author) commented Jan 26, 2023

@marcinz @magnatelee Yeah, this is/was a working PR (with some nice speedup results here if interested), but Wonchan will be porting/merging this functionality into the core, so this PR is pretty stale right now.
That said, the constant conversion optimization in this PR, which prevents dispatching expensive 'convert' Legate ops for converting the format (e.g. fp32->fp64) of already existing scalars, may be worth merging earlier (if not already done). It's a smaller change, and the speedups from it were quite tangible.

@marcinz marcinz changed the base branch from branch-22.01 to branch-23.11 September 26, 2023 00:39
@marcinz marcinz changed the base branch from branch-23.11 to branch-24.01 November 9, 2023 17:15
@marcinz marcinz changed the base branch from branch-24.01 to branch-24.03 February 22, 2024 01:09
manopapad added a commit that referenced this pull request Nov 17, 2024
Also take this opportunity to clean up a naming inconsistency;
NumPy types are "dtypes", core types are "types".
3 participants