Task Fusion, Constant Conversion Optimization, and 27pt stencil benchmark #150

Open · wants to merge 55 commits into base: branch-24.03
Conversation

@shivsundram commented Dec 13, 2021

PR for implementing Legate Task Fusion.
This is the cuNumeric companion PR to the Core's Task Fusion PR nv-legate/legate#113
This PR contains 4 primary changes:

  1. Constant conversion optimization for binary ops, implemented in cunumeric/array.py. This removes the need to issue expensive "convert" ops for scalar constants embedded in the code.
  2. Changes to the Black-Scholes benchmark so NaN checking is only done once all benchmarking runs have finished.
  3. New 27-point stencil benchmark.
  4. Implementation of Task Fusion: currently this exists as a cuNumeric task, but the intention is to eventually move it to the core.
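For context on item 3: a 27-point stencil sweeps the full 3×3×3 neighborhood of each interior cell. A minimal NumPy sketch of one such sweep (hypothetical weighting of 1 per neighbor; the actual benchmark's kernel and coefficients may differ):

```python
import numpy as np

def stencil_27pt(grid):
    # Sum all 27 neighbors (including the center) for every interior cell.
    # Boundary cells are left at zero for simplicity.
    out = np.zeros_like(grid)
    interior = out[1:-1, 1:-1, 1:-1]
    nz, ny, nx = grid.shape
    for dz in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                interior += grid[1 + dz:nz - 1 + dz,
                                 1 + dy:ny - 1 + dy,
                                 1 + dx:nx - 1 + dx]
    return out
```

Each of the 27 shifted slice additions would map to a separate elementwise task in a deferred runtime, which is exactly the kind of workload task fusion targets.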

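The idea behind item 4 can be illustrated with plain Python (a conceptual sketch only, not this PR's actual implementation, which operates on Legate task launches): without fusion, each operator launches its own task and materializes an intermediate array; a fused task evaluates the whole expression in one pass.

```python
import numpy as np

def unfused(a, b, c):
    # Each operator would launch a separate task; t is a real
    # intermediate buffer that must be allocated and written.
    t = a + b
    return t * c

def fused(a, b, c):
    # A fused task computes the whole expression element by element
    # in a single pass, with no intermediate buffer and a single launch.
    out = np.empty_like(a)
    for i in range(a.size):
        out.flat[i] = (a.flat[i] + b.flat[i]) * c.flat[i]
    return out
```

The two produce identical results; the payoff of fusion is avoiding the per-task launch overhead and the temporary allocation.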
@magnatelee (Contributor)

Thanks for the PR. I'll probably split this into a couple of independent PRs instead of merging it as-is, and want to move all fusion related code to the core.

temp._thunk.convert(
    two._thunk, stacklevel=(stacklevel + 1)
)
@shivsundram (Author)

The above code is a scalar constant optimization that avoids dispatching CONVERT operations for a scalar constant, since the constant's value is embedded in the code and thus already known.
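The optimization described here can be sketched as follows (hypothetical helper name; the real change lives in cunumeric/array.py and operates on thunks): because the scalar's value is known at trace time, it can be cast to the array's dtype on the host instead of dispatching a runtime convert task.

```python
import numpy as np

def binary_op_with_constant(lhs, scalar, op):
    # Hypothetical helper, not the actual cuNumeric code path.
    # The scalar constant's value is embedded in the code, so its
    # representation in lhs's dtype is computable up front: cast it
    # host-side rather than launching a "convert" task at runtime.
    converted = lhs.dtype.type(scalar)
    return op(lhs, converted)
```

For example, adding a float32 literal to a float64 array needs no runtime fp32→fp64 conversion task; the cast happens once on the host.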

@marcinz (Collaborator) commented Jan 26, 2023

@magnatelee @shivsundram What is the status of this PR?

@magnatelee (Contributor)

@marcinz same here. part of this PR should really be in the core, so I'll do the porting in the near future.

@shivsundram (Author) commented Jan 26, 2023

@marcinz @magnatelee Yeah, this is/was a working PR (with some nice speedup results here if interested), but Wonchan will be porting/merging this functionality into the core, so this PR is pretty stale right now.
That said, the constant conversion optimization in this PR, which prevents dispatching expensive 'convert' Legate ops for converting the format (e.g. fp32->fp64) of already existing scalars, may be worth merging earlier (if not already done). It's a smaller change, and the speedups from it were quite tangible.

@marcinz marcinz changed the base branch from branch-22.01 to branch-23.11 September 26, 2023 00:39
@marcinz marcinz changed the base branch from branch-23.11 to branch-24.01 November 9, 2023 17:15
@marcinz marcinz changed the base branch from branch-24.01 to branch-24.03 February 22, 2024 01:09
manopapad added a commit that referenced this pull request Nov 17, 2024
Also take this opportunity to clean up a naming inconsistency;
NumPy types are "dtypes", core types are "types".
3 participants