Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tape_type query given a parent compiler job context #1104

Merged
merged 26 commits into from
Mar 4, 2024

Conversation

wsmoses
Copy link
Member

@wsmoses wsmoses commented Oct 19, 2023

Still needs a test, but pinging @vchuravy for review/thoughts

@wsmoses wsmoses requested a review from vchuravy October 19, 2023 18:53
@wsmoses
Copy link
Member Author

wsmoses commented Nov 20, 2023

@michel2323 @vchuravy re parent job/GPU things, if you want to try that out.

@vchuravy
Copy link
Member

Looks about right.

@wsmoses wsmoses force-pushed the tape_type_compilejob branch from 3ddc4a3 to fb74dc4 Compare November 21, 2023 01:39
@codecov-commenter
Copy link

codecov-commenter commented Nov 21, 2023

Codecov Report

Attention: 49 lines in your changes are missing coverage. Please review.

Comparison is base (ba58bcb) 75.13% compared to head (8701270) 74.75%.
Report is 2 commits behind head on main.

Files Patch % Lines
src/Enzyme.jl 4.08% 47 Missing ⚠️
src/absint.jl 33.33% 2 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1104      +/-   ##
==========================================
- Coverage   75.13%   74.75%   -0.39%     
==========================================
  Files          35       35              
  Lines       10397    10442      +45     
==========================================
- Hits         7812     7806       -6     
- Misses       2585     2636      +51     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@wsmoses
Copy link
Member Author

wsmoses commented Dec 5, 2023

@sriharikrishna

@michel2323
Copy link
Collaborator

michel2323 commented Dec 15, 2023

@wsmoses Trying to get GPUs in KA going with https://github.com/JuliaGPU/KernelAbstractions.jl/tree/enz_rev_gpu . I use the 'reverse_gpu.jl` script for development. Now, the compiler crashes.

In this PR I did this notable (uninformed) changes:

  • Uncomment your line world=... to use on fspec()
  • Changes key=hash((hash(job), parent_job) to key=hash((job, parent_job)). I'm not sure where tape_cache is used other than here. I guess it should also be used in the reverse.
  • tape_type() didn't return any type. So I return meta.TapeType.

I believe the GPU rules don't actually belong in KA but in the backends like CUDA.jl, but let's ignore that for the moment.

@michel2323
Copy link
Collaborator

The log.
out.log

@michel2323
Copy link
Collaborator

@vchuravy
Copy link
Member

Does this make any sense:

Yeah, you could probably use f and the unwrap the argument types, but identity is as good as any.

@michel2323
Copy link
Collaborator

@wmoses Same error. Here is the updated log against latest main branches.
out2.log

@wsmoses wsmoses force-pushed the tape_type_compilejob branch 4 times, most recently from d4fdc35 to 8beef7a Compare January 24, 2024 22:54
@michel2323 michel2323 force-pushed the tape_type_compilejob branch from a0196f3 to 8083509 Compare January 29, 2024 21:01
@michel2323
Copy link
Collaborator

autodiff_deferred_thunk doesn't go through GPUCompiler with CUDA. I commented back in the annotate(args..., but there seems to be more going on. Now it goes through this function, but the execution of forward fails.

src/Enzyme.jl Outdated Show resolved Hide resolved
src/Enzyme.jl Outdated Show resolved Hide resolved
src/Enzyme.jl Outdated Show resolved Hide resolved
src/Enzyme.jl Outdated

# output

(7.26, 2.2, [3.3])
```
"""
@inline function autodiff_deferred_thunk(::ReverseModeSplit{ReturnPrimal,ReturnShadow,Width,ModifiedBetweenT, RABI}, ::Type{FA}, ::Type{A}, args...) where {FA<:Annotation, A<:Annotation, ReturnPrimal,ReturnShadow,Width,ModifiedBetweenT, RABI<:ABI}
@inline function autodiff_deferred_thunk(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just delete this function? We could provide a helper that calls tape_type internally, but that seems less helpful.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm fine nuking this

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm fine nuking this

Nuked and changed the API. Moving TapeType to second argument.

src/Enzyme.jl Outdated
AugT = Compiler.AugmentedForwardThunk{Ptr{Cvoid}, FA, A2, TT, Val{width}, Val(ReturnPrimal), TapeType}
@assert AugT == typeof(nondef[1])
AdjT = Compiler.AdjointThunk{Ptr{Cvoid}, FA, A2, TT, Val{width}, TapeType}
@assert AdjT == typeof(nondef[2])
AugT(primal_ptr), AdjT(adjoint_ptr)
end

@inline function autodiff_deferred_thunk(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe move TapeType after FA

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

autodiff_deferred_thunk is variadic. As long as we keep both methods I think this could lead to some trouble to distinguish between the two in a call. Or maybe not?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is A in contrast to A2

@michel2323
Copy link
Collaborator

Rebased and cleaned up. There are no tests yet for autodiff_deferred_thunk. Should I work on this before merging?

@wsmoses
Copy link
Member Author

wsmoses commented Feb 22, 2024 via email

@michel2323
Copy link
Collaborator

Added a test. That test with CuArray fails, though, but I guess that's expected as it's not a kernel but uses broadcast.

@michel2323 michel2323 requested a review from vchuravy February 26, 2024 17:59
@michel2323
Copy link
Collaborator

@wsmoses What still needs to be done here?

test/runtests.jl Outdated Show resolved Hide resolved
Copy link
Member Author

@wsmoses wsmoses left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, except the Julia1.6 errors

@vchuravy any thoughts (maybe from the jit changes?)

@wsmoses
Copy link
Member Author

wsmoses commented Feb 29, 2024

@michel2323 bumping this, once 1.6 is fixed this should be good

@michel2323 michel2323 force-pushed the tape_type_compilejob branch from d2c78fd to 4711e88 Compare March 4, 2024 17:28
@michel2323 michel2323 force-pushed the tape_type_compilejob branch from 87079ee to d47478b Compare March 4, 2024 18:26
@michel2323
Copy link
Collaborator

michel2323 commented Mar 4, 2024

@michel2323 bumping this, once 1.6 is fixed this should be good

@wsmoses Was indeed an easy fix. I merged back main. I hope I resolved the conflicts correctly. Might be worth not rebasing.

@michel2323
Copy link
Collaborator

michel2323 commented Mar 4, 2024

I don't know about the this error after the merge. Can you look into this?

Simple Complex tests: Test Failed at /home/runner/work/Enzyme.jl/Enzyme.jl/test/runtests.jl:354
  Expression: ((autodiff(ReverseHolomorphic, Base.inv, Active, Active(3.0 + 4.0im)))[1])[1] ≈ 0.0112 + 0.0384im
   Evaluated: 0.0112 - 0.038400000000000004im ≈ 0.0112 + 0.0384im

Edit: Seems to be an issue on main and the prebuilt Enzyme_jll.

@wsmoses
Copy link
Member Author

wsmoses commented Mar 4, 2024

LGTM. I can't actually approve this since I'm the one who made the PR. @vchuravy can you give it a once over (in particular the orcv1), approve, and let's merge!

@wsmoses
Copy link
Member Author

wsmoses commented Mar 4, 2024

and yeah @michel2323 that's a "we should jll bump soon, but I haven't gotten to, since at visit weekend, and there's some other complex number stuff that should also land"

@michel2323
Copy link
Collaborator

michel2323 commented Mar 4, 2024

LGTM. I can't actually approve this since I'm the one who made the PR. @vchuravy can you give it a once over (in particular the orcv1), approve, and let's merge!

He wrote the orcv1 stuff. That one change I did was for sure only a typo level mistake.

Edit: Actually, he wrote 95% of the PR.

@wsmoses
Copy link
Member Author

wsmoses commented Mar 4, 2024

Okay I'll just force merge it then.

@wsmoses wsmoses merged commit ead0dc5 into main Mar 4, 2024
17 of 44 checks passed
@wsmoses wsmoses deleted the tape_type_compilejob branch March 4, 2024 23:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants