Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use right caller for isapplicable usage #2151

Closed
wants to merge 8 commits into from

Conversation

vchuravy
Copy link
Member

@vchuravy vchuravy commented Dec 1, 2024

Originally I was thinking something like:

        # 1. Check if function is inactive
        @show fargs
        @show argtypes
        inactive_arginfo = ArgInfo(nothing, pushfirst!(copy(argtypes), Core.Const(EnzymeRules.inactive)))
        inactive_atype = Tuple{typeof(EnzymeRules.inactive), atype.parameters...} 
        @show inactive_arginfo
        @show inactive_atype
        inactive_meta  = @invoke Core.Compiler.abstract_call_gf_by_type(
            interp::AbstractInterpreter,
            EnzymeRules.inactive::Any,
            inactive_arginfo::ArgInfo,
            si::StmtInfo,
            inactive_atype::Any,
            sv::AbsIntState,
            max_methods::Int,
        )

        if Core.Compiler.nmatches(inactive_meta.info) == 0

E.g. doing an abstract call against the rules function and checking how many matches we would get,
this ought to do the edge handling for us. But this change is more minimal.

Copy link
Contributor

github-actions bot commented Dec 1, 2024

Benchmark Results

main 736ddc8... main/736ddc86c1c494...
basics/overhead 4.64 ± 0.011 ns 4.33 ± 0.01 ns 1.07
time_to_load 1.25 ± 0.026 s 1.18 ± 0.016 s 1.06

Benchmark Plots

A plot of the benchmark results have been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

@vchuravy
Copy link
Member Author

vchuravy commented Dec 1, 2024

Doesn't help as much as I hoped. So maybe I do have to try the other variant.

src/compiler/interpreter.jl Outdated Show resolved Hide resolved
src/compiler/interpreter.jl Outdated Show resolved Hide resolved
src/compiler/interpreter.jl Outdated Show resolved Hide resolved
lib/EnzymeCore/src/rules.jl Outdated Show resolved Hide resolved
@wsmoses
Copy link
Member

wsmoses commented Dec 1, 2024

x/ref #2152

ft, tt = EnzymeRules._annotate_tt(atype)
rule_atype = Tuple{typeof(EnzymeRules.forward), <:FwdConfig, <:Annotation{ft}, Type{<:Annotation}, tt...}
rule_argtypes = Any[Core.Const(EnzymeRules.forward), FwdConfig, Annotation{ft}, Type{<:Annotation}, tt...]
else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technically we'll need to change this to not be exclusive (e.g. you can have both forward_rules and reverse_rules set)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cache is split on forward vs reverse. So we only need to look at it for each.

@wsmoses
Copy link
Member

wsmoses commented Dec 1, 2024

profile.pb.gz

profile data

@wsmoses
Copy link
Member

wsmoses commented Dec 1, 2024

okay changing the inner isactive part to the native interpreter brings "indirect function call return type analysis" back to 7s from 54s

@@ -233,7 +233,7 @@ function Core.Compiler.abstract_call_gf_by_type(
sv::AbsIntState,
max_methods::Int,
)
if Core.Compiler.nmatches(inactive_meta.info) != 0
if inactive_meta.info isa Core.Compiler.MethodMatchInfo && Core.Compiler.nmatches(inactive_meta.info) != 0
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What other things have you seen here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NoCallInfo

@vchuravy
Copy link
Member Author

vchuravy commented Dec 1, 2024

okay changing the inner isactive part to the native interpreter brings "indirect function call return type analysis" back to 7s from 54s

I am not sure I follow?

@wsmoses
Copy link
Member

wsmoses commented Dec 1, 2024

Before this PR the "indirect function call return type analysis" test ran in ~7 seconds. With this PR it runs in ~54 seconds. Changing

      inactive_meta  = @invoke Core.Compiler.abstract_call_gf_by_type(
            interp::AbstractInterpreter,
            EnzymeRules.inactive::Any,
            inactive_arginfo::ArgInfo,
            si::StmtInfo,
            inactive_atype::Any,
            sv::AbsIntState,

to

      ni = NativeInterpreter(interp.world)
      inactive_meta  = @invoke Core.Compiler.abstract_call_gf_by_type(
            ni::AbstractInterpreter,
            EnzymeRules.inactive::Any,
            inactive_arginfo::ArgInfo,
            si::StmtInfo,
            inactive_atype::Any,
            sv::AbsIntState,
            max_methods::Int,
        )

reduces it back to ~7s

@wsmoses
Copy link
Member

wsmoses commented Dec 1, 2024

However regardless independent of the native interpreter stuff (not pushed) this PR breaks invalidations, per CI:

has_frule     |   10     10  0.0s
autodiff(Forward, ...) custom rules: Test Failed at /home/runner/work/Enzyme.jl/Enzyme.jl/test/rules.jl:62
  Expression: (autodiff(Forward, (x->(#= /home/runner/work/Enzyme.jl/Enzyme.jl/test/rules.jl:62 =#
                f(x) ^ 2)), Duplicated(2.0, 1.0)))[1] ≈ 832.0
   Evaluated: 32.0 ≈ 832.0

Stacktrace:
 [1] macro expansion
   @ /opt/hostedtoolcache/julia/1.10.7/x64/share/julia/stdlib/v1.10/Test/src/Test.jl:672 [inlined]
 [2] macro expansion
   @ ~/work/Enzyme.jl/Enzyme.jl/test/rules.jl:62 [inlined]
 [3] macro expansion
   @ /opt/hostedtoolcache/julia/1.10.7/x64/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
 [4] top-level scope
   @ ~/work/Enzyme.jl/Enzyme.jl/test/rules.jl:61
autodiff(Forward, ...) custom rules: Test Failed at /home/runner/work/Enzyme.jl/Enzyme.jl/test/rules.jl:70
  Expression: res[1] ≈ 80032.0
   Evaluated: 32.0 ≈ 80032.0

Stacktrace:
 [1] macro expansion
   @ /opt/hostedtoolcache/julia/1.10.7/x64/share/julia/stdlib/v1.10/Test/src/Test.jl:672 [inlined]
 [2] macro expansion
   @ ~/work/Enzyme.jl/Enzyme.jl/test/rules.jl:70 [inlined]
 [3] macro expansion
   @ /opt/hostedtoolcache/julia/1.10.7/x64/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
 [4] top-level scope
   @ ~/work/Enzyme.jl/Enzyme.jl/test/rules.jl:61
autodiff(Forward, ...) custom rules: Test Failed at /home/runner/work/Enzyme.jl/Enzyme.jl/test/rules.jl:71
  Expression: res[2] ≈ 80096.0
   Evaluated: 96.0 ≈ 80096.0

Stacktrace:
 [1] macro expansion
   @ /opt/hostedtoolcache/julia/1.10.7/x64/share/julia/stdlib/v1.10/Test/src/Test.jl:672 [inlined]
 [2] macro expansion
   @ ~/work/Enzyme.jl/Enzyme.jl/test/rules.jl:71 [inlined]
 [3] macro expansion
   @ /opt/hostedtoolcache/julia/1.10.7/x64/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
 [4] top-level scope
   @ ~/work/Enzyme.jl/Enzyme.jl/test/rules.jl:61
Test Summary:                       | Pass  Fail  Total  Time
autodiff(Forward, ...) custom rules |    3     3      6  2.5s
ERROR: LoadError: Some tests did not pass: 3 passed, 3 failed, 0 

I also think inference time is not fixed yet =/

@vchuravy vchuravy force-pushed the vc/fixup_isapplicable_use branch from b23b076 to 400e81c Compare December 2, 2024 07:59
@vchuravy vchuravy closed this Dec 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants