Thanks for re-running the benchmarks, @nsiccha. Per our offline discussions, @yebai, I saw a few additional allocations associated with some code involving small unions when I was sorting out the original performance problem. These plainly aren't as catastrophic as the original problem.
The other thing to say is that you should still expect Enzyme to beat Mooncake in situations involving loops containing cheap operations at each iteration (see #156) -- I suspect this is where the bulk of the performance difference is coming from in this example. I'll do some profiling when I get some time.
Extra allocations (vs. the primal) are Mooncake's most crucial performance limitation. See, e.g., the leftmost models of
https://nsiccha.github.io/StanBlocks.jl/performance.html#visualization
where the primal model has zero allocations, while Mooncake allocates 20.
@willtebbutt Is there any known reason beyond #403?
Ref: nsiccha/StanBlocks.jl#3 (comment)