Currently, AD settings in Turing are defined at a global level and (partly) propagated to other packages in this way. This requires us to dispatch on Turing-specific mutable global state and, for instance, makes it impossible to perform parallel sampling (with a custom implementation) with different AD backends concurrently. The problem in #1400 and the work on #1401 got me thinking: could we instead make the AD settings local state of the AD-compatible samplers (similar to, e.g., ODE algorithms in OrdinaryDiffEq)?
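To make the idea concrete, here is a minimal sketch of what AD settings as local sampler state could look like; the type and field names (`ADBackend`, `ForwardDiffAD`, `adbackend`, the `HMC` fields) are hypothetical and purely for illustration:

```julia
# Hypothetical backend types; in practice these could live in a shared
# interface package rather than in Turing itself.
abstract type ADBackend end
struct ForwardDiffAD <: ADBackend end
struct ReverseDiffAD <: ADBackend end

# The sampler carries its AD backend as a field, so samplers with
# different backends can run concurrently without mutating global state.
struct HMC{AD<:ADBackend}
    n_leapfrog::Int
    adbackend::AD
end
HMC(n_leapfrog::Int; adbackend::ADBackend=ForwardDiffAD()) = HMC(n_leapfrog, adbackend)

# Two samplers with different backends, safe to use for parallel chains:
spl1 = HMC(10; adbackend=ForwardDiffAD())
spl2 = HMC(10; adbackend=ReverseDiffAD())
```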
The main problem I see right now is that other packages such as Bijectors or AdvancedVI still use a global state, which would have to be changed as well. Maybe Turing and the other packages could use a common interface, defined in some separate package, for computing gradients etc. with all supported AD packages. Then, for instance, Turing would not have to define `gradient_logp` for every supported AD backend but could just call this interface's method for computing the forward and reverse pass in lines such as Turing.jl/src/core/ad.jl, lines 144 to 147 at e1ab7e0.
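Such a common interface could dispatch on backend types, so every downstream package calls a single generic function. A minimal sketch, assuming a hypothetical `value_and_gradient` function (not an existing package's API):

```julia
using ForwardDiff, DiffResults

# Same hypothetical backend types as in the sketch above.
abstract type ADBackend end
struct ForwardDiffAD <: ADBackend end

# Generic entry point that Turing, Bijectors, AdvancedVI etc. would call
# instead of implementing per-backend gradient code themselves.
function value_and_gradient(::ForwardDiffAD, f, x::AbstractVector)
    result = DiffResults.GradientResult(x)
    ForwardDiff.gradient!(result, f, x)
    return DiffResults.value(result), DiffResults.gradient(result)
end

# Turing's gradient_logp would then reduce to a single call along the
# lines of (hypothetical):
#   value_and_gradient(spl.adbackend, θ -> logdensity(model, θ), θ)
```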