
Don't preallocate GradientConfig in ForwardDiff backend by default #8

Merged
21 commits merged into master from tp/fix-threaded-use on Mar 8, 2023

Conversation

@tpapp (Owner) commented Feb 13, 2023

Fixes #6.

Changes:

  • for the ForwardDiff backend, the default is now thread-safe. To preallocate a buffer, provide a type (this is a breaking API change, but I don't think anyone was using this feature anyway).
  • suggest copy to make copies of AD'd densities. This is only necessary for ForwardDiff, but it allows a consistent, AD-agnostic interface for callers in threaded code (see the sketch after this list).
  • documentation changes
  • various incidental changes
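
A minimal sketch of the intended threaded usage pattern (not code from this PR; ℓ stands for any object implementing the LogDensityProblems interface): each task takes its own copy of the AD'd density, so preallocated ForwardDiff buffers, when present, are never shared across threads.

using LogDensityProblems, LogDensityProblemsAD, ForwardDiff

∇ℓ = ADgradient(:ForwardDiff, ℓ)        # no buffer preallocated by default, hence thread-safe
dim = LogDensityProblems.dimension(ℓ)

gradients = Vector{Vector{Float64}}(undef, Threads.nthreads())
Threads.@threads for i in 1:Threads.nthreads()
    ∇ℓ_local = copy(∇ℓ)                 # per-task copy; cheap when nothing is preallocated
    _, g = LogDensityProblems.logdensity_and_gradient(∇ℓ_local, randn(dim))
    gradients[i] = g
end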

@devmotion (Collaborator)

> this is a breaking API change, but I don't think anyone was using this feature anyway.

It's used, e.g., in Turing: https://github.com/TuringLang/Turing.jl/blob/99a1d23333d852fdb1485457444181a94c2a52a0/src/essential/ad.jl#L107 Being able to provide a GradientConfig is (currently) crucial for using custom tags.
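
A hedged illustration (not taken from this PR or from Turing) of why custom tags require access to the config: ForwardDiff encodes the tag in its Dual type and, by default, checks it against Tag(f, eltype(x)), so the GradientConfig constructor is the place to supply a non-default tag. MyTag, f, and ℓ below are hypothetical names.

using ForwardDiff, LogDensityProblems

struct MyTag end                                  # hypothetical custom tag type

f = Base.Fix1(LogDensityProblems.logdensity, ℓ)   # ℓ: an assumed log density object
x = zeros(LogDensityProblems.dimension(ℓ))
tag = ForwardDiff.Tag(MyTag(), eltype(x))
cfg = ForwardDiff.GradientConfig(f, x, ForwardDiff.Chunk(x), tag)

# Val{false}() skips ForwardDiff's tag consistency check, which would otherwise
# reject a tag that does not match Tag(f, eltype(x)).
ForwardDiff.gradient(f, x, cfg, Val{false}())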

ext/ForwardDiffExt.jl: 4 inline review comments (outdated, resolved)
@tpapp (Owner, Author) commented Feb 13, 2023

@devmotion: thanks for the thorough review and for correcting all my scatterbrained mistakes (I am at home with a sick child and am less than fully focused on work).

I reverted the last suggestion related to tags, since simply exposing the GradientConfig interface is much simpler and non-breaking. Supporting a combination of tags, types (needed for tags), and whether to pre-allocate the GradientConfig would complicate the API a lot.

Please review and let me know if this would work with the way Turing.jl is using the package.

@sethaxen, since you reported #6 please comment about whether this API is convenient for you; preallocation is still possible but not the default, and copy is needed for each thread.

@tpapp tpapp requested a review from devmotion February 13, 2023 11:24
@sethaxen

> @sethaxen, since you reported #6 please comment about whether this API is convenient for you; preallocation is still possible but not the default, and copy is needed for each thread.

Yes, this will work for me. Thanks!

@devmotion (Collaborator)

Hmm, I have to think a bit about whether this would work for Turing. At first glance, it seems that passing a fixed gradient config would make it easy to run into these threading issues with Turing, and I'm not sure we can prevent that at all: users just declare/use @threads in their models, and we don't have much control over it in Turing. There are some things in DynamicPPL that make log density accumulation somewhat thread-safe (it's not very nice and certainly contains bugs), so maybe we could do something there. But it's at least not completely obvious to me; I have to check that more carefully.

@devmotion (Collaborator)

To me it still seems that this would not work well (i.e., be unsafe by default) when you want to use custom tags, as e.g. in Turing.

As the GradientConfig constructor only takes two optional arguments (chunk and tag) in addition to the function to be differentiated and its input (https://github.com/JuliaDiff/ForwardDiff.jl/blob/4b143a199f541e7c1248dc5bc29b397be938e81d/src/config.jl#L118-L120), I think the easiest solution would be to support storing these two options (tag and chunk size):

  • Remove x::Union{Nothing,AbstractVector}
  • chunk::Union{Integer,ForwardDiff.Chunk} = _default_chunksize(ell): ForwardDiff chunk size (unchanged but always saved as ForwardDiff.Chunk)
  • tag::Union{Nothing,ForwardDiff.Tag} = nothing: ForwardDiff tag (new)
  • gradientconfig::Union{Nothing,ForwardDiff.GradientConfig} = nothing: gradient config (new default of nothing)

The logic would be:

  • If a gradientconfig !== nothing is supplied, the other two options are ignored and it is re-used in every logdensity_and_gradient call
  • Otherwise, in the logdensity_and_gradient call a new GradientConfig is constructed:
    function logdensity_and_gradient(fℓ::ForwardDiffLogDensity, x::AbstractVector{<:Real})
        @unpack ℓ, chunk, tag, gradient_config = fℓ
        buffer = _diffresults_buffer(x)
        f = Base.Fix1(logdensity, ℓ)
        config = if gradient_config === nothing
            if tag === nothing
                ForwardDiff.GradientConfig(f, x, chunk)
            else
                ForwardDiff.GradientConfig(f, x, chunk, tag)
            end
        else
            gradient_config
        end
        result = ForwardDiff.gradient!(buffer, f, x, config)
        return _diffresults_extract(result)
    end

The main difference from the PR would be that there is no default gradient config assembled from the other options. If a user wants to cache the GradientConfig, they have to specify it explicitly.
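
For concreteness, a hedged sketch of explicit caching under this proposal (not code from the PR; ℓ is an assumed log density, and the gradientconfig keyword follows the option named above):

using ForwardDiff, LogDensityProblems, LogDensityProblemsAD

f = Base.Fix1(LogDensityProblems.logdensity, ℓ)   # ℓ: an assumed log density object
x = zeros(LogDensityProblems.dimension(ℓ))
cfg = ForwardDiff.GradientConfig(f, x, ForwardDiff.Chunk{4}())

# The cached config is re-used in every logdensity_and_gradient call,
# so the resulting object must not be shared across threads.
∇ℓ = ADgradient(Val(:ForwardDiff), ℓ; gradientconfig = cfg)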

@tpapp (Owner, Author) commented Mar 7, 2023

@devmotion: I am going to fix this along the lines you suggested, but I opened JuliaDiff/ForwardDiff.jl#63 for copying GradientConfigs; it would be cleaner that way.

@tpapp (Owner, Author) commented Mar 8, 2023

@devmotion: I added an option for a custom tag (and realized that we can just copy a GradientConfig, so no PR to ForwardDiff is needed). With the latest version of this PR, Turing.jl could do something like the following:

function LogDensityProblemsAD.ADgradient(ad::ForwardDiffAD, ℓ::Turing.LogDensityFunction)
    θ = DynamicPPL.getparams(ℓ)
    f = Base.Fix1(LogDensityProblems.logdensity, ℓ)

    # Define configuration for ForwardDiff.
    tag = if standardtag(ad)
        ForwardDiff.Tag(Turing.TuringTag(), eltype(θ))
    else
        ForwardDiff.Tag(f, eltype(θ))
    end
    chunk_size = getchunksize(ad)

    return LogDensityProblemsAD.ADgradient(Val(:ForwardDiff), ℓ; tag = tag,
                                           chunk = chunk_size)
end

Please let me know if you are OK with this. The advantage is that the whole mechanism for creating a closure etc. does not need to be exposed.

@devmotion (Collaborator) left a review

Looks good to me apart from some minor comments! 👍

ext/LogDensityProblemsADForwardDiffExt.jl: 5 inline review comments (outdated, resolved)
@tpapp tpapp merged commit bc8c543 into master Mar 8, 2023
@tpapp tpapp deleted the tp/fix-threaded-use branch March 8, 2023 13:29

Successfully merging this pull request may close these issues.

AD with ForwardMode is not thread-safe
3 participants