Convinience Constructors #323

JaimeRZP · 2023-06-21T10:48:11Z

Dear All,

I have drafted some convnience constructors for HMCSampler in here.

The way this constructors work is by adding an extra field to HMCSampler called alg, the sampling algorithm.
When a user calls the convinience constructor AHMC.NUTS a HMCSampler instance is created with a NUTS sampling algorithm and nothing kernel, metric, adaptor and integrator. The first step of the sampling then detects that the kernel, metric, adaptor and integrator of the sampler are nothing and builds them following the type of the sampling algorithm and the hyperparameters it hosts.

This has the following advantages:

Casual users can simply call AHMC.NUTS a la Turing wihtout having to deal with specifying their own kernel, integrator, etc within AdvanvedHMC.
Power users can create their own samplers by providing their desired sampling algorithm (or none at all) and their custom kernel, integrator, adaptor, etc.

Talking with @torfjelde we were worried that this approach however introduces yet another layer in how samplers are build.
With the current Turing interface for external samplers, we would have something like this going under the hood:
DynamicPPL.Sampler(Turing.InfereceAlgorithm(AbstractMCMC.AbstractSampler(HMCSampler)))

I would argue that this current solution doesn't really add more complexity to the sampler type since it is all stored inside a field of an already existing structure.

Finally note that in the future we might actually get rid of the Turing.InfereceAlgorithm step if we eventually move all the constructors to the sampling libraries.

Thoughts @yebai, @torfjelde?

The text was updated successfully, but these errors were encountered:

torfjelde · 2023-06-21T13:30:51Z

Effectively what me and @JaimeRZP are disucssing is the following two approaches.

The fully flattened approach, i.e. something like

Base.@kwdef struct HMC <: AbstractMCMC.AbstractSampler
    termination_criterion=:fixed_steps

    # Sampling of the trajectory.
    trajectory_sampling=:endpoint
    integration_time=1.0
    num_integration_steps=16
    # NUTS related
    max_depth=10
    Δ_max=1000.0

    metric=:diagonal

    stepsize=:auto
    integrator=:leapfrog

    # Adaptation.
    adapt=true
    adapt_mass_matrix=true
    adapt_stepsize=true
    num_adapts=0.5
    target_acceptance_rate=0.8
end

NUTS(; num_adapts, δ, max_depth, Δ_max, ϵ) = HMC(; num_adapts, δ, max_depth, Δ_max, ϵ)

and then just construct HMCSampler in the step, and defer to the existing implementation.

And the alternative the Jaime proposes is more like

# Replaces `HMCSampler`
Base.@kwdef struct HMCSampler{I,K,M,A} <: AbstractMCMC.AbstractSampler
    alg::HMCAlgorithm=Custom_alg
    "[`integrator`](@ref)."
    integrator::I=nothing
    "[`AbstractMCMCKernel`](@ref)."
    kernel::K=nothing
    "[`AbstractMetric`](@ref)."
    metric::M=nothing
    "[`AbstractAdaptor`](@ref)."
    adaptor::A=nothing
end

# Example alg
struct NUTS_alg <: AdaptiveHamiltonian
    n_adapts::Int     # number of samples with adaption for ϵ
    δ::Float64        # target accept rate
    max_depth::Int    # maximum tree depth
    Δ_max::Float64    # maximum error
    ϵ::Float64        # (initial) step size
end

where the HMCAlgorithm would then define the different recipes, e.g. NUTS, and completely replace HMCSampler.

I'm personally slightly in favour of approach (1) because I think that

a power-user who wants to construct their own integrator can just use the "full" interface rather than this simple keyword-approach
for "simpler" users I think seeing the algorithm stuff can be confusing (and this is the user group we want to target with this change)

Jaime prefers (2) because

it allows arbitrary customization
can completely replace HMCSampler
closer to current Turing approach

torfjelde · 2023-06-21T13:31:05Z

@yebai @devmotion @sethaxen Thoughts?

devmotion · 2023-06-21T13:52:21Z

When a user calls the convinience constructor AHMC.NUTS a HMCSampler instance is created

I think generally constructors have to be faithful about the types they create - IMO a NUTS function has to return something of type NUTS.

My first thought would be: Maybe we can just use one sampler type (possibly with hierarchically structured fields) but provide different convenience constructors with default metrics, algorithms etc?

torfjelde · 2023-06-21T14:09:36Z

I think generally constructors have to be faithful about the types they create - IMO a NUTS function has to return something of type NUTS.

Fair point and I do agree with this. But then the alternative is to just create separate samplers for each of the different "main" configurations (as is currently done in Turing), but then we share the step somehow.

My first thought would be: Maybe we can just use one sampler type (possibly with hierarchically structured fields) but provide different convenience constructors with default metrics, algorithms etc?

Not a big fan of this because we really want something for users who just want to do NUTS() or HMCDA(), and not have to muck about with HMC(nuts=...) or something.

devmotion · 2023-06-21T15:54:35Z

Not a big fan of this because we really want something for users who just want to do NUTS() or HMCDA(), and not have to muck about with HMC(nuts=...) or something.

But can't we have a single sampler type AND such convenience constructors? I imagine we could maybe make NUTS and HMCDA type aliases of some parametric version of a generic HMC sampler type, and then add convenience constructors for those that don't require to specify anything.

torfjelde · 2023-06-21T15:58:44Z

But can't we have a single sampler type AND such convenience constructors?

Exactly, so in that case I suggest just using different samplers for the different types.

I imagine we could maybe make NUTS and HMCDA type aliases of some parametric version of a generic HMC sampler type, and then add convenience constructors for those that don't require to specify anything.

This is very similar to what @harisorgn wants, right? (Though I'm guessing without the integrator field, etc.) With the type alias, I'm much more of a fan of that approach 👍

devmotion · 2023-06-21T16:11:14Z

This is very similar

Yes, I guess it's quite similar - but conceptually I wouldn't make a difference between sampler and algorithm, for me the sampler is the algorithm (and hence also include other required things such as metric, kernel, integrator, ...). The fields could still be organized in a similar hierarchical way as proposed above by bundling NUTS-specific options in a NUTSOptions struct etc - which would also make it possible to define the type aliases of the sampler/algorithm.

torfjelde · 2023-06-21T17:04:49Z

Makes sense.

One of my criticisms of including the integrator, etc. is that we end up with cases where the combinations might not make sense. E.g. if I do Sampler(NUTSConfig(), termination_criterion=FixedNSteps(...)). In that scenario, I'm clearly no longer doing NUTS, and so what's the point?

That's why I'd rather have something like

struct Sampler{C}
    config::C
end

const NUTS = Sampler{<:NUTSConfig}

and then the integrator, etc. are only constructed in the initialstep and then passed down along with the state, instead of

Base.@kwdef struct Sampler{C,I}
    config::C
    integrator::I=nothing
    # ...
end

const NUTS = Sampler{<:NUTSConfig}

devmotion · 2023-06-21T17:33:17Z

I agree regarding the last point, I'd say anything that's always computed when sample is called rather than specified a-priori seems to belong in the state, not the sampler. But I assumed that some of the fields suggested above are fixed upfront.

If that's not the case for any of the other fields apart from NUTSOptions, then I'd suggest a different design: Extract common options to the sampler struct, and if no such options exist, rather have an abstract supertype with NUTS etc subtypes.

devmotion · 2023-06-21T17:36:04Z

If the main concern are inconsistent or unreasonable combinations of fields/settings, then one could add correctness checks in the constructors. And/Or restrict some additional type parameters in the aliases.

torfjelde · 2023-06-22T10:17:43Z

But I assumed that some of the fields suggested above are fixed upfront.

They could be, yeah. But then we should just have one separate, a la the current HMCSampler, which is for power-users, where everything can be specified beforehand, instead of including this + all the other config options into a single type.

Extract common options to the sampler struct, and if no such options exist, rather have an abstract supertype with NUTS etc subtypes.

This to me seems the way to go to me. It's a) very simple, and b) easy for the user.

If the main concern are inconsistent or unreasonable combinations of fields/settings, then one could add correctness checks in the constructors.

My concern is more that we're just adding complexity that no one will ever take advantage of. If you're an advanced user who is willing to construct their own adaptor or integrator, then adding two more lines to construct the remainding objects is really not a big deal. Hence I don't think there's much use in making this alternative interface, whose main target is the layman user, sufficiently general to also cover the the scenario where the user wants to construct everything by hand; that already exists. So, IMO, this new user-interface should be dead-simple, e.g. different types for the different "classical" HMC types.

torfjelde · 2023-06-26T13:22:55Z

@devmotion Just to clarify, you mean we should do something like

struct HMC{C}
    config::C
end

const NUTS = HMC{<:NUTSConfig}

right?

But, if we don't find any good "shared" parameters, you agree we should just do

abstract type AbstractHMCSampler end

struct NUTS <: AbstractHMCSampler
    # ...
end

?

yebai · 2023-06-26T14:56:06Z

Some related discussions inspired by DynamicHMC interface #313 (comment)

JaimeRZP · 2023-06-28T13:49:12Z

I don't think we actually need:

abstract type AbstractHMCSampler end

struct NUTS <: AbstractHMCSampler
    # ...
end

but just

struct NUTS <: AbstractMCMC.AbstractSampler
    # ...
end

See this PR for the consensus solution we have found!

devmotion · 2023-06-28T14:04:14Z

@devmotion Just to clarify, you mean we should do something like

Sorry, missed your comment 😬 My idea was slightly different - I thought your first example would motivate the second design, but as soon as there are common fields I'd use a single type.

But before commenting any further I'll have a look at the PR 🙂

JaimeRZP · 2023-07-25T08:57:46Z

This is now merged!

JaimeRZP closed this as completed Jul 25, 2023

yebai mentioned this issue Aug 2, 2023

Sensible Defaults #184

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Convinience Constructors #323

Convinience Constructors #323

JaimeRZP commented Jun 21, 2023

torfjelde commented Jun 21, 2023

torfjelde commented Jun 21, 2023

devmotion commented Jun 21, 2023

torfjelde commented Jun 21, 2023

devmotion commented Jun 21, 2023

torfjelde commented Jun 21, 2023

devmotion commented Jun 21, 2023

torfjelde commented Jun 21, 2023

devmotion commented Jun 21, 2023

devmotion commented Jun 21, 2023

torfjelde commented Jun 22, 2023

torfjelde commented Jun 26, 2023

yebai commented Jun 26, 2023 •

edited

Loading

JaimeRZP commented Jun 28, 2023

devmotion commented Jun 28, 2023

JaimeRZP commented Jul 25, 2023

Convinience Constructors #323

Convinience Constructors #323

Comments

JaimeRZP commented Jun 21, 2023

torfjelde commented Jun 21, 2023

torfjelde commented Jun 21, 2023

devmotion commented Jun 21, 2023

torfjelde commented Jun 21, 2023

devmotion commented Jun 21, 2023

torfjelde commented Jun 21, 2023

devmotion commented Jun 21, 2023

torfjelde commented Jun 21, 2023

devmotion commented Jun 21, 2023

devmotion commented Jun 21, 2023

torfjelde commented Jun 22, 2023

torfjelde commented Jun 26, 2023

yebai commented Jun 26, 2023 • edited Loading

JaimeRZP commented Jun 28, 2023

devmotion commented Jun 28, 2023

JaimeRZP commented Jul 25, 2023

yebai commented Jun 26, 2023 •

edited

Loading