Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slightly better unit propagation #109

Merged
merged 6 commits into from
May 26, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion docs/src/datasets/datasets.md
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ make_output_domain
## Underlying data layouts

```@docs
AbstractDataLayout
AbstractLayout
OneToOne
ContiguouslyBinned
common_support
Expand Down
12 changes: 12 additions & 0 deletions docs/src/examples/optimizers.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,4 +10,16 @@ DATADIR = length(get(ENV, "CI", "")) > 0 ? @__DIR__() * "/../../ex-datadir" : "/
spec1_path = joinpath(DATADIR, "s54405.pha")
data = OGIPDataset(spec1_path)
normalize!(data)

mask_energies!(data, 1, 15)

# a plotting utility
my_plot(data) = plot(
data,
xscale = :log10,
yscale = :log10,
ylims = (1e-3, 1.3)
)

my_plot(data)
```
2 changes: 1 addition & 1 deletion docs/src/examples/sherpa-example.md
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,7 @@ y_noisy = y .+ (0.2 * randn(length(y)))
scatter(x, y_noisy)
```

To make this into a fittable dataset, we observe that our layout is injective (i.e. `length(x) == length(y)`). This is subtly different from how the majority of spectral models are implemented, which usually assume some kind of binning (`length(x) == length(y) + 1`). Fortunately, SpectralFitting.jl can track this for us, and do various conversion to make the models work correctly for the data. We need only tell the package what our [`AbstractDataLayout`](@ref) is:
To make this into a fittable dataset, we observe that our layout is injective (i.e. `length(x) == length(y)`). This is subtly different from how the majority of spectral models are implemented, which usually assume some kind of binning (`length(x) == length(y) + 1`). Fortunately, SpectralFitting.jl can track this for us, and do various conversion to make the models work correctly for the data. We need only tell the package what our [`AbstractLayout`](@ref) is:

```@example sherpa
data = InjectiveData(x, y_noisy; name = "example")
Expand Down
2 changes: 1 addition & 1 deletion docs/src/models/using-models.md
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ invokemodel(domain, model)
```julia
length(flux) == length(energy) - 1
```
Models need not be defined as such, however. See [`AbstractDataLayout`](@ref) for more.
Models need not be defined as such, however. See [`AbstractLayout`](@ref) for more.

Models can be composed together following the [Model algebra](@ref). That means to expressive a photoelectric absorption component acting on the power law we can write

Expand Down
2 changes: 2 additions & 0 deletions src/SpectralFitting.jl
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,8 @@ abstract type AbstractMission end
struct NoMission <: AbstractMission end

abstract type AbstractStatistic end
struct ChiSquared <: AbstractStatistic end
struct Cash <: AbstractStatistic end

# unitful units
include("units.jl")
Expand Down
2 changes: 1 addition & 1 deletion src/abstract-models.jl
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,7 @@ Model reflection is supported by the following functions. These are intended for
The parametric type parameter `T` is the number type of the model and `K` defines the [`AbstractSpectralModelKind`](@ref).
"""
abstract type AbstractSpectralModel{T,K<:AbstractSpectralModelKind} end
supports(::ContiguouslyBinned, ::Type{<:AbstractSpectralModel}) = true
supports(::Type{<:AbstractSpectralModel}) = (ContiguouslyBinned(),)

"""
numbertype(::AbstractSpectralModel)
Expand Down
6 changes: 3 additions & 3 deletions src/datasets/binning.jl
Original file line number Diff line number Diff line change
Expand Up @@ -40,21 +40,21 @@ function construct_objective_cache(
construct_objective_cache(preferred_support(model), T, model, domain)
end
function construct_objective_cache(
layout::AbstractDataLayout,
layout::AbstractLayout,
model::AbstractSpectralModel,
domain::AbstractVector{T},
) where {T}
construct_objective_cache(layout, T, model, domain)
end
function construct_objective_cache(
layout::AbstractDataLayout,
layout::AbstractLayout,
T::Type,
model::AbstractSpectralModel,
domain,
)
construct_objective_cache(layout, T, length(domain), objective_cache_count(model))
end
function construct_objective_cache(layout::AbstractDataLayout, T::Type, N::Int, size::Int)
function construct_objective_cache(layout::AbstractLayout, T::Type, N::Int, size::Int)
n = if layout isa OneToOne
N
elseif layout isa ContiguouslyBinned
Expand Down
82 changes: 63 additions & 19 deletions src/datasets/datasets.jl
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ Fitting data is considered to have an *objective* and a *domain*. As
the domain may be, for example, energy bins (high and low), or
fourier frequencies (single value), the purpose of this abstraction
is to provide some facility for translating between these representations
for the models to fit with. This is done by checking that the [`AbstractDataLayout`](@ref)
for the models to fit with. This is done by checking that the [`AbstractLayout`](@ref)
of the model and data are compatible, or at least have compatible translations.

Must implement a minimal set of accessor methods. These are paired with
Expand All @@ -17,7 +17,7 @@ Must implement a minimal set of accessor methods. These are paired with
into the translation. Usage of these functions should be sparse in the interest of
performance.

The arrays returned by the `make_*` functions must correspond to the [`AbstractDataLayout`](@ref)
The arrays returned by the `make_*` functions must correspond to the [`AbstractLayout`](@ref)
specified by the caller.

- [`make_objective_variance`](@ref)
Expand All @@ -30,6 +30,15 @@ Additionally there is an objective transformer that transforms the output of the
model onto the `output` domain:

- [`objective_transformer`](@ref)

Finally, to make all of the fitting for different statistical regimes work
efficiently, datasets should inform which units are preferred to fit. They may
also give the error statistics they prefer, and a label name primarily used to
disambiguate:

- [`preferred_units`](@ref)
- [`error_statistic`](@ref)
- [`make_label`](@ref)
"""
abstract type AbstractDataset end

Expand All @@ -46,47 +55,47 @@ end


"""
make_objective(layout::AbstractDataLayout, dataset::AbstractDataset)
make_objective(layout::AbstractLayout, dataset::AbstractDataset)

Returns the array used as the target for model fitting. The array must
correspond to the data [`AbstractDataLayout`](@ref) specified by the `layout`
correspond to the data [`AbstractLayout`](@ref) specified by the `layout`
parameter.

In as far as it can be guarunteed, the memory in the returned array will not be
mutated by any fitting procedures.

Domain for this objective should be returned by [`make_model_domain`](@ref).
"""
make_objective(layout::AbstractDataLayout, dataset::AbstractDataset) =
make_objective(layout::AbstractLayout, dataset::AbstractDataset) =
error("Layout $(layout) is not implemented for $(typeof(dataset))")

"""
make_objective_variance(layout::AbstractDataLayout, dataset::AbstractDataset)
make_objective_variance(layout::AbstractLayout, dataset::AbstractDataset)

Make the variance vector associated with each objective point.
"""
make_objective_variance(layout::AbstractDataLayout, dataset::AbstractDataset) =
make_objective_variance(layout::AbstractLayout, dataset::AbstractDataset) =
error("Layout $(layout) is not implemented for $(typeof(dataset))")

"""
make_model_domain(layout::AbstractDataLayout, dataset::AbstractDataset)
make_model_domain(layout::AbstractLayout, dataset::AbstractDataset)

Returns the array used as the domain for the modelling. This is paired with
[`make_domain_variance`](@ref)
"""
make_model_domain(layout::AbstractDataLayout, dataset::AbstractDataset) =
make_model_domain(layout::AbstractLayout, dataset::AbstractDataset) =
error("Layout $(layout) is not implemented for $(typeof(dataset))")

"""
make_domain_variance(layout::AbstractDataLayout, dataset::AbstractDataset)
make_domain_variance(layout::AbstractLayout, dataset::AbstractDataset)

Make the variance vector associated with the domain.
"""
make_domain_variance(layout::AbstractDataLayout, dataset::AbstractDataset) =
make_domain_variance(layout::AbstractLayout, dataset::AbstractDataset) =
error("Layout $(layout) is not implemented for $(typeof(dataset))")

"""
make_output_domain(layout::AbstractDataLayout, dataset::AbstractDataset)
make_output_domain(layout::AbstractLayout, dataset::AbstractDataset)

Returns the array used as the output domain. That is, in cases where the model
input and output map to different domains, the input domain is said to be the
Expand All @@ -95,11 +104,11 @@ model domain, the input domain is said to be the model domain.
The distinction is mainly used for the purposes of simulating data and for
visualising data.
"""
make_output_domain(layout::AbstractDataLayout, dataset::AbstractDataset) =
make_output_domain(layout::AbstractLayout, dataset::AbstractDataset) =
error("Layout $(layout) is not implemented for $(typeof(dataset))")

"""
objective_transformer(layout::AbstractDataLayout, dataset::AbstractDataset)
objective_transformer(layout::AbstractLayout, dataset::AbstractDataset)

Used to transform the output of the model onto the output domain. For spectral
fitting, this is the method used to do response folding and bin masking.
Expand All @@ -117,7 +126,7 @@ function _DEFAULT_TRANSFORMER()
end
```
"""
function objective_transformer(layout::AbstractDataLayout, dataset::AbstractDataset)
function objective_transformer(layout::AbstractLayout, dataset::AbstractDataset)
@warn "Using default objective transformer!"
_DEFAULT_TRANSFORMER()
end
Expand All @@ -132,23 +141,58 @@ function _DEFAULT_TRANSFORMER()
_transformer!!
end

make_label(d::AbstractDataset) = "$(Base.typename(typeof(d)).name)"
"""
preferred_units(::Type{<:AbstractDataset}, s::AbstractStatistic)
preferred_units(x, s::AbstractStatistic)

Get the preferred units that a given dataset would use to fit the
[`AbstractStatistic`](@ref) in. For example, for [`ChiSquared`](@ref), the units
of the model may be a rate, however for [`Cash`](@ref) the preferred units might
be counts.

Returning `nothing` from this function implies there is no unit preference.

If undefined for a derived type, returns `nothing`.

See also [`support_units`](@ref).
"""
preferred_units(::T, s::AbstractStatistic) where {T<:AbstractDataset} =
preferred_units(T, s)
preferred_units(::Type{<:AbstractDataset}, s::AbstractStatistic) = nothing

"""
error_statistic(::AbstractDataset)

Should return an [`ErrorStatistics`](@ref) describing which error statistic this
data uses.

If undefined for a derived type, returns `ErrorStatistics.Unknown`.
"""
error_statistic(::AbstractDataset) = ErrorStatistics.Unknown

"""
make_label(d::AbstractDataset)

Return a string that gives a descriptive label for this dataset.
"""
make_label(d::AbstractDataset) = "$(Base.typename(typeof(d)).name)"

"""
Must support the same API, but may also have some query methods for specific internals.
"""
abstract type AbstractMultiDataset <: AbstractDataset end

export AbstractDataset,
make_model_domain,
make_output_domain,
error_statistic,
make_domain_variance,
make_label,
make_model_domain,
make_objective,
make_objective_variance,
make_output_domain,
normalize!,
objective_transformer
objective_transformer,
preferred_units

include("spectrum.jl")
include("response.jl")
Expand Down
12 changes: 4 additions & 8 deletions src/datasets/injectivedata.jl
Original file line number Diff line number Diff line change
Expand Up @@ -19,8 +19,7 @@ function InjectiveData(
InjectiveData(domain, codomain, domain_variance, codomain_variance, name, data_mask)
end

supports(::ContiguouslyBinned, ::Type{<:InjectiveData}) = true
supports(::OneToOne, ::Type{<:InjectiveData}) = true
supports(::Type{<:InjectiveData}) = (ContiguouslyBinned(), OneToOne())

function make_model_domain(::ContiguouslyBinned, dataset::InjectiveData)
# need to expand the domain by one
Expand All @@ -35,13 +34,10 @@ make_objective(::ContiguouslyBinned, dataset::InjectiveData) =
make_model_domain(::OneToOne, dataset::InjectiveData) = dataset.domain
make_objective(::OneToOne, dataset::InjectiveData) = dataset.codomain[dataset.data_mask]

make_output_domain(layout::AbstractDataLayout, dataset::InjectiveData) =
make_output_domain(layout::AbstractLayout, dataset::InjectiveData) =
make_model_domain(layout, dataset)

function make_objective_variance(
::AbstractDataLayout,
dataset::InjectiveData{V},
)::V where {V}
function make_objective_variance(::AbstractLayout, dataset::InjectiveData{V})::V where {V}
if !isnothing(dataset.codomain_variance)
dataset.codomain_variance[dataset.data_mask]
else
Expand All @@ -50,7 +46,7 @@ function make_objective_variance(
end
end

function objective_transformer(::AbstractDataLayout, dataset::InjectiveData)
function objective_transformer(::AbstractLayout, dataset::InjectiveData)
function _transformer!!(domain, objective)
@views objective[dataset.data_mask]
end
Expand Down
Loading
Loading