Merge pull request #104 from fjebaker/fergus/model-api

Rework model API and reflection
fjebaker · May 21, 2024 · 70eea68 · 70eea68
2 parents ca72a88 + 022d33f
commit 70eea68
Show file tree

Hide file tree

Showing 48 changed files with 1,564 additions and 1,351 deletions.
diff --git a/docs/make.jl b/docs/make.jl
@@ -1,5 +1,7 @@
 push!(LOAD_PATH, "src")
 
+ENV["GKSwstype"] = "100"
+
 using Documenter
 using SpectralFitting
 
@@ -13,21 +15,23 @@ makedocs(
     pages = [
         "Home" => "index.md",
         "Walkthrough" => "walkthrough.md",
-        "Why & How" => "why-and-how.md",
         "Examples" => [
             "Diverse examples" => "examples/examples.md",
             "A quick guide" => "examples/sherpa-example.md",
         ],
-        # "Transitioning from XSPEC" => "transitioning-from-xspec.md",
         "Models" => [
-            "Using models" => "using-models.md",
-            "Model index" => "models.md",
-            "Composite models" => "composite-models.md",
-            "Surrogate models" => "surrogate-models.md",
+            "Using models" => "models/using-models.md",
+            "Model index" => "models/models.md",
+            "Composite models" => "models/composite-models.md",
+            "Surrogate models" => "models/surrogate-models.md",
         ],
         # "Parameters" => "parameters.md",
-        # "Datasets" => "datasets.md",
+        "Datasets" => [
+            "Using datasets" => "datasets/datasets.md",
+            "Mission support" => "datasets/mission-support.md",
+        ],
         # "Fitting" => "fitting.md",
+        "Why & How" => "why-and-how.md",
         "Reference" => "reference.md",
     ],
 )

diff --git a/docs/src/datasets.md b/docs/src/datasets.md
diff --git a/docs/src/datasets/datasets.md b/docs/src/datasets/datasets.md
@@ -0,0 +1,54 @@
+# Datasets
+
+SpectralFitting.jl supports a wide variety of datasets, and makes it easy to wrap your own.
+
+For spectral fitting specifics, the main dataset type is
+
+```@docs
+SpectralData
+```
+
+This is comprised out of
+
+```@docs
+Spectrum
+ResponseMatrix
+AncillaryResponse
+```
+
+
+## Dataset abstraction
+
+Datasets must define a small API to make fitting possible. The picture to have in mind when considering the different domains is as follows: the model is trying to predict the objective. It does so by taking in input domain and maps it to some output domain.
+
+That means [`make_output_domain`](@ref) and [`make_objective_domain`](@ref) correspond to the $(X,Y)$ values of the data that the model is trying to fit, whilst the model is evaluated on the [`make_model_domain`](@ref), which need not be the same as the output domain.
+
+In other cases, the [`objective_transformer`](@ref) acts to transform the output of the model onto the output domain. 
+
+Mathematically, expressing the output domain $X$, the model domain $D$, the model output $M(D)$ and objective $S$, along with the transformer as $T$, then the relationship between the different domains is
+
+```math
+\hat{S} = T \times M(D),
+```
+
+Both $\hat{S}$ and $S$ are defined over $X$. The various fitting operations try to find model paramters that make $\hat{S}$ and $S$ as close as possible.
+
+```@docs
+AbstractDataset
+make_objective_variance
+make_objective
+make_domain_variance
+make_model_domain
+make_output_domain
+```
+
+## Underlying data layouts
+
+```@docs
+AbstractDataLayout
+OneToOne
+ContiguouslyBinned
+common_support
+preferred_support
+supports
+```
diff --git a/docs/src/datasets/mission-support.md b/docs/src/datasets/mission-support.md
diff --git a/docs/src/examples/optimizers.md b/docs/src/examples/optimizers.md
@@ -0,0 +1,13 @@
+# Optimizer galore
+
+Let's fit a spectrum:
+
+```@example optimizers
+using SpectralFitting
+
+DATADIR = "..."
+DATADIR = length(get(ENV, "CI", "")) > 0 ? @__DIR__() * "/../../ex-datadir" : "/home/lilith/Developer/jl/datasets/xspec/walkthrough" # hide
+spec1_path = joinpath(DATADIR, "s54405.pha")
+data = OGIPDataset(spec1_path) 
+normalize!(data)
+```
diff --git a/docs/src/examples/xrism-simulation.jl b/docs/src/examples/xrism-simulation.jl
@@ -0,0 +1,96 @@
+using Revise
+using SpectralFitting, Plots
+using Profile
+
+DATADIR = "/home/lilith/Developer/jl/datasets/xrism/"
+paths = SpectralFitting.SpectralDataPaths(;
+    response = joinpath(DATADIR, "rsl_Hp_5eV.rmf"),
+    ancillary = joinpath(DATADIR, "rsl_pointsource_GVclosed.arf"),
+)
+
+DATADIR = "/home/lilith/Developer/jl/datasets/xspec/walkthrough/"
+paths = SpectralFitting.SpectralDataPaths(; response = joinpath(DATADIR, "s54405.rsp"))
+
+_, resp, _, anc = @time SpectralFitting._read_all_ogip(paths)
+
+begin
+    model = XS_Gaussian() + XS_Gaussian(E = FitParam(10.0))
+    erange = 10 .^ collect(range(log10(4), log10(20.0), 100))
+    model2 = GaussianLine() + GaussianLine(μ = FitParam(10.0))
+    plot(
+        erange[1:end-1],
+        @time(invokemodel(erange, model)),
+        marker = :o,
+        markersize = 4,
+        label = "XS",
+    )
+    plot!(
+        erange[1:end-1],
+        @time(invokemodel(erange, model2)),
+        marker = :o,
+        markersize = 4,
+        label = "jl",
+    )
+    vline!([6.4], label = false)
+end
+
+faked = @time OGIPDataset(joinpath(DATADIR, "s54405.fak.fak"))
+model = XS_PowerLaw(a = FitParam(2.0), K = FitParam(1.0))
+
+emids = push!(copy(resp.channel_bins_low), last(resp.channel_bins_high))
+evald = push!(copy(resp.bins_low), last(resp.bins_high))
+out = resp.matrix * invokemodel(evald, model)
+plot(emids[1:end-1], out)
+
+
+sims = @time simulate(model, resp, anc; exposure_time = 1e3, seed = 42)
+plot(sims, xscale = :log10, yscale = :log10, xlims = (1.0, 30.0), ylims = (1e-6, 100.0))
+
+begin
+    sims = @time simulate(model, resp, nothing; exposure_time = 1e5, seed = 42)
+    normalize!(faked)
+    plot(
+        faked,
+        xscale = :log10,
+        # yscale = :log10,
+        xlims = (1.0, 100.0),
+        # ylims = (1e-6, 1.0)
+    )
+    plot!(sims)
+    plot!(
+        emids[1:end-1] .+ diff(emids) ./ 2,
+        out .* 1.4,
+        xscale = :log10,
+        label = "Hand folded",
+    )
+end
+
+prob = FittingProblem(model => sims)
+result = @time fit(prob, LevenbergMarquadt())
+
+plot!(faked, result)
+
+
+sims = @time simulate(model, resp, anc; exposure_time = 1e3, seed = 42)
+plot(
+    sims,
+    xscale = :log10,
+    # yscale = :log10,
+    xlims = (1.0, 10.0),
+    ylims = (1e-6, 40.0),
+)
+
+model = GaussianLine(μ = FitParam(7.0)) + PowerLaw(a = FitParam(0.2))
+
+sims = @time simulate(model, resp, nothing; exposure_time = 1e5, seed = 42)
+
+plot(
+    sims,
+    xscale = :log10,
+    # yscale = :log10,
+    xlims = (1.0, 10.0),
+)
+
+model = GaussianLine()
+
+result = @time fit(FittingProblem(model => sims), LevenbergMarquadt(); max_iter = 10)
diff --git a/docs/src/index.md b/docs/src/index.md
@@ -6,14 +6,21 @@ SpectralFitting.jl is a package for defining and using spectral models, with a n
 
 SpectralFitting is designed to be extended, such that new models are simple to create, and new dataset processing pipelines for different missions are brief to define. Where performance is key, SpectralFitting helps you define fast and AD-compatible surrogates of spectral models using [Surrogates.jl](https://github.com/SciML/Surrogates.jl), and embed them in the model composition algebra.
 
-To get started, install LibXSPEC_jll and SpectralFitting:
+To get started, add the AstroRegistry from the University of Bristol and then install:
+
 ```julia
-using Pkg
-Pkg.add(url = "https://github.com/astro-group-bristol/LibXSPEC_jll.jl")
-Pkg.add(url = "https://github.com/fjebaker/SpectralFitting.jl")
+julia>]
+pkg> registry add https://github.com/astro-group-bristol/AstroRegistry
+pkg> add SpectralFitting
+```
+
+Then use
 
+```julia
 using SpectralFitting
 # ....
 ```
 
+to get started. See [Walkthrough](@ref) for an example walkthrough the package.
+
 For more University of Bristol Astrophysics Group codes, see [our GitHub organisation](https://github.com/astro-group-bristol).
diff --git a/docs/src/composite-models.md → docs/src/models/composite-models.md b/docs/src/composite-models.md → docs/src/models/composite-models.md
diff --git a/docs/src/models.md → docs/src/models/models.md b/docs/src/models.md → docs/src/models/models.md
@@ -1,4 +1,4 @@
-# Models index
+# Model index
 
 Models wrapped from XSPEC implementations are prefixed with `XS_*`, whereas pure-Julia models are simply named, e.g. [`XS_PowerLaw`](@ref) in XSPEC vs [`PowerLaw`](@ref) in Julia.
 

diff --git a/docs/src/surrogate-models.md → docs/src/models/surrogate-models.md b/docs/src/surrogate-models.md → docs/src/models/surrogate-models.md
@@ -142,7 +142,7 @@ nothing # hide
 Now that we have the surrogate model, we use [`SurrogateSpectralModel`](@ref) to wrap it into an [`AbstractSpectralModel`](@ref). The constructor also needs to know the model kind, have a copy of the model parameters, and know which symbols to represent the parameters with.
 
 ```@example surrogate_example
-sm = @code_warntype make_model(harness)
+sm = make_model(harness)
 ```
 
 We can now use the familiar API and attempt to benchmark the performance:

diff --git a/docs/src/models/using-models.md b/docs/src/models/using-models.md
@@ -0,0 +1,140 @@
+# Using spectral models
+
+```@setup using_models
+using SpectralFitting
+using Plots
+```
+
+In this page you'll find how to use the spectral model library and how to define your own models. Using the model library is as easy as invoking or composing models from the [Model Index](@ref). For example:
+
+```@example using_models
+model = PowerLaw()
+```
+
+In the output of the REPL we see the model name, and it's two parameters, with information about those parameters, such as the current value, the associated error (10% by defaul), the minimum and maximum values, and whether the parameter is frozen or not.
+
+!!! note
+
+    See [`FitParam`](@ref) for full details about fitable parameters.
+
+The parameters can be tweaked by accessing the fields
+
+```@example using_models
+model.K.value = 2.0
+model.K.frozen = true
+model
+```
+
+We can invoke the model on a domain in the following way
+
+```@example using_models
+domain = collect(range(0.1, 10.0, 100))
+invokemodel(domain, model)
+```
+
+!!! note
+    By default, models are implemented to accept a single input vector with all of the low and high bin edges, and return a flux array with the flux in each energy bin. As such, it is here the case that:
+    ```julia
+    length(flux) == length(energy) - 1
+    ```
+    Models need not be defined as such, however. See [`AbstractDataLayout`](@ref) for more.
+
+Models can be composed together following the [Model algebra](@ref). That means to expressive a photoelectric absorption component acting on the power law we can write
+
+```@example using_models
+model2 = PhotoelectricAbsorption() * model
+```
+
+The parameters of this [`CompositeModel`](@ref) are are copied from the expression. This means we can modify the `K_1` parameter in `model2` without having to worry that we are changing `model.K`:
+
+```@example using_models
+model2.K_1.frozen = false
+model2
+```
+
+```@example using_models
+model
+```
+
+Composite models have the same methods as single models. This means we can invoke a model in the same way
+
+```@example using_models
+invokemodel(domain, model2)
+```
+
+## Defining new models
+
+To define your own model, you need to tell the package what the model parameters are and how to invoke the model. This is all done by creating a `struct` which subtypes [`AbstractSpectralModel`](@ref).
+
+Let's create a new [`Additive`](@ref) spectral model:
+
+```@example using_models
+Base.@kwdef struct MyModel{T} <: AbstractSpectralModel{T,Additive}
+    K::T = FitParam(2.0)
+    p::T = FitParam(3.0)
+end
+
+# implementing a dummy add operation this function can do anything it likes, but
+# must write the output into `output` and ideally should be thread safe
+function SpectralFitting.invoke!(output, input, model::MyModel)
+    SpectralFitting.finite_diff_kernel!(output, input) do E
+        E + model.p
+    end
+end
+```
+
+Here we used the utility method [`SpectralFitting.finite_diff_kernel!`](@ref) to ensure the additive model is appropriately scaled across the bin width.
+
+Note that [`Additive`](@ref) models do not need to use the normalization parameter `K` themselves. This is because when we use [`invokemodel`](@ref) these sorts of translations are automatically applied, for compatability with external models.
+
+Our model is now ready to use
+```@example using_models
+model = MyModel()
+```
+
+```@example using_models
+domain = collect(range(0.1, 10.0, 100))
+invokemodel(domain, model)
+```
+
+## Model abstraction
+
+All spectral models are a sub-type of [`AbstractSpectralModel`](@ref).
+
+```@docs
+AbstractSpectralModel
+SpectralFitting.invoke!
+modelkind
+numbertype
+implementation
+```
+
+## Model methods
+
+```@docs
+invokemodel
+invokemodel!
+```
+
+## Model algebra
+
+Models exist as three different kinds, defined by an [`AbstractSpectralModelKind`](@ref) trait.
+```@docs
+AbstractSpectralModelKind
+Additive
+Multiplicative
+Convolutional
+```
+## Model data availability
+
+Many of the XSPEC implemented models use tabular data, such as FITS, and return results interpolated from these pre-calculated tables. In some cases, these table models have data files that are multiple gigabytes in size, and would be very unwieldy to ship indiscriminantly. SpectralFitting attempts to circumnavigate this bloat by downloading the model data on an _ut opus_ basis.
+
+```@docs
+SpectralFitting.download_model_data
+SpectralFitting.download_all_model_data
+```
+
+Special care must be taken if new XSPEC models are wrapped to ensure the data is available. For more on this, see [Wrapping new XSPEC models](@ref).
+
+Model data may also alternatively be copied in _by-hand_ from a HEASoft XSPEC source directory. In this case, the location to copy the data to may be determined via `joinpath(SpectralFitting.LibXSPEC_jll.artifact_dir, "spectral", "modelData")`.
+