Datasets must define a small API to make fitting possible. The picture to have in mind when considering the different domains is as follows: the model is trying to predict the objective. It does so by taking in input domain and maps it to some output domain.
That means make_output_domain and make_objective_domain correspond to the $(X,Y)$ values of the data that the model is trying to fit, whilst the model is evaluated on the make_model_domain, which need not be the same as the output domain.
In other cases, the objective_transformer acts to transform the output of the model onto the output domain.
Mathematically, expressing the output domain $X$, the model domain $D$, the model output $M(D)$ and objective $S$, along with the transformer as $T$, then the relationship between the different domains is
\[\hat{S} = T \times M(D),\]
Both $\hat{S}$ and $S$ are defined over $X$. The various fitting operations try to find model paramters that make $\hat{S}$ and $S$ as close as possible.
Abstract type for use in fitting routines. High level representation of some underlying data structures.
Fitting data is considered to have an objective and a domain. As the domain may be, for example, energy bins (high and low), or fourier frequencies (single value), the purpose of this abstraction is to provide some facility for translating between these representations for the models to fit with. This is done by checking that the AbstractLayout of the model and data are compatible, or at least have compatible translations.
Must implement a minimal set of accessor methods. These are paired with objective and domain parlance. Note that these functions are prefixed with make_* and not get_* to represent that there may be allocations or work going into the translation. Usage of these functions should be sparse in the interest of performance.
The arrays returned by the make_* functions must correspond to the AbstractLayout specified by the caller.
Finally, to make all of the fitting for different statistical regimes work efficiently, datasets should inform which units are preferred to fit. They may also give the error statistics they prefer, and a label name primarily used to disambiguate:
Returns the array used as the output domain. That is, in cases where the model input and output map to different domains, the input domain is said to be the model domain, the input domain is said to be the model domain.
The distinction is mainly used for the purposes of simulating data and for visualising data.
The data layout primarily concerns the relationship between the objective and the domain. It is used to work out whether a model and a dataset are fittable, and if not, whether a translation in the output of the model to the domain of the model is possible.
The following methods may be used to interrogate support:
preferred_support for inferring the preferred support of a model when multiple supports are possible.
common_support to obtain the common support of two structures
The following method is also used to define the support of a model or dataset:
For cases where unit information needs to be propagated, an AbstractLayout can also be used to ensure the units are compatible. To query the units of a layout, use
Used to define whether a given type has support for a specific AbstractLayout. Should return a tuple of the supported layouts. This method should be implemented to express new support, not the query method.