diff --git a/README.md b/README.md index c4aa9c6..f343b16 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ A light-weight interface for developers wanting to integrate machine learning models into -[MLJ](https://github.com/alan-turing-institute/MLJ.jl). +[MLJ](https://github.com/JuliaAI/MLJ.jl). | Linux | Coverage | @@ -12,8 +12,8 @@ machine learning models into [![Stable](https://img.shields.io/badge/docs-stable-blue.svg)](https://juliaai.github.io/MLJModelInterface.jl/dev/) -[MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) is a framework for evaluating, +[MLJ](https://JuliaAI.github.io/MLJ.jl/dev/) is a framework for evaluating, combining and optimizing machine learning models in Julia. A third party package wanting to integrate their machine learning models into MLJ must import the module `MLJModelInterface` defined in this package, as described in the -[documentation](https://juliaai.github.io/MLJModelInterface.jl/dev/). +[documentation](https://JuliaAI.github.io/MLJModelInterface.jl/dev/). diff --git a/docs/src/document_strings.md b/docs/src/document_strings.md index 69d9a7f..134b1aa 100644 --- a/docs/src/document_strings.md +++ b/docs/src/document_strings.md @@ -29,23 +29,39 @@ Your document string must include the following components, in order: implementation. Generally, defer details on the role of hyperparameters to the "Hyperparameters" section (see below). -- Instructions on *how to import the model type* from MLJ (because a user can already inspect the doc-string in the Model Registry, without having loaded the code-providing package). +- Instructions on *how to import the model type* from MLJ (because a user can + already inspect the doc-string in the Model Registry, without having loaded + the code-providing package). - Instructions on *how to instantiate* with default hyperparameters or with keywords. -- A *Training data* section: explains how to bind a model to data in a machine with all possible signatures (eg, `machine(model, X, y)` but also `machine(model, X, y, w)` if, say, weights are supported); the role and scitype requirements for each data argument should be itemized. +- A *Training data* section: explains how to bind a model to data in a machine + with all possible signatures (eg, `machine(model, X, y)` but also + `machine(model, X, y, w)` if, say, weights are supported); the role and + scitype requirements for each data argument should be itemized. - Instructions on *how to fit* the machine (in the same section). - A *Hyperparameters* section (unless there aren't any): an itemized list of the parameters, with defaults given. -- An *Operations* section: each implemented operation (`predict`, `predict_mode`, `transform`, `inverse_transform`, etc ) is itemized and explained. This should include operations with no data arguments, such as `training_losses` and `feature_importances`. +- An *Operations* section: each implemented operation (`predict`, + `predict_mode`, `transform`, `inverse_transform`, etc ) is itemized and + explained. This should include operations with no data arguments, such as + `training_losses` and `feature_importances`. -- A *Fitted parameters* section: To explain what is returned by `fitted_params(mach)` (the same as `MLJModelInterface.fitted_params(model, fitresult)` - see later) with the fields of that named tuple itemized. 
+- A *Fitted parameters* section: To explain what is returned by `fitted_params(mach)` + (the same as `MLJModelInterface.fitted_params(model, fitresult)` - see later) + with the fields of that named tuple itemized. -- A *Report* section (if `report` is non-empty): To explain what, if anything, is included in the `report(mach)` (the same as the `report` return value of `MLJModelInterface.fit`) with the fields itemized. +- A *Report* section (if `report` is non-empty): To explain what, if anything, + is included in the `report(mach)` (the same as the `report` return value of + `MLJModelInterface.fit`) with the fields itemized. -- An optional but highly recommended *Examples* section, which includes MLJ examples, but which could also include others if the model type also implements a second "local" interface, i.e., defined in the same module. (Note that each module referring to a type can declare separate doc-strings which appear concatenated in doc-string queries.) +- An optional but highly recommended *Examples* section, which includes MLJ + examples, but which could also include others if the model type also + implements a second "local" interface, i.e., defined in the same module. (Note + that each module referring to a type can declare separate doc-strings which + appear concatenated in doc-string queries.) - A closing *"See also"* sentence which includes a `@ref` link to the raw model type (if you are wrapping one). diff --git a/docs/src/implementing_a_data_front_end.md b/docs/src/implementing_a_data_front_end.md index 6dfe1d9..9ca214c 100644 --- a/docs/src/implementing_a_data_front_end.md +++ b/docs/src/implementing_a_data_front_end.md @@ -84,10 +84,12 @@ Suppose a supervised model type `SomeSupervised` supports sample weights, leading to two different `fit` signatures, and that it has a single operation `predict`: - fit(model::SomeSupervised, verbosity, X, y) - fit(model::SomeSupervised, verbosity, X, y, w) +```julia +fit(model::SomeSupervised, verbosity, X, y) +fit(model::SomeSupervised, verbosity, X, y, w) - predict(model::SomeSupervised, fitresult, Xnew) +predict(model::SomeSupervised, fitresult, Xnew) +``` Without a data front-end implemented, suppose `X` is expected to be a table and `y` a vector, but suppose the core algorithm always converts @@ -95,19 +97,21 @@ table and `y` a vector, but suppose the core algorithm always converts a column in the table). 
Then a new data-front end might look like this: - constant MMI = MLJModelInterface - - # for fit: - MMI.reformat(::SomeSupervised, X, y) = (MMI.matrix(X)', y) - MMI.reformat(::SomeSupervised, X, y, w) = (MMI.matrix(X)', y, w) - MMI.selectrows(::SomeSupervised, I, Xmatrix, y) = - (view(Xmatrix, :, I), view(y, I)) - MMI.selectrows(::SomeSupervised, I, Xmatrix, y, w) = - (view(Xmatrix, :, I), view(y, I), view(w, I)) - - # for predict: - MMI.reformat(::SomeSupervised, X) = (MMI.matrix(X)',) - MMI.selectrows(::SomeSupervised, I, Xmatrix) = (view(Xmatrix, :, I),) +```julia +const MMI = MLJModelInterface + +# for fit: +MMI.reformat(::SomeSupervised, X, y) = (MMI.matrix(X)', y) +MMI.reformat(::SomeSupervised, X, y, w) = (MMI.matrix(X)', y, w) +MMI.selectrows(::SomeSupervised, I, Xmatrix, y) = + (view(Xmatrix, :, I), view(y, I)) +MMI.selectrows(::SomeSupervised, I, Xmatrix, y, w) = + (view(Xmatrix, :, I), view(y, I), view(w, I)) + +# for predict: +MMI.reformat(::SomeSupervised, X) = (MMI.matrix(X)',) +MMI.selectrows(::SomeSupervised, I, Xmatrix) = (view(Xmatrix, :, I),) +``` With these additions, `fit` and `predict` are refactored, so that `X` and `Xnew` represent matrices with features as rows. diff --git a/docs/src/index.md b/docs/src/index.md index 941f1ed..07b319f 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -1,7 +1,7 @@ # Adding Models for General Use The machine learning tools provided by -[MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) can be applied to the models in +[MLJ](https://JuliaAI.github.io/MLJ.jl/dev/) can be applied to the models in any package that imports [MLJModelInterface](https://github.com/JuliaAI/MLJModelInterface.jl) and implements the API defined there, as outlined in this document. @@ -15,7 +15,7 @@ or by a stand-alone "interface-only" package, using the template [MLJExampleInterface.jl](https://github.com/JuliaAI/MLJExampleInterface.jl) (see [Where to place code implementing new models](@ref) below). For a list of packages implementing the MLJ model API (natively, and in interface packages) see -[here](https://alan-turing-institute.github.io/MLJ.jl/dev/list_of_supported_models/). +[here](https://JuliaAI.github.io/MLJ.jl/dev/list_of_supported_models/). ## Important @@ -31,7 +31,7 @@ project's [extras] and [targets]. In testing, simply use `MLJBase` in place of `MLJModelInterface`. It is assumed the reader has read the [Getting -Started](https://alan-turing-institute.github.io/MLJ.jl/dev/getting_started/) section of +Started](https://JuliaAI.github.io/MLJ.jl/dev/getting_started/) section of the MLJ manual. To implement the API described here, some familiarity with the following packages is also helpful: @@ -52,5 +52,5 @@ packages is also helpful: In MLJ, the basic interface exposed to the user, built atop the model interface described here, is the *machine interface*. After a first reading of this document, the reader may wish to refer to [MLJ -Internals](https://alan-turing-institute.github.io/MLJ.jl/dev/internals/) for context. +Internals](https://JuliaAI.github.io/MLJ.jl/dev/internals/) for context. diff --git a/docs/src/iterative_models.md b/docs/src/iterative_models.md index 8600aba..590b2eb 100644 --- a/docs/src/iterative_models.md +++ b/docs/src/iterative_models.md @@ -18,11 +18,11 @@ If an MLJ `Machine` is being `fit!` and it is not the first time, then `update` instead of `fit`, unless the machine `fit!` has been called with a new `rows` keyword argument.
However, `MLJModelInterface` defines a fallback for `update` which just calls `fit`. For context, see the -[Internals](https://alan-turing-institute.github.io/MLJ.jl/dev/internals/) section of the +[Internals](https://JuliaAI.github.io/MLJ.jl/dev/internals/) section of the MLJ manual. Learning networks wrapped as models constitute one use case (see the [Composing -Models](https://alan-turing-institute.github.io/MLJ.jl/dev/composing_models/) section of +Models](https://JuliaAI.github.io/MLJ.jl/dev/composing_models/) section of the MLJ manual): one would like each component model to be retrained only when hyperparameter changes "upstream" make this necessary. In this case, MLJ provides a fallback (specifically, the fallback is for any subtype of `SupervisedNetwork = diff --git a/docs/src/quick_start_guide.md b/docs/src/quick_start_guide.md index e2990f4..1258759 100644 --- a/docs/src/quick_start_guide.md +++ b/docs/src/quick_start_guide.md @@ -18,11 +18,11 @@ understanding of how things work with MLJ. In particular, you are familiar with - [CategoricalArrays.jl](https://github.com/JuliaData/CategoricalArrays.jl), if working with finite discrete data, e.g., doing classification; see also the [Working with Categorical - Data](https://alan-turing-institute.github.io/MLJ.jl/dev/working_with_categorical_data/) + Data](https://JuliaAI.github.io/MLJ.jl/dev/working_with_categorical_data/) section of the MLJ manual. If you're not familiar with any one of these points, the [Getting -Started](https://alan-turing-institute.github.io/MLJ.jl/dev/getting_started/) section of +Started](https://JuliaAI.github.io/MLJ.jl/dev/getting_started/) section of the MLJ manual may help. *But tables don't make sense for my model!* If a case can be made that @@ -99,8 +99,7 @@ Further to the last point, `a::Float64 = 0.5::(_ > 0)` indicates that the field `a` is a `Float64`, takes `0.5` as its default value, and expects its value to be positive. -Please see [this -issue](https://github.com/JuliaAI/MLJBase.jl/issues/68) +Please see [this issue](https://github.com/JuliaAI/MLJBase.jl/issues/68) for a known issue and workaround relating to the use of `@mlj_model` with negative defaults. @@ -201,7 +200,7 @@ For a classifier, the steps are fairly similar to a regressor with these differe 1. `y` will be a categorical vector and you will typically want to use the integer encoding of `y` instead of `CategoricalValue`s; use `MLJModelInterface.int` for this. -1. You will need to pass the full pool of target labels (not just +2. You will need to pass the full pool of target labels (not just those observed in the training data) and additionally, in the `Deterministic` case, the encoding, to make these available to `predict`. A simple way to do this is to pass `y[1]` in the @@ -210,19 +209,19 @@ For a classifier, the steps are fairly similar to a regressor with these differe method for recovering categorical elements from their integer representations (e.g., `d(2)` is the categorical element with `2` as encoding). -2. In the case of a *probabilistic* classifier you should pass all +3. In the case of a *probabilistic* classifier you should pass all probabilities simultaneously to the [`UnivariateFinite`](@ref) constructor to get an abstract `UnivariateFinite` vector (type `UnivariateFiniteArray`) rather than use comprehension or broadcasting to get a vanilla vector. This is for performance reasons. - + If implementing a classifier, you should probably consult the more detailed instructions at [The predict method](@ref). 
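To make the three classifier points above concrete, here is a minimal sketch. It is illustrative only and not part of this changeset: `SomeClassifier`, `SomePackage.fit` and `SomePackage.predict_probs` are hypothetical names, and the sketch assumes the core algorithm returns one column of probabilities per class, in the order given by `classes`.

```julia
import MLJModelInterface
const MMI = MLJModelInterface

mutable struct SomeClassifier <: MMI.Probabilistic end  # hypothetical model type

function MMI.fit(model::SomeClassifier, verbosity, X, y)
    yint = MMI.int(y)                     # integer codes for the target (point 1)
    a_target_element = y[1]               # carries the full pool of labels (point 2)
    core_fitresult = SomePackage.fit(MMI.matrix(X), yint)  # hypothetical core call
    fitresult = (a_target_element, core_fitresult)
    return fitresult, nothing, NamedTuple()
end

function MMI.predict(model::SomeClassifier, fitresult, Xnew)
    a_target_element, core_fitresult = fitresult
    classes = MMI.classes(a_target_element)    # all levels, not just those seen in training
    probs = SomePackage.predict_probs(core_fitresult, MMI.matrix(Xnew))  # hypothetical (n, c) matrix
    return MMI.UnivariateFinite(classes, probs)  # all probabilities passed at once (point 3)
end
```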
**Examples**: -- GLM's [BinaryClassifier](https://github.com/JuliaAI/MLJModels.jl/blob/3687491b132be8493b6f7a322aedf66008caaab1/src/GLM.jl#L119-L131) (`Probabilistic`) +- GLM's [BinaryClassifier](https://github.com/JuliaAI/MLJModels.jl/blob/3687491b132be8493b6f7a322aedf66008caaab1/src/GLM.jl#L119-L131) (`Probabilistic`) - LIBSVM's [SVC](https://github.com/JuliaAI/MLJModels.jl/blob/master/src/LIBSVM.jl) (`Deterministic`) @@ -273,7 +272,7 @@ implementation creates: affect the outcome of training. It is okay to add "control" parameters (such as specifying an `acceleration` parameter specifying computational resources, as - [here](https://github.com/alan-turing-institute/MLJ.jl/blob/master/src/ensembles.jl#L193)). + [here](https://github.com/JuliaAI/MLJ.jl/blob/master/src/ensembles.jl#L193)). - Use `report` to return *everything else*, including model-specific *methods* (or other callable objects). This includes feature rankings, decision boundaries, SVM support vectors, clustering centres, @@ -349,8 +348,8 @@ MLJModelInterface.metadata_model(YourModel1, output_scitype = MLJModelInterface.Table(MLJModelInterface.Continuous), # for an unsupervised, what output? supports_weights = false, # does the model support sample weights? descr = "A short description of your model" - load_path = "YourPackage.SubModuleContainingModelStructDefinition.YourModel1" - ) + load_path = "YourPackage.SubModuleContainingModelStructDefinition.YourModel1" +) ``` *Important.* Do not omit the `load_path` specification. Without a diff --git a/docs/src/serialization.md b/docs/src/serialization.md index a5f1ddb..fb448a1 100644 --- a/docs/src/serialization.md +++ b/docs/src/serialization.md @@ -10,7 +10,7 @@ implemented in languages other than Julia. The MLJ user can serialize and deserialize machines, as she would any other julia object. (This user has the option of first removing data from the machine. See the [Saving -machines](https://alan-turing-institute.github.io/MLJ.jl/dev/machines/#Saving-machines) +machines](https://JuliaAI.github.io/MLJ.jl/dev/machines/#Saving-machines) section of the MLJ manual for details.) However, a problem can occur if a model's `fitresult` (see [The fit method](@ref)) is not a persistent object. For example, it might be a C pointer that would have no meaning in a new Julia session. diff --git a/docs/src/static_models.md b/docs/src/static_models.md index 0d57ce4..e1a074e 100644 --- a/docs/src/static_models.md +++ b/docs/src/static_models.md @@ -2,7 +2,7 @@ A model type subtypes `Static <: Unsupervised` if it does not generalize to new data but nevertheless has hyperparameters. See the [Static -transformers](https://alan-turing-institute.github.io/MLJ.jl/dev/transformers/#Static-transformers) +transformers](https://JuliaAI.github.io/MLJ.jl/dev/transformers/#Static-transformers) section of the MLJ manual for examples. 
In the `Static` case, `transform` can have multiple arguments and `input_scitype` refers to the allowed scitype of the slurped data, *even if there is only a single argument.* For example, if the signature is diff --git a/docs/src/summary_of_methods.md b/docs/src/summary_of_methods.md index 05f710e..e942130 100644 --- a/docs/src/summary_of_methods.md +++ b/docs/src/summary_of_methods.md @@ -43,11 +43,11 @@ Optional, if `SomeSupervisedModel <: Probabilistic`: ```julia MMI.predict_mode(model::SomeSupervisedModel, fitresult, Xnew) = - mode.(predict(model, fitresult, Xnew)) + mode.(predict(model, fitresult, Xnew)) MMI.predict_mean(model::SomeSupervisedModel, fitresult, Xnew) = - mean.(predict(model, fitresult, Xnew)) + mean.(predict(model, fitresult, Xnew)) MMI.predict_median(model::SomeSupervisedModel, fitresult, Xnew) = - median.(predict(model, fitresult, Xnew)) + median.(predict(model, fitresult, Xnew)) ``` Required, if the model is to be registered (findable by general users): diff --git a/docs/src/supervised_models.md b/docs/src/supervised_models.md index dbb4d42..e61ce65 100644 --- a/docs/src/supervised_models.md +++ b/docs/src/supervised_models.md @@ -19,15 +19,15 @@ The following sections were written with `Supervised` models in mind, but also c material relevant to general models: - [Summary of methods](@ref) -- [The form of data for fitting and predicting](@ref) +- [The form of data for fitting and predicting](@ref) - [The fit method](@ref) - [The fitted_params method](@ref) -- [The predict method](@ref) -- [The predict_joint method](@ref) -- [Training losses](@ref) -- [Feature importances](@ref) -- [Trait declarations](@ref) -- [Iterative models and the update! method](@ref) -- [Implementing a data front end](@ref) -- [Supervised models with a transform method](@ref) +- [The predict method](@ref) +- [The predict_joint method](@ref) +- [Training losses](@ref) +- [Feature importances](@ref) +- [Trait declarations](@ref) +- [Iterative models and the update! method](@ref) +- [Implementing a data front end](@ref) +- [Supervised models with a transform method](@ref) - [Models that learn a probability distribution](@ref) diff --git a/docs/src/the_fit_method.md b/docs/src/the_fit_method.md index 920ccc3..7dd587b 100644 --- a/docs/src/the_fit_method.md +++ b/docs/src/the_fit_method.md @@ -7,21 +7,21 @@ MMI.fit(model::SomeSupervisedModel, verbosity, X, y) -> fitresult, cache, report ``` 1. `fitresult` is the fitresult in the sense above (which becomes an - argument for `predict` discussed below). + argument for `predict` discussed below). 2. `report` is a (possibly empty) `NamedTuple`, for example, - `report=(deviance=..., dof_residual=..., stderror=..., vcov=...)`. - Any training-related statistics, such as internal estimates of the - generalization error, and feature rankings, should be returned in - the `report` tuple. How, or if, these are generated should be - controlled by hyperparameters (the fields of `model`). Fitted - parameters, such as the coefficients of a linear model, do not go - in the report as they will be extractable from `fitresult` (and - accessible to MLJ through the `fitted_params` method described below). - -3. The value of `cache` can be `nothing`, unless one is also defining - an `update` method (see below). The Julia type of `cache` is not - presently restricted. + `report=(deviance=..., dof_residual=..., stderror=..., vcov=...)`. 
+ Any training-related statistics, such as internal estimates of the + generalization error, and feature rankings, should be returned in + the `report` tuple. How, or if, these are generated should be + controlled by hyperparameters (the fields of `model`). Fitted + parameters, such as the coefficients of a linear model, do not go + in the report as they will be extractable from `fitresult` (and + accessible to MLJ through the `fitted_params` method described below). + +3. The value of `cache` can be `nothing`, unless one is also defining + an `update` method (see below). The Julia type of `cache` is not + presently restricted. !!! note diff --git a/docs/src/the_predict_method.md b/docs/src/the_predict_method.md index 84fb49b..15c29bf 100644 --- a/docs/src/the_predict_method.md +++ b/docs/src/the_predict_method.md @@ -6,8 +6,7 @@ A compulsory `predict` method has the form MMI.predict(model::SomeSupervisedModel, fitresult, Xnew) -> yhat ``` -Here `Xnew` will have the same form as the `X` passed to -`fit`. +Here `Xnew` will have the same form as the `X` passed to `fit`. Note that while `Xnew` generally consists of multiple observations (e.g., has multiple rows in the case of a table) it is assumed, in view of @@ -44,16 +43,16 @@ may look something like this: ```julia function MMI.fit(model::SomeSupervisedModel, verbosity, X, y) - yint = MMI.int(y) - a_target_element = y[1] # a CategoricalValue/String - decode = MMI.decoder(a_target_element) # can be called on integers + yint = MMI.int(y) + a_target_element = y[1] # a CategoricalValue/String + decode = MMI.decoder(a_target_element) # can be called on integers - core_fitresult = SomePackage.fit(X, yint, verbosity=verbosity) + core_fitresult = SomePackage.fit(X, yint, verbosity=verbosity) - fitresult = (decode, core_fitresult) - cache = nothing - report = nothing - return fitresult, cache, report + fitresult = (decode, core_fitresult) + cache = nothing + report = nothing + return fitresult, cache, report end ``` @@ -61,9 +60,9 @@ while a corresponding deterministic `predict` operation might look like this: ```julia function MMI.predict(model::SomeSupervisedModel, fitresult, Xnew) - decode, core_fitresult = fitresult - yhat = SomePackage.predict(core_fitresult, Xnew) - return decode.(yhat) + decode, core_fitresult = fitresult + yhat = SomePackage.predict(core_fitresult, Xnew) + return decode.(yhat) end ``` @@ -155,8 +154,8 @@ yhat = MLJModelInterface.UnivariateFinite([:FALSE, :TRUE], probs, augment=true, ``` The constructor has a lot of options, including passing a dictionary -instead of vectors. See -`CategoricalDistributions.UnivariateFinite`](@ref) for details. +instead of vectors. See [`CategoricalDistributions.UnivariateFinite`](@ref) +for details. 
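As a small, hedged illustration of the `augment` option described above (assuming `MLJBase`, or another package supplying the full implementation behind `UnivariateFinite`, is loaded in the session):

```julia
using MLJBase

# Probabilities for classes :yes and :maybe only, one row per observation.
# With `augment=true` the :no column is inferred so that each row sums to one.
probs = [0.2 0.3;
         0.1 0.3]

yhat = UnivariateFinite([:no, :yes, :maybe], probs, augment=true, pool=missing)

pdf.(yhat, :no)   # expected: [0.5, 0.6]
```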
See [LinearBinaryClassifier](https://github.com/JuliaAI/MLJModels.jl/blob/master/src/GLM.jl) diff --git a/docs/src/trait_declarations.md b/docs/src/trait_declarations.md index 2fd3614..73d6649 100644 --- a/docs/src/trait_declarations.md +++ b/docs/src/trait_declarations.md @@ -27,8 +27,7 @@ MMI.input_scitype(::Type{<:DecisionTreeClassifier}) = Table(Continuous) ``` If, instead, columns were allowed to have either: (i) a mixture of `Continuous` and `Missing` -values, or (ii) `Count` (i.e., integer) values, then the -declaration would be +values, or (ii) `Count` (i.e., integer) values, then the declaration would be ```julia MMI.input_scitype(::Type{<:DecisionTreeClassifier}) = Table(Union{Continuous,Missing},Count) @@ -128,7 +127,7 @@ MMI.metadata_model( *Important.* Do not omit the `load_path` specification. If unsure what it should be, post an issue at -[MLJ](https://github.com/alan-turing-institute/MLJ.jl/issues). +[MLJ](https://github.com/JuliaAI/MLJ.jl/issues). ```@docs MMI.metadata_pkg diff --git a/docs/src/type_declarations.md b/docs/src/type_declarations.md index d9e1651..85c0687 100644 --- a/docs/src/type_declarations.md +++ b/docs/src/type_declarations.md @@ -8,32 +8,32 @@ import MLJModelInterface const MMI = MLJModelInterface mutable struct RidgeRegressor <: MMI.Deterministic - lambda::Float64 + lambda::Float64 end ``` -Models (which are mutable) should not be given internal -constructors. It is recommended that they be given an external lazy -keyword constructor of the same name. This constructor defines default values -for every field, and optionally corrects invalid field values by calling a -`clean!` method (whose fallback returns an empty message string): +Models (which are mutable) should not be given internal constructors. +It is recommended that they be given an external lazy keyword constructor +of the same name. This constructor defines default values for every field, +and optionally corrects invalid field values by calling a `clean!` +method (whose fallback returns an empty message string): ```julia function MMI.clean!(model::RidgeRegressor) - warning = "" - if model.lambda < 0 - warning *= "Need lambda ≥ 0. Resetting lambda=0. " - model.lambda = 0 - end - return warning + warning = "" + if model.lambda < 0 + warning *= "Need lambda ≥ 0. Resetting lambda=0. " + model.lambda = 0 + end + return warning end # keyword constructor function RidgeRegressor(; lambda=0.0) - model = RidgeRegressor(lambda) - message = MMI.clean!(model) - isempty(message) || @warn message - return model + model = RidgeRegressor(lambda) + message = MMI.clean!(model) + isempty(message) || @warn message + return model end ``` @@ -56,7 +56,7 @@ of `nothing`. ### Hyperparameters for parallelization options The section [Acceleration and -Parallelism](https://alan-turing-institute.github.io/MLJ.jl/dev/acceleration_and_parallelism/) +Parallelism](https://JuliaAI.github.io/MLJ.jl/dev/acceleration_and_parallelism/) of the MLJ manual indicates how users specify an option to run an algorithm using distributed processing or multithreading. A hyperparameter specifying such an option should be called `acceleration`. Its value `a` should satisfy `a isa AbstractResource` @@ -66,7 +66,7 @@ run on a GPU is ordinarily indicated with the `CUDALibs()` resource. 
### hyperparameter access and mutation To support hyperparameter optimization (see the [Tuning -Models](https://alan-turing-institute.github.io/MLJ.jl/dev/tuning_models/) section of the +Models](https://JuliaAI.github.io/MLJ.jl/dev/tuning_models/) section of the MLJ manual) any hyperparameter to be individually controlled must be: - property-accessible; nested property access allowed, as in @@ -96,8 +96,8 @@ following example: ```julia @mlj_model mutable struct YourModel <: MMI.Deterministic - a::Float64 = 0.5::(_ > 0) - b::String = "svd"::(_ in ("svd","qr")) + a::Float64 = 0.5::(_ > 0) + b::String = "svd"::(_ in ("svd","qr")) end ``` @@ -115,7 +115,7 @@ expects its value to be positive. You cannot use the `@mlj_model` macro if your model struct has type parameters. -#### Known issue with @mlj_macro +#### Known issue with `@mlj_model` Defaults with negative values can trip up the `@mlj_model` macro (see [this issue](https://github.com/JuliaAI/MLJBase.jl/issues/68)). So, @@ -123,7 +123,7 @@ for example, this does not work: ```julia @mlj_model mutable struct Bar - a::Int = -1::(_ > -2) + a::Int = -1::(_ > -2) end ``` @@ -131,6 +131,6 @@ But this does: ```julia @mlj_model mutable struct Bar - a::Int = (-)(1)::(_ > -2) + a::Int = (-)(1)::(_ > -2) end ``` diff --git a/docs/src/unsupervised_models.md b/docs/src/unsupervised_models.md index 09d8426..cb0dbbf 100644 --- a/docs/src/unsupervised_models.md +++ b/docs/src/unsupervised_models.md @@ -17,7 +17,7 @@ similar fashion. The main differences are: Static`, in which case there is no restriction. A use-case for `predict` is K-means clustering that `predict`s labels and `transform`s input features into a space of lower dimension. See the [Transformers that also - predict](https://alan-turing-institute.github.io/MLJ.jl/dev/transformers/#Transformers-that-also-predict) + predict](https://JuliaAI.github.io/MLJ.jl/dev/transformers/#Transformers-that-also-predict) section of the MLJ manual for an example. - The `target_scitype` refers to the output of `predict`, if implemented. A new trait, @@ -31,9 +31,9 @@ similar fashion. The main differences are: is the same as `transform`, as in `MLJModelInterface.inverse_transform(model, fitresult, Xout)`, which: - must make sense for any `Xout` for which `scitype(Xout) <: - output_scitype(SomeSupervisedModel)` (see below); and + output_scitype(SomeSupervisedModel)` (see below); and - must return an object `Xin` satisfying `scitype(Xin) <: - input_scitype(SomeSupervisedModel)`. + input_scitype(SomeSupervisedModel)`. For sample implementations, see MLJ's [built-in transformers](https://github.com/JuliaAI/MLJModels.jl/blob/dev/src/builtins/Transformers.jl) diff --git a/docs/src/where_to_put_code.md b/docs/src/where_to_put_code.md index ed3f231..8697abb 100644 --- a/docs/src/where_to_put_code.md +++ b/docs/src/where_to_put_code.md @@ -12,7 +12,7 @@ to all MLJ users: algorithms implementing the interface. An example is [`EvoTrees.jl`](https://github.com/Evovest/EvoTrees.jl/blob/master/src/MLJ.jl). In this case, it is sufficient to open an issue at - [MLJ](https://github.com/alan-turing-institute/MLJ.jl) requesting + [MLJ](https://github.com/JuliaAI/MLJ.jl) requesting the package to be registered with MLJ. Registering a package allows the MLJ user to access its models' metadata and to selectively load them.
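Relating to the unsupervised model contract documented in `unsupervised_models.md` above, the following minimal sketch shows the shape of the `fit`/`transform` pair. It is not part of this changeset: `SimpleScaler` and its min-max logic are invented for illustration, and a real implementation would also declare traits (such as `input_scitype`) and typically return a table rather than a matrix.

```julia
import MLJModelInterface
const MMI = MLJModelInterface

mutable struct SimpleScaler <: MMI.Unsupervised end  # hypothetical transformer

function MMI.fit(::SimpleScaler, verbosity, X)
    Xmatrix = MMI.matrix(X)  # at run time MLJBase (or similar) supplies the implementation
    fitresult = (mins = minimum(Xmatrix, dims=1), maxs = maximum(Xmatrix, dims=1))
    return fitresult, nothing, NamedTuple()
end

MMI.transform(::SimpleScaler, fitresult, Xnew) =
    (MMI.matrix(Xnew) .- fitresult.mins) ./ (fitresult.maxs .- fitresult.mins)
```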
diff --git a/src/data_utils.jl b/src/data_utils.jl index 270f344..5cb4104 100644 --- a/src/data_utils.jl +++ b/src/data_utils.jl @@ -10,7 +10,7 @@ function errlight(s) ) end -## Internal function to be extended in MLJBase (so do not export) +## Internal function to be extended in MLJBase (so do not export) vtrait(X, s="") = vtrait(get_interface_mode(), X, s) vtrait(::LightInterface, X, s) = errlight(s) @@ -33,7 +33,7 @@ If `X isa AbstractMatrix`, return `X` or `permutedims(X)` if `transpose=true`. Otherwise if `X` is a Tables.jl compatible table source, convert `X` into a `Matrix`. """ -function matrix(X; kw...) +function matrix(X; kw...) m = get_interface_mode() return matrix(m, vtrait(m, X, "matrix"), X; kw...) end @@ -58,7 +58,7 @@ end # int """ - int(x) + int(x) The positional integer of the `CategoricalString` or `CategoricalValue` `x`, in the ordering defined by the pool of `x`. The type of `int(x)` is the reference @@ -73,7 +73,7 @@ of `x`, but has the same type. Broadcasted versions of `int`. -```julia +```julia-repl julia> v = categorical(["c", "b", "c", "a"]) 4-element CategoricalArrays.CategoricalArray{String,1,UInt32}: "c" @@ -117,8 +117,8 @@ classes(x)` is always true. Not to be confused with `levels(x.pool)`. See the example below. -```julia -julia> v = categorical(["c", "b", "c", "a"]) +```julia-repl +julia> v = categorical(["c", "b", "c", "a"]) 4-element CategoricalArrays.CategoricalArray{String,1,UInt32}: "c" "b" @@ -161,7 +161,7 @@ The scientific type (interpretation) of `X`, distinct from its machine type. ### Examples -```julia +```julia-repl julia> scitype(3.14) Continuous @@ -174,7 +174,7 @@ Tuple{Count, Textual} julia> using CategoricalArrays julia> X = (gender = categorical(['M', 'M', 'F', 'M', 'F']), - ndevices = [1, 3, 2, 3, 2]); + ndevices = [1, 3, 2, 3, 2]); julia> scitype(X) Table{Union{AbstractVector{Count}, AbstractVector{Multiclass{2}}}} @@ -182,9 +182,7 @@ Table{Union{AbstractVector{Count}, AbstractVector{Multiclass{2}}}} """ scitype(X) = scitype(get_interface_mode(), vtrait(X, "scitype"), X) -function scitype(::LightInterface, m, X) - return errlight("scitype") -end +scitype(::LightInterface, m, X) = errlight("scitype") # ------------------------------------------------------------------------ # schema @@ -197,9 +195,7 @@ returns `nothing` if the column types and scitypes can't be inspected. """ schema(X) = schema(get_interface_mode(), vtrait(X, "schema"), X) -function schema(::LightInterface, m, X) - return errlight("schema") -end +schema(::LightInterface, m, X) = errlight("schema") # ------------------------------------------------------------------------ # istable @@ -232,7 +228,7 @@ Return a callable object for decoding the integer representation of a broadcast over all elements. ### Examples -```julia +```julia-repl julia> v = categorical(["c", "b", "c", "a"]) 4-element CategoricalArrays.CategoricalArray{String,1,UInt32}: "c" @@ -439,153 +435,152 @@ _squeeze(v) = first(v) # ------------------------------------------------------------------------ # UnivariateFinite -const UNIVARIATE_FINITE_DOCSTRING = - """ - UnivariateFinite( - support, - probs; - pool=nothing, - augmented=false, - ordered=false - ) +""" + UnivariateFinite( + support, + probs; + pool=nothing, + augmented=false, + ordered=false + ) + +Construct a discrete univariate distribution whose finite support is +the elements of the vector `support`, and whose corresponding +probabilities are elements of the vector `probs`. 
Alternatively, +construct an abstract *array* of `UnivariateFinite` distributions by +choosing `probs` to be an array of one higher dimension than the array +generated. + +Here the word "probabilities" is an abuse of terminology as there is +no requirement that probabilities actually sum to one, only that they +be non-negative. So `UnivariateFinite` objects actually implement +arbitrary non-negative measures over finite sets of labelled points. A +`UnivariateDistribution` will be a bona fide probability measure when +constructed using the `augment=true` option (see below) or when +`fit` to data. + +Unless `pool` is specified, `support` should have type +`AbstractVector{<:CategoricalValue}` and all elements are assumed to +share the same categorical pool, which may be larger than `support`. + +*Important.* All levels of the common pool have associated +probabilities, not just those in the specified `support`. However, +these probabilities are always zero (see example below). + +If `probs` is a matrix, it should have a column for each class in +`support` (or one less, if `augment=true`). More generally, `probs` +will be an array whose size is of the form `(n1, n2, ..., nk, c)`, +where `c = length(support)` (or one less, if `augment=true`) and the +constructor then returns an array of `UnivariateFinite` distributions +of size `(n1, n2, ..., nk)`. + +## Examples + +```julia-repl +julia> v = categorical(["x", "x", "y", "x", "z"]) +5-element CategoricalArrays.CategoricalArray{String,1,UInt32}: + "x" + "x" + "y" + "x" + "z" + +julia> UnivariateFinite(classes(v), [0.2, 0.3, 0.5]) +UnivariateFinite{Multiclass{3}}(x=>0.2, y=>0.3, z=>0.5) + +julia> d = UnivariateFinite([v[1], v[end]], [0.1, 0.9]) +UnivariateFinite{Multiclass{3}}(x=>0.1, z=>0.9) + +julia> rand(d, 3) +3-element CategoricalArrays.CategoricalArray{String,1,UInt32}: + "x" + "z" + "x" + +julia> levels(d) +3-element Vector{String}: + "x" + "y" + "z" + +julia> pdf(d, "y") +0.0 + +``` + +### Specifying a pool - Construct a discrete univariate distribution whose finite support is - the elements of the vector `support`, and whose corresponding - probabilities are elements of the vector `probs`. Alternatively, - construct an abstract *array* of `UnivariateFinite` distributions by - choosing `probs` to be an array of one higher dimension than the array - generated. - - Here the word "probabilities" is an abuse of terminology as there is - no requirement that probabilities actually sum to one, only that they - be non-negative. So `UnivariateFinite` objects actually implement - arbitrary non-negative measures over finite sets of labelled points. A - `UnivariateDistribution` will be a bona fide probability measure when - constructed using the `augment=true` option (see below) or when - `fit` to data. - - Unless `pool` is specified, `support` should have type - `AbstractVector{<:CategoricalValue}` and all elements are assumed to - share the same categorical pool, which may be larger than `support`. - - *Important.* All levels of the common pool have associated - probabilities, not just those in the specified `support`. However, - these probabilities are always zero (see example below). - - If `probs` is a matrix, it should have a column for each class in - `support` (or one less, if `augment=true`). 
More generally, `probs` - will be an array whose size is of the form `(n1, n2, ..., nk, c)`, - where `c = length(support)` (or one less, if `augment=true`) and the - constructor then returns an array of `UnivariateFinite` distributions - of size `(n1, n2, ..., nk)`. - - ## Examples - - ```julia - julia> v = categorical(["x", "x", "y", "x", "z"]) - 5-element CategoricalArrays.CategoricalArray{String,1,UInt32}: - "x" - "x" - "y" - "x" - "z" - - julia> UnivariateFinite(classes(v), [0.2, 0.3, 0.5]) - UnivariateFinite{Multiclass{3}}(x=>0.2, y=>0.3, z=>0.5) - - julia> d = UnivariateFinite([v[1], v[end]], [0.1, 0.9]) - UnivariateFinite{Multiclass{3}}(x=>0.1, z=>0.9) - - julia> rand(d, 3) - 3-element CategoricalArrays.CategoricalArray{String,1,UInt32}: - "x" - "z" - "x" - - julia> levels(d) - 3-element Vector{String}: - "x" - "y" - "z" - - julia> pdf(d, "y") - 0.0 - - ``` - - ### Specifying a pool - - Alternatively, `support` may be a list of raw (non-categorical) - elements if `pool` is: - - - some `CategoricalArray`, `CategoricalValue` or `CategoricalPool`, - such that `support` is a subset of `levels(pool)` - - - `missing`, in which case a new categorical pool is created which has - `support` as its only levels. - - In the last case, specify `ordered=true` if the pool is to be - considered ordered. - - ```julia - julia> UnivariateFinite(["x", "z"], [0.1, 0.9], pool=missing, ordered=true) - UnivariateFinite{OrderedFactor{2}}(x=>0.1, z=>0.9) - - julia> d = UnivariateFinite(["x", "z"], [0.1, 0.9], pool=v) # v defined above - UnivariateFinite{Multiclass{3}}(x=>0.1, z=>0.9) - - julia> pdf(d, "y") # allowed as `"y" in levels(v)` - 0.0 - - julia> v = categorical(["x", "x", "y", "x", "z", "w"]) - 6-element CategoricalArrays.CategoricalArray{String,1,UInt32}: - "x" - "x" - "y" - "x" - "z" - "w" - - julia> probs = rand(100, 3); probs = probs ./ sum(probs, dims=2); - - julia> UnivariateFinite(["x", "y", "z"], probs, pool=v) - 100-element UnivariateFiniteVector{Multiclass{4}, String, UInt32, Float64}: - UnivariateFinite{Multiclass{4}}(x=>0.194, y=>0.3, z=>0.505) - UnivariateFinite{Multiclass{4}}(x=>0.727, y=>0.234, z=>0.0391) - UnivariateFinite{Multiclass{4}}(x=>0.674, y=>0.00535, z=>0.321) - ⋮ - UnivariateFinite{Multiclass{4}}(x=>0.292, y=>0.339, z=>0.369) - ``` - - ### Probability augmentation - - If `augment=true` the provided array is augmented by inserting - appropriate elements *ahead* of those provided, along the last - dimension of the array. This means the user only provides probabilities - for the classes `c2, c3, ..., cn`. The class `c1` probabilities are - chosen so that each `UnivariateFinite` distribution in the returned - array is a bona fide probability distribution. +Alternatively, `support` may be a list of raw (non-categorical) +elements if `pool` is: - --- +- some `CategoricalArray`, `CategoricalValue` or `CategoricalPool`, + such that `support` is a subset of `levels(pool)` - UnivariateFinite(prob_given_class; pool=nothing, ordered=false) +- `missing`, in which case a new categorical pool is created which has + `support` as its only levels. - Construct a discrete univariate distribution whose finite support is - the set of keys of the provided dictionary, `prob_given_class`, and - whose values specify the corresponding probabilities. +In the last case, specify `ordered=true` if the pool is to be +considered ordered. 
- The type requirements on the keys of the dictionary are the same as - the elements of `support` given above with this exception: if - non-categorical elements (raw labels) are used as keys, then - `pool=...` must be specified and cannot be `missing`. +```julia-repl +julia> UnivariateFinite(["x", "z"], [0.1, 0.9], pool=missing, ordered=true) +UnivariateFinite{OrderedFactor{2}}(x=>0.1, z=>0.9) - If the values (probabilities) are arrays instead of scalars, then an - abstract array of `UnivariateFinite` elements is created, with the - same size as the array. +julia> d = UnivariateFinite(["x", "z"], [0.1, 0.9], pool=v) # v defined above +UnivariateFinite{Multiclass{3}}(x=>0.1, z=>0.9) - """ +julia> pdf(d, "y") # allowed as `"y" in levels(v)` +0.0 + +julia> v = categorical(["x", "x", "y", "x", "z", "w"]) +6-element CategoricalArrays.CategoricalArray{String,1,UInt32}: + "x" + "x" + "y" + "x" + "z" + "w" + +julia> probs = rand(100, 3); probs = probs ./ sum(probs, dims=2); + +julia> UnivariateFinite(["x", "y", "z"], probs, pool=v) +100-element UnivariateFiniteVector{Multiclass{4}, String, UInt32, Float64}: + UnivariateFinite{Multiclass{4}}(x=>0.194, y=>0.3, z=>0.505) + UnivariateFinite{Multiclass{4}}(x=>0.727, y=>0.234, z=>0.0391) + UnivariateFinite{Multiclass{4}}(x=>0.674, y=>0.00535, z=>0.321) + ⋮ + UnivariateFinite{Multiclass{4}}(x=>0.292, y=>0.339, z=>0.369) +``` + +### Probability augmentation + +If `augment=true` the provided array is augmented by inserting +appropriate elements *ahead* of those provided, along the last +dimension of the array. This means the user only provides probabilities +for the classes `c2, c3, ..., cn`. The class `c1` probabilities are +chosen so that each `UnivariateFinite` distribution in the returned +array is a bona fide probability distribution. + +--- + + UnivariateFinite(prob_given_class; pool=nothing, ordered=false) + +Construct a discrete univariate distribution whose finite support is +the set of keys of the provided dictionary, `prob_given_class`, and +whose values specify the corresponding probabilities. + +The type requirements on the keys of the dictionary are the same as +the elements of `support` given above with this exception: if +non-categorical elements (raw labels) are used as keys, then +`pool=...` must be specified and cannot be `missing`. + +If the values (probabilities) are arrays instead of scalars, then an +abstract array of `UnivariateFinite` elements is created, with the +same size as the array. + +""" +function UnivariateFinite end # method-less impl for docstring -@doc UNIVARIATE_FINITE_DOCSTRING function UnivariateFinite(d::AbstractDict; kwargs...) return UnivariateFinite(get_interface_mode(), d; kwargs...) end diff --git a/src/equality.jl b/src/equality.jl index c183c4e..21d556a 100644 --- a/src/equality.jl +++ b/src/equality.jl @@ -89,7 +89,7 @@ following conditions all hold, and `false` otherwise: The meaining of "equal" depends on the type of the property value: - values that are themselves of `MLJType` are "equal" if they are -equal in the sense of `is_same_except` with no exceptions. + equal in the sense of `is_same_except` with no exceptions. - values that are not of `MLJType` are "equal" if they are `==`. 
diff --git a/src/metadata_utils.jl b/src/metadata_utils.jl index 4253a27..781adce 100644 --- a/src/metadata_utils.jl +++ b/src/metadata_utils.jl @@ -182,7 +182,7 @@ particular, `package_name` and `package_url`) to be defined; see also Suppose a model type and traits have been defined by: -``` +```julia mutable struct FooRegressor a::Int b::Float64 diff --git a/src/model_api.jl b/src/model_api.jl index b1390a6..5f62f91 100644 --- a/src/model_api.jl +++ b/src/model_api.jl @@ -60,10 +60,10 @@ repeating such transformations unnecessarily, and can additionally make use of more efficient row subsampling, which is then based on the model-specific representation of data, rather than the user-representation. When `reformat` is overloaded, -`selectrows(::Model, ...)` must be as well (see -[`selectrows`](@ref)). Furthermore, the model `fit` method(s), and -operations, such as `predict` and `transform`, must be refactored to -act on the model-specific representations of the data. +`selectrows(::Model, ...)` must be as well (see [`selectrows`](@ref)). +Furthermore, the model `fit` method(s), and operations, such as +`predict` and `transform`, must be refactored to act on +the model-specific representations of the data. To implement the `reformat` data front-end for a model, refer to "Implementing a data front-end" in the [MLJ @@ -208,15 +208,15 @@ supports reporting. Overloading this method is optional, unless the model generates reports that are neither named tuples nor `nothing`. -Assuming each value in the `report_given_method` dictionary is either a named tuple -or `nothing`, and there are no conflicts between the keys of the dictionary values -(the individual reports), the fallback returns the usual named tuple merge of the -dictionary values, ignoring any `nothing` value. If there is a key conflict, all operation -reports are first wrapped in a named -tuple of length one, as in `(predict=predict_report,)`. A `:fit` report is never wrapped. +Assuming each value in the `report_given_method` dictionary is either a named tuple +or `nothing`, and there are no conflicts between the keys of the dictionary values +(the individual reports), the fallback returns the usual named tuple merge of +the dictionary values, ignoring any `nothing` value. If there is a key conflict, +all operation reports are first wrapped in a named tuple of length one, +as in `(predict=predict_report,)`. A `:fit` report is never wrapped. -If any dictionary `value` is neither a named tuple nor `nothing`, it is first wrapped as -`(report=value, )` before merging. +If any dictionary `value` is neither a named tuple nor `nothing`, it is first +wrapped as `(report=value, )` before merging. """ function report(model, report_given_method) diff --git a/src/model_def.jl b/src/model_def.jl index e90d70a..337d241 100644 --- a/src/model_def.jl +++ b/src/model_def.jl @@ -22,7 +22,7 @@ When no default field value is given a heuristic is to guess an appropriate default (eg, zero for a `Float64` parameter). To this end, the specified type expression is evaluated in the module `modl`. - """ +""" function _process_model_def(modl, ex) defaults = Dict{Symbol, Any}() constraints = Dict{Symbol, Any}() @@ -47,7 +47,7 @@ function _process_model_def(modl, ex) # # where line.args[1] will either be just `name` or `name::Type` # and line.args[2] will either be just `value` or `value::constraint` - # ----------------------------------------------------------------------------- + # --------------------------------------------------------------------- # 1. 
decompose `line.args[1]` appropriately (name and type) if line.args[1] isa Symbol # case :a param = line.args[1] @@ -56,7 +56,7 @@ function _process_model_def(modl, ex) param, type = line.args[1].args[1:2] # (:a, Int) end push!(params, param) - + # ------------------------------------------------------------------ # 2. decompose `line.args[2]` appropriately (values and constraints) if line.head == :(=) # assignment for default @@ -70,10 +70,10 @@ function _process_model_def(modl, ex) defaults[param] = default # name or name::Type (for the constructor) - ex.args[3].args[i] = line.args[1] + ex.args[3].args[i] = line.args[1] else # these are simple heuristics when no default value is given for the - # field but an "obvious" one can be provided implicitly + # field but an "obvious" one can be provided implicitly # (ideally this should not be used as it's not very clear # that the intention matches the usage) eff_type = modl.eval(type) @@ -142,10 +142,10 @@ function _model_constructor(modelname, params, defaults) :block, Expr(:(=), :model, Expr(:call, :new, params...)), :(message = $MLJModelInterface.clean!(model)), - :(isempty(message) || @warn message), - :(return model) - ) - ) + :(isempty(message) || @warn message), + :(return model) + ) + ) end @@ -201,11 +201,11 @@ Macro to help define MLJ models with constraints on the default parameters. """ macro mlj_model(ex) ex, modelname, params, defaults, constraints = _process_model_def(__module__, ex) - # keyword constructor + # keyword constructor const_ex = _model_constructor(modelname, params, defaults) - # associate the constructor with the definition of the struct + # associate the constructor with the definition of the struct push!(ex.args[3].args, const_ex) - # cleaner + # cleaner clean_ex = _model_cleaner(modelname, defaults, constraints) esc( quote diff --git a/src/parameter_inspection.jl b/src/parameter_inspection.jl index bc5d2ed..796b416 100644 --- a/src/parameter_inspection.jl +++ b/src/parameter_inspection.jl @@ -12,14 +12,14 @@ values, which themselves might be transparent. Most objects of type `MLJType` are transparent. -```julia +```julia-repl julia> params(EnsembleModel(model=ConstantClassifier())) (model = (target_type = Bool,), -weights = Float64[], -bagging_fraction = 0.8, -rng_seed = 0, -n = 100, -parallel = true,) + weights = Float64[], + bagging_fraction = 0.8, + rng_seed = 0, + n = 100, + parallel = true,) ``` """ params(m) = params(m, Val(istransparent(m))) @@ -41,10 +41,10 @@ names. Properties of nested model instances are recursively exposed,.as shown in example below. For most `Model` objects, properties are synonymous with fields, but this is not a hard requirement. -```julia -using MLJModels -using EnsembleModels -tree = (@load DecisionTreeClassifier pkg=DecisionTree) +```julia-repl +julia> using MLJModels +julia> using EnsembleModels +julia> tree = (@load DecisionTreeClassifier pkg=DecisionTree)(); julia> flat_params(EnsembleModel(model=tree)) (model__max_depth = -1,