diff --git a/docs/src/MLJ_interface.md b/docs/src/MLJ_interface.md
index 76d29794..911e10fe 100644
--- a/docs/src/MLJ_interface.md
+++ b/docs/src/MLJ_interface.md
@@ -1,7 +1,7 @@
 # [The MLJ interface to BetaML Models](@id bmlj_module)
 
 ```@docs
-Utils
+Bmlj
 ```
 
 ## Models available through MLJ
diff --git a/docs/src/StyleGuide_templates.md b/docs/src/StyleGuide_templates.md
index 1667b5d6..8a1ad900 100644
--- a/docs/src/StyleGuide_templates.md
+++ b/docs/src/StyleGuide_templates.md
@@ -115,10 +115,10 @@ Detailed description on the module objectives, content and organisation
 
 ## Internal links
 
-To refer to a documented object: `[\`NAME\`](@ref)` or `[\`NAME\`](@ref manual_id)`.
-In particular for internal links use `[\`?NAME\`](@ref ?NAME)`
+To refer to a documented object: ```[`NAME`](@ref)``` or ```[`NAME`](@ref manual_id)```.
+In particular for internal links use ```[`?NAME`](@ref ?NAME)```.
 
-To create a id manually: `[Title](@id manual_id)`.
+To create an id manually: ```[Title](@id manual_id)```.
 
 ## Data organisation
diff --git a/docs/src/index.md b/docs/src/index.md
index cc2bbc14..c643b100 100644
--- a/docs/src/index.md
+++ b/docs/src/index.md
@@ -31,13 +31,13 @@ res = KernelPerceptronClassifier() # KernelPerceptronClassifier is defined i
 ```
 
 Each module is documented on the links below (you can also use the inline Julia help system: just press the question mark `?` and then, on the special help prompt `help?>`, type the function name):
 
-- [**`BetaML.Perceptron`**](Perceptron.html): The Perceptron, Kernel Perceptron and Pegasos classification algorithms;
-- [**`BetaML.Trees`**](Trees.html): The Decision Trees and Random Forests algorithms for classification or regression (with missing values supported);
-- [**`BetaML.Nn`**](Nn.html): Implementation of Artificial Neural Networks;
-- [**`BetaML.Clustering`**](Clustering.html): (hard) Clustering algorithms (K-Means, K-Mdedoids)
-- [**`BetaML.GMM`**](GMM.html): Various algorithms (Clustering, regressor, missing imputation / collaborative filtering / recommandation systems) that use a Generative (Gaussian) mixture models (probabilistic) fitter, fitted using a EM algorithm;
-- [**`BetaML.Imputation`**](Imputation.html): Imputation algorithms;
-- [**`BetaML.Utils`**](Utils.html): Various utility functions (scale, one-hot, distances, kernels, pca, accuracy/error measures..).
+- [**`BetaML.Perceptron`**](@ref BetaML.Perceptron): The Perceptron, Kernel Perceptron and Pegasos classification algorithms;
+- [**`BetaML.Trees`**](@ref BetaML.Trees): The Decision Trees and Random Forests algorithms for classification or regression (with missing values supported);
+- [**`BetaML.Nn`**](@ref BetaML.Nn): Implementation of Artificial Neural Networks;
+- [**`BetaML.Clustering`**](@ref BetaML.Clustering): (hard) clustering algorithms (K-means, K-medoids);
+- [**`BetaML.GMM`**](@ref BetaML.GMM): Various algorithms (clustering, regression, missing imputation / collaborative filtering / recommendation systems) that use a Generative (Gaussian) Mixture Model (probabilistic) fitter, fitted using an EM algorithm;
+- [**`BetaML.Imputation`**](@ref BetaML.Imputation): Imputation algorithms;
+- [**`BetaML.Utils`**](@ref BetaML.Utils): Various utility functions (scale, one-hot, distances, kernels, PCA, accuracy/error measures..).
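+
+As a minimal sketch of a typical workflow across these modules (assuming a feature matrix `x` and a numeric label vector `y` are already loaded):
+
+```julia
+using BetaML
+m = RandomForestEstimator()   # a model from the Trees sub-module
+ŷ = fit!(m,x,y)               # fit the model and return the in-sample predictions
+relative_mean_error(y,ŷ)      # an error measure from the Utils sub-module
+```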
 
 ## [Available models](@id models_list)
diff --git a/docs/src/tutorials/Betaml_tutorial_getting_started.md b/docs/src/tutorials/Betaml_tutorial_getting_started.md
index 5a4505fa..34ab7f7e 100644
--- a/docs/src/tutorials/Betaml_tutorial_getting_started.md
+++ b/docs/src/tutorials/Betaml_tutorial_getting_started.md
@@ -233,6 +233,7 @@ Using either the direct call or the `eval` function, wheter in `Pyjulia` or `Jul
 > acc <- julia_call("accuracy",yhat,ys,ignorelabels=TRUE)
 > acc
 [1] 0.8933333
+```
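+
+The `julia_call` above maps one-to-one onto the underlying Julia function; as a sketch (with `yhat` and `ys` as computed earlier in this section), the equivalent call on the Julia side is:
+
+```julia
+acc = accuracy(yhat, ys, ignorelabels=true)
+```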
 
 ```@raw html
 <details><summary>Details</summary>
@@ -307,7 +308,7 @@ While other "convenience" functions are provided by the package, using `julia_c
 ```
 
-## [Dealing with stochasticity and reproducibility](@id dealing_with_stochasticity)
+## [Dealing with stochasticity and reproducibility](@id stochasticity_reproducibility)
 
 Machine Learning workflows include stochastic components in several steps: in the data sampling, in the model initialisation and often in the models's own algorithms (and sometimes also in the prediction step).
 All BetaML models with a stochastic components support a `rng` parameter, standing for _Random Number Generator_. A RNG is a "machine" that streams a flow of random numbers. The flow itself however is deterministically determined for each "seed" (an integer number) that the RNG has been told to use.
diff --git a/docs/src/tutorials/Classification - cars/betaml_tutorial_classification_cars.jl b/docs/src/tutorials/Classification - cars/betaml_tutorial_classification_cars.jl
index a49fcd2d..6f08e50c 100644
--- a/docs/src/tutorials/Classification - cars/betaml_tutorial_classification_cars.jl
+++ b/docs/src/tutorials/Classification - cars/betaml_tutorial_classification_cars.jl
@@ -1,7 +1,7 @@
 # # [A classification task when labels are known - determining the country of origin of cars given the cars characteristics](@id classification_tutorial)
 
 # In this exercise we are provided with several technical characteristics (mpg, horsepower,weight, model year...) for several car's models, together with the country of origin of such models, and we would like to create a machine learning model such that the country of origin can be accurately predicted given the technical characteristics.
-# As the information to predict is a multi-class one, this is a _[classification]_(https://en.wikipedia.org/wiki/Statistical_classification) task.
+# As the information to predict is a multi-class one, this is a [classification](https://en.wikipedia.org/wiki/Statistical_classification) task.
 
 # It is a challenging exercise due to the simultaneous presence of three factors: (1) presence of missing data; (2) unbalanced data - 254 out of 406 cars are US made; (3) small dataset.
 #
@@ -41,7 +41,7 @@ using Test #src
 println(now(), " - getting the data..." ) #src
 
 # Machine Learning workflows include stochastic components in several steps: in the data sampling, in the model initialisation and often in the models's own algorithms (and sometimes also in the prediciton step).
-# BetaML provides a random nuber generator (RNG) in order to simplify reproducibility ( [`FIXEDRNG`](@ref BetaML.Utils.FIXEDRNG). This is nothing else than an istance of `StableRNG(123)` defined in the [`BetaML.Utils`](@ref utils_module) sub-module, but you can choose of course your own "fixed" RNG). See the [Dealing with stochasticity](@ref dealing_with_stochasticity) section in the [Getting started](@ref getting_started) tutorial for details.
+# BetaML provides a random number generator (RNG) in order to simplify reproducibility ([`FIXEDRNG`](@ref BetaML.Utils.FIXEDRNG). This is nothing else than an instance of `StableRNG(123)` defined in the [`BetaML.Utils`](@ref utils_module) sub-module, but you can of course choose your own "fixed" RNG). See the [Dealing with stochasticity](@ref stochasticity_reproducibility) section in the [Getting started](@ref getting_started) tutorial for details.
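+
+# As a minimal sketch of the idea (here `x` and `y` stand for any feature matrix and label vector), two models created with the same fixed RNG and fitted on the same data return identical results:
+#
+# ```julia
+# using BetaML, StableRNGs
+# m1 = RandomForestEstimator(rng=StableRNG(123))
+# m2 = RandomForestEstimator(rng=StableRNG(123))
+# fit!(m1,x,y) == fit!(m2,x,y) # true
+# ```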
 
 # Here we are explicit and we use our own fixed RNG:
 seed = 123 # The table at the end of this tutorial has been obtained with seeds 123, 1000 and 10000
diff --git a/docs/src/tutorials/Clustering - Iris/betaml_tutorial_cluster_iris.jl b/docs/src/tutorials/Clustering - Iris/betaml_tutorial_cluster_iris.jl
index d7165dda..4c92c44d 100644
--- a/docs/src/tutorials/Clustering - Iris/betaml_tutorial_cluster_iris.jl
+++ b/docs/src/tutorials/Clustering - Iris/betaml_tutorial_cluster_iris.jl
@@ -50,7 +50,7 @@ y = fit!(OrdinalEncoder(categories=yLabels),iris[:,5])
 
 # The dataset from RDatasets is ordered by species, so we need to shuffle it to avoid biases.
 # Shuffling happens by default in cross_validation, but we are keeping here a copy of the shuffled version for later.
-# Note that the version of [`shuffle`](@ref) that is included in BetaML accepts several n-dimensional arrays and shuffle them (by default on rows, by we can specify the dimension) keeping the association between the various arrays in the shuffled output.
+# Note that the version of [`consistent_shuffle`](@ref) that is included in BetaML accepts several n-dimensional arrays and shuffles them (by default on rows, but we can specify the dimension), keeping the association between the various arrays in the shuffled output.
 (xs,ys) = consistent_shuffle([x,y], rng=copy(AFIXEDRNG));
diff --git a/docs/src/tutorials/Regression - bike sharing/betaml_tutorial_regression_sharingBikes.jl b/docs/src/tutorials/Regression - bike sharing/betaml_tutorial_regression_sharingBikes.jl
index c3142e03..5e895bcf 100644
--- a/docs/src/tutorials/Regression - bike sharing/betaml_tutorial_regression_sharingBikes.jl
+++ b/docs/src/tutorials/Regression - bike sharing/betaml_tutorial_regression_sharingBikes.jl
@@ -67,7 +67,7 @@ println(now(), " ", "- decision trees..." ) #src
 
 m = DecisionTreeEstimator(autotune=true, rng=copy(AFIXEDRNG))
 
-# Passing a fixed Random Number Generator (RNG) to the `rng` parameter guarantees that everytime we use the model with the same data (from the model creation downward to value prediciton) we obtain the same results. In particular BetaML provide `FIXEDRNG`, an istance of `StableRNG` that guarantees reproducibility even across different Julia versions. See the section ["Dealing with stochasticity"](@ref dealing_with_stochasticity) for details.
+# Passing a fixed Random Number Generator (RNG) to the `rng` parameter guarantees that every time we use the model with the same data (from the model creation downward to value prediction) we obtain the same results. In particular BetaML provides `FIXEDRNG`, an instance of `StableRNG` that guarantees reproducibility even across different Julia versions. See the section ["Dealing with stochasticity"](@ref stochasticity_reproducibility) for details.
 
 # Note the `autotune` parameter. BetaML has perhaps what is the easiest method for automatically tuning the model hyperparameters (thus becoming in this way _learned_ parameters). Indeed, in most cases it is enought to pass the attribute `autotune=true` on the model constructor and hyperparameters search will be automatically performed on the first `fit!` call.
 # If needed we can customise hyperparameter tuning, chosing the tuning method on the parameter `tunemethod`.
 # The single-line above is equivalent to:
 tuning_method = SuccessiveHalvingSearch(
@@ -80,8 +80,8 @@ m_dt = DecisionTreeEstimator(autotune=true, rng=copy(AFIXEDRNG), tunemethod=tuni
 
 # Note that the defaults change according to the specific model, for example `RandomForestEstimator`](@ref) autotuning default to not being multithreaded, as the individual model is already multithreaded.
 
-# !!! Tip
-#     Refer to [versions of this tutorial for BetaML <= 0.6](/BetaML.jl/v0.7/tutorials/Regression - bike sharing/betaml_tutorial_regression_sharingBikes.html) for a good exercise on how to perform model selection using the [`cross_validation`](@ref) function, or even by custom grid search.
+# !!! tip
+#     Refer to the versions of this tutorial for BetaML <= 0.6 for a good exercise on how to perform model selection using the [`cross_validation`](@ref) function, or even by custom grid search.
 
 # We can now fit the model, that is learn the model parameters that lead to the best predictions from the data. By default (unless we use `cache=false` in the model constructor) the model stores also the training predictions, so we can just use `fit!()` instead of `fit!()` followed by `predict(model,xtrain)`
 ŷtrain = fit!(m_dt,xtrain,ytrain)
@@ -307,9 +307,9 @@ y = data[:,16];
 
 ((xtrain,xtest),(ytrain,ytest)) = partition([x,y],[0.75,1-0.75],shuffle=false)
 (ntrain, ntest) = size.([ytrain,ytest],1)
 
-# An other common operation with neural networks is to scale the feature vectors (X) and the labels (Y). The BetaML [`scale`](@ref) function, by default, scales the data such that each dimension has mean 0 and variance 1.
+# Another common operation with neural networks is to scale the feature vectors (X) and the labels (Y). The BetaML [`Scaler`](@ref) model, by default, scales the data such that each dimension has mean 0 and variance 1.
 
-# Note that we can provide the function with different scale factors or specify the columns that shoudn't be scaled (e.g. those resulting from the one-hot encoding). Finally we can reverse the scaling (this is useful to retrieve the unscaled features from a model trained with scaled ones).
+# Note that we can provide the `Scaler` model with different scale factors or specify the columns that shouldn't be scaled (e.g. those resulting from the one-hot encoding). Finally we can reverse the scaling (this is useful to retrieve the unscaled features from a model trained with scaled ones).
 
 cols_nottoscale = [2;4;5;10:23]
 xsm = Scaler(skip=cols_nottoscale)
diff --git a/src/Bmlj/Bmlj.jl b/src/Bmlj/Bmlj.jl
index 0a4ee011..42fc6d63 100644
--- a/src/Bmlj/Bmlj.jl
+++ b/src/Bmlj/Bmlj.jl
@@ -27,8 +27,8 @@ import ..BetaML
 import ..Utils # can't using it as it exports some same-name models
 import ..Perceptron
 import ..Nn: AbstractLayer, ADAM, SGD, NeuralNetworkEstimator, OptimisationAlgorithm, DenseLayer, NN
-import ..Utils: AbstractRNG, squared_cost, SuccessiveHalvingSearch
-
+import ..Utils: AbstractRNG, squared_cost, SuccessiveHalvingSearch, radial_kernel
+import ..GMM
 
 export mljverbosity_to_betaml_verbosity
diff --git a/src/Clustering/Clustering.jl b/src/Clustering/Clustering.jl
index f138cd37..ef2e6f40 100644
--- a/src/Clustering/Clustering.jl
+++ b/src/Clustering/Clustering.jl
@@ -7,12 +7,12 @@ Part of [BetaML](https://github.com/sylvaticus/BetaML.jl). Licence is MIT.
 
 (Hard) Clustering algorithms
 
-Provide hard clustering methods using K-means and k-medoids. Please see also the [`GMM`](@ref) module for GMM-based soft clustering (i.e. where a probability distribution to be part of the various classes is assigned to each record instead of a single class), missing values imputation / collaborative filtering / reccomendation systems using clustering methods as backend.
+Provides hard clustering methods using K-means and K-medoids. Please see also the `GMM` module for GMM-based soft clustering (i.e. where a probability distribution over the various classes is assigned to each record instead of a single class), missing values imputation / collaborative filtering / recommendation systems using clustering methods as a backend.
 
 The module provides the following models. Use `?[model]` to access their documentation:
 
-- [`KMeansClusterer`](@ref): Classical KMean algorithm
-- [`KMedoidsClusterer`](@ref kmeans): Kmedoids algorithm with configurable distance metric
+- [`KMeansClusterer`](@ref): Classical K-means algorithm
+- [`KMedoidsClusterer`](@ref): K-medoids algorithm with configurable distance metric
 
 Some metrics of the clustered output are available (e.g. [`silhouette`](@ref)).
 """
@@ -28,7 +28,6 @@ using ForceImport
 import Base.print
 import Base.show
 
-# export kmeans, kmedoids
 export KMeansC_hp, KMedoidsC_hp, KMeansClusterer, KMedoidsClusterer
 
 include("Clustering_hard.jl") # K-means and k-medoids
diff --git a/src/GMM/GMM.jl b/src/GMM/GMM.jl
index 1ba1f182..3edda56e 100644
--- a/src/GMM/GMM.jl
+++ b/src/GMM/GMM.jl
@@ -7,7 +7,7 @@ Generative (Gaussian) Mixed Model learners (supervised/unsupervised)
 
 Provides clustering and regressors using (Generative) Gaussiam Mixture Model (probabilistic).
 
-Collaborative filtering / missing values imputation / reccomendation systems based on GMM is available in the [`Imputation`](@ref BetaML.Imputation) module.
+Collaborative filtering / missing values imputation / recommendation systems based on GMM are available in the `Imputation` module.
 
 The module provides the following models. Use `?[model]` to access their documentation:
diff --git a/src/GMM/GMM_clustering.jl b/src/GMM/GMM_clustering.jl
index 941a9809..f5603e4b 100644
--- a/src/GMM/GMM_clustering.jl
+++ b/src/GMM/GMM_clustering.jl
@@ -203,7 +203,7 @@ mutable struct GaussianMixture_hp <: BetaMLHyperParametersSet
     maximum_iterations::Int64
     """
     The method - and its parameters - to employ for hyperparameters autotuning.
-    See [`SuccessiveHalvingSearch](@ref) for the default method (suitable for the GMM-based regressors)
+    See [`SuccessiveHalvingSearch`](@ref) for the default method (suitable for the GMM-based regressors).
     To implement automatic hyperparameter tuning during the (first) `fit!` call simply set `autotune=true` and eventually change the default `tunemethod` options (including the parameter ranges, the resources to employ and the loss function to adopt).
     """
     tunemethod::AutoTuneMethod
diff --git a/src/Utils/Utils.jl b/src/Utils/Utils.jl
index 388bb162..37151cf8 100644
--- a/src/Utils/Utils.jl
+++ b/src/Utils/Utils.jl
@@ -1,6 +1,5 @@
 "Part of [BetaML](https://github.com/sylvaticus/BetaML.jl). Licence is MIT."
 
-
 """
   Utils module
 
@@ -30,7 +29,7 @@ For the complete list of functions provided see below. The main ones are:
 
 ## Measures
 - Several functions of a pair of parameters (often `y` and `ŷ`) to measure the goodness of `ŷ`, the distance between the two elements of the pair, ...
-- Includes "classical" distance functions ([`l1_distance`](@ref), [`l2_distance`](@ref), [`l2squared_distance`](@ref) [`cosine_distance`](@ref)), "cost" functions for continuous variables ([`squared_cost`](@ref), [`mean_relative_error`](@ref)) and comparision functions for multi-class variables ([`crossentropy`](@ref), [`accuracy`](@ref), [`ConfusionMatrix`](@ref), [`silhouette`](@ref))
+- Includes "classical" distance functions ([`l1_distance`](@ref), [`l2_distance`](@ref), [`l2squared_distance`](@ref), [`cosine_distance`](@ref)), "cost" functions for continuous variables ([`squared_cost`](@ref), [`relative_mean_error`](@ref)) and comparison functions for multi-class variables ([`crossentropy`](@ref), [`accuracy`](@ref), [`ConfusionMatrix`](@ref), [`silhouette`](@ref))
 - Distances can be used to compute a pairwise distance matrix using the function [`pairwise`](@ref)
 
 """
diff --git a/src/Utils/Utils_extra.jl b/src/Utils/Utils_extra.jl
index 85917742..5f932e52 100644
--- a/src/Utils/Utils_extra.jl
+++ b/src/Utils/Utils_extra.jl
@@ -6,7 +6,7 @@ export AutoEncoder, AutoE_hp
 
 @force using ..Nn
 import ..Nn: AbstractLayer, ADAM, SGD, NeuralNetworkEstimator, OptimisationAlgorithm, DenseLayer, NN
-import Imputation
+import ..Imputation
 
 """