From ca90754100772ba11089c852602ae1de4d018302 Mon Sep 17 00:00:00 2001
From: Antonello Lobianco
Date: Thu, 25 Jan 2024 13:57:22 +0100
Subject: [PATCH] Added explicitly that ConvLayer and PoolingLayer are experimental

---
 src/Nn/Nn.jl                           | 23 ++++++++++++++---------
 src/Nn/default_layers/ConvLayer.jl     |  8 +++++++-
 src/Nn/default_layers/PoolingLayer.jl  |  6 +++++-
 src/Nn/default_layers/ReshaperLayer.jl |  2 ++
 4 files changed, 28 insertions(+), 11 deletions(-)

diff --git a/src/Nn/Nn.jl b/src/Nn/Nn.jl
index d6f46e76..541e0aae 100644
--- a/src/Nn/Nn.jl
+++ b/src/Nn/Nn.jl
@@ -12,16 +12,21 @@ The module provide the following types or functions. Use `?[type or function]` t
 
 # Model definition:
 
-- `DenseLayer`: Classical feed-forward layer with user-defined activation function
-- `DenseNoBiasLayer`: Classical layer without the bias parameter
-- `VectorFunctionLayer`: Layer whose activation function run over the ensable of its nodes rather than on each one individually. No learnable weigths on input, optional learnable weigths as parameters of the activation function.
-- `ScalarFunctionLayer`: Layer whose activation function run over each node individually, like a classic `DenseLqyer`, but with no learnable weigths on input and optional learnable weigths as parameters of the activation function.
-- `ReplicatorLayer`: Alias for a `ScalarFunctionLayer` with no learnable parameters and identity as activation function
-- `GroupedLayer`: To stack several layers into a single layer, e.g. for multi-branches networks
-- `NeuralNetworkEstimator`: Build the chained network and define a cost function
+- [`DenseLayer`](@ref): Classical feed-forward layer with user-defined activation function
+- [`DenseNoBiasLayer`](@ref): Classical layer without the bias parameter
+- [`VectorFunctionLayer`](@ref): Layer whose activation function runs over the ensemble of its nodes rather than on each one individually. No learnable weights on input, optional learnable weights as parameters of the activation function.
+- [`ScalarFunctionLayer`](@ref): Layer whose activation function runs over each node individually, like a classic `DenseLayer`, but with no learnable weights on input and optional learnable weights as parameters of the activation function.
+- [`ReplicatorLayer`](@ref): Alias for a `ScalarFunctionLayer` with no learnable parameters and identity as activation function
+- [`ReshaperLayer`](@ref): Reshape the output of a layer (or the input data) to the shape needed for the next one
+- [`PoolingLayer`](@ref): In the middle between `VectorFunctionLayer` and `ScalarFunctionLayer`, it applies a function to the set of nodes defined by a sliding kernel. Weightless.
+- [`ConvLayer`](@ref): A generic N+1 (channels) dimensional convolutional layer
+- [`GroupedLayer`](@ref): To stack several layers into a single layer, e.g. for multi-branch networks
+- [`NeuralNetworkEstimator`](@ref): Build the chained network and define a cost function
+
+Each layer can use a default activation function, one of the functions provided in the `Utils` module (`relu`, `tanh`, `softmax`,...) or one provided by you.
+BetaML will try to recognise "known" functions, for which it sets the exact derivatives; otherwise you can provide the derivative to the layer yourself.
+If the derivative of the activation function is not available (either provided manually or recognised automatically), AD will be used and training may be slower, although this difference tends to vanish with larger datasets.
-
-Each layer can use a default activation function, one of the functions provided in the `Utils` module (`relu`, `tanh`, `softmax`,...) or you can specify your own function. The derivative of the activation function can be optionally be provided, in such case training will be quicker, altought this difference tends to vanish with bigger datasets.
 
 You can alternativly implement your own layer defining a new type as subtype of the abstract type `AbstractLayer`. Each user-implemented layer must define the following methods:
 
 - A suitable constructor
diff --git a/src/Nn/default_layers/ConvLayer.jl b/src/Nn/default_layers/ConvLayer.jl
index d066d26b..758ad3d2 100644
--- a/src/Nn/default_layers/ConvLayer.jl
+++ b/src/Nn/default_layers/ConvLayer.jl
@@ -5,7 +5,13 @@
 """
 $(TYPEDEF)
 
-Representation of a convolutional layer in the network
+A generic N+1 (channels) dimensional convolutional layer
+
+**EXPERIMENTAL**: Still too slow for practical applications
+
+This convolutional layer has two constructors, one of the form `ConvLayer(input_size,kernel_size,nchannels_in,nchannels_out)` and an alternative one of the form `ConvLayer(input_size_with_channel,kernel_size,nchannels_out)`.
+If the input is a vector, use a [`ReshaperLayer`](@ref) in front.
+
 
 # Fields:
 $(TYPEDFIELDS)
diff --git a/src/Nn/default_layers/PoolingLayer.jl b/src/Nn/default_layers/PoolingLayer.jl
index c19d3bc8..6dbc65d7 100644
--- a/src/Nn/default_layers/PoolingLayer.jl
+++ b/src/Nn/default_layers/PoolingLayer.jl
@@ -5,7 +5,11 @@
 """
 $(TYPEDEF)
 
-Representation of a pooling layer in the network
+Representation of a pooling layer in the network (weightless)
+
+**EXPERIMENTAL**: Still too slow for practical applications
+
+In the middle between `VectorFunctionLayer` and `ScalarFunctionLayer`, it applies a function to the set of nodes defined by a sliding kernel.
 
 # Fields:
 $(TYPEDFIELDS)
diff --git a/src/Nn/default_layers/ReshaperLayer.jl b/src/Nn/default_layers/ReshaperLayer.jl
index 531ccab8..4a9aed0f 100644
--- a/src/Nn/default_layers/ReshaperLayer.jl
+++ b/src/Nn/default_layers/ReshaperLayer.jl
@@ -7,6 +7,8 @@ $(TYPEDEF)
 
 Representation of a "reshaper" (weigthless) layer in the network
 
+Reshape the output of a layer (or the input data) to the shape needed for the next one.
+
 # Fields:
 $(TYPEDFIELDS)
 """
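
For reviewers, a minimal usage sketch of the API documented in the updated `Nn.jl` docstring above — a toy example, assuming BetaML's exported `NeuralNetworkEstimator`, `DenseLayer`, `squared_cost` and `fit!`:

```julia
using BetaML

# Toy regression data: 100 records, 2 features
x = rand(100, 2)
y = 2 .* x[:, 1] .+ x[:, 2]

# Chain two dense layers. `relu` is a "known" activation function, so its
# exact derivative is set by BetaML and no AD is needed during training.
mod = NeuralNetworkEstimator(
    layers = [DenseLayer(2, 4, f = relu), DenseLayer(4, 1)],
    loss   = squared_cost,
    epochs = 100,
)
ŷ = fit!(mod, x, y)   # train the network and return the in-sample predictions
```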
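Similarly, a sketch of how the experimental `ConvLayer` and `PoolingLayer` could be chained behind a `ReshaperLayer`, using the constructor forms named in the docstrings above; the `f` keyword, the `size(layer)[2]` convention for retrieving a layer's output size, and the default pooling function are assumptions here, not a tested recipe:

```julia
using BetaML

# Build the stack incrementally so that each layer can take its input size
# from the output size of the previous one (assumed to be `size(layer)[2]`).
l1 = ReshaperLayer((64, 1), (8, 8, 1))           # flat 64-vector -> 8×8 image with 1 channel
l2 = ConvLayer(size(l1)[2], (3, 3), 8, f = relu) # ConvLayer(input_size_with_channel, kernel_size, nchannels_out)
l3 = PoolingLayer(size(l2)[2], (2, 2))           # weightless pooling over a sliding 2×2 kernel
l4 = ReshaperLayer(size(l3)[2])                  # back to a flat vector for the dense head
l5 = DenseLayer(prod(size(l4)[2]), 3, f = relu)
m  = NeuralNetworkEstimator(layers = [l1, l2, l3, l4, l5], epochs = 10)
```

Note the `ReshaperLayer` placed in front of the convolution, as the `ConvLayer` docstring recommends when the input is a vector.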