Implemented features
- 2/3D batched convolution & transposed convolution and their backprop calculations
- batch normalization and its backprop calculation
- 2/3D max pooling & average pooling (including backprop)
- MLP (linear transform) layer
- simple activation layers (ReLU, LeakyReLU, softmax)
- cross-entropy loss, Dice loss, mean squared error (MSE) loss
- optimized convolution operations (AVX+OpenMP) and high performance matrix multiplication
- a simple builtin tensor library written from scratch
- a simple mechanism to save & load tensor data (serialize & deserialize)
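
As a rough illustration of two of the listed pieces (softmax and cross-entropy loss), here is a minimal standalone C++ sketch. It is not this project's code: the function names `softmax` and `cross_entropy`, the use of `std::vector<float>` instead of the builtin tensor type, and the `eps` clamp are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the library's API): numerically stable softmax
// over a single vector of logits. Subtracting the maximum logit before
// exponentiating avoids overflow without changing the result.
std::vector<float> softmax(const std::vector<float>& logits) {
    assert(!logits.empty());
    const float max_logit = *std::max_element(logits.begin(), logits.end());
    std::vector<float> probs(logits.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - max_logit);
        sum += probs[i];
    }
    for (float& p : probs) {
        p /= sum;  // normalize so the probabilities sum to 1
    }
    return probs;
}

// Hypothetical helper: cross-entropy loss for one sample, given predicted
// class probabilities and the index of the true class.
float cross_entropy(const std::vector<float>& probs, std::size_t target) {
    assert(target < probs.size());
    const float eps = 1e-12f;  // assumed clamp to avoid log(0)
    return -std::log(std::max(probs[target], eps));
}
```

For example, `cross_entropy(softmax({1.0f, 2.0f, 3.0f}), 2)` gives the loss for a sample whose true class is the last one. A batched version would apply the same computation per row and average the results.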