- Description and Installation
- Scale-Space Blob Detector
- Melanoma Segmentation
- Melanoma Classification with CNNs
- Object Detection with Faster-RCNN
- Feature Selection for Audio Classification
- Audio Speech Recognition with DeepSpeech2
Audio processing, Video processing and Computer Vision Laboratories (UC3M - C2.350.16508).
Create a Python 3.6 virtual environment and run the following command:
pip install -r requirements.txt
Alternatively, install only the requirements of a specific project by pointing to its directory:
pip install -r <PROJECT NAME>/requirements.txt
PyTorch with CUDA 11.3 support can be installed with either pip or conda:
PIP ENVIRONMENT
pip3 install torch==1.10.0+cu113 torchvision==0.11.1+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
CONDA ENVIRONMENT
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
Scale-space blob detector based on the Laplacian of Gaussian (LoG) filter. Full guideline here.
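As an illustration of the idea, the following is a minimal NumPy-only sketch of a scale-space LoG detector, not the lab's implementation. The function names (`log_kernel`, `filter2d`, `detect_blobs`), the list of scales, and the threshold are assumptions made for this example. Responses are scale-normalized by multiplying the LoG kernel by sigma squared, and blobs are taken as local maxima of the squared response in a 3x3x3 scale-space neighbourhood.

```python
import numpy as np

def log_kernel(sigma):
    """Scale-normalized Laplacian of Gaussian kernel (sigma^2 * LoG)."""
    radius = int(np.ceil(3 * sigma))
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx ** 2 + yy ** 2
    g = np.exp(-r2 / (2 * sigma ** 2))
    g /= g.sum()
    log = g * (r2 - 2 * sigma ** 2) / sigma ** 4
    log -= log.mean()  # force zero response on flat regions
    return sigma ** 2 * log

def filter2d(image, kernel):
    """Same-size 2D filtering with edge padding (the kernel is symmetric)."""
    r = kernel.shape[0] // 2
    padded = np.pad(image, r, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def detect_blobs(image, sigmas, threshold=0.1):
    """Return (row, col, radius) for local maxima of the squared LoG response."""
    stack = np.stack([filter2d(image, log_kernel(s)) ** 2 for s in sigmas])
    n, h, w = stack.shape
    padded = np.pad(stack, 1, mode="constant", constant_values=-np.inf)
    neigh_max = np.full(stack.shape, -np.inf)
    # Maximum over the 3x3x3 neighbourhood in (scale, row, col)
    for ds in range(3):
        for dy in range(3):
            for dx in range(3):
                neigh_max = np.maximum(
                    neigh_max, padded[ds:ds + n, dy:dy + h, dx:dx + w])
    peaks = (stack == neigh_max) & (stack > threshold)
    # For an ideal disk, the response peaks at sigma = radius / sqrt(2)
    return [(int(y), int(x), float(sigmas[int(s)] * np.sqrt(2)))
            for s, y, x in zip(*np.nonzero(peaks))]
```

The explicit filtering loop is slow for large images and is used here only to keep the sketch free of dependencies beyond NumPy; the threshold has to be tuned to the image contrast.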
Pre-processing, segmentation and post-processing for melanoma images using thresholding and clustering techniques. Full guideline here.
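One common thresholding technique is Otsu's method, which picks the gray level that maximizes the between-class variance. The sketch below is a hedged NumPy example and not necessarily the method used in the lab; `otsu_threshold` and `segment_lesion` are hypothetical names, and the assumption that the lesion is darker than the surrounding skin is made only for illustration.

```python
import numpy as np

def otsu_threshold(gray, bins=256):
    """Return the gray level that maximizes the between-class variance."""
    hist, edges = np.histogram(gray, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(hist)              # pixel count at or below each bin
    w1 = w0[-1] - w0                  # pixel count above each bin
    cum_sum = np.cumsum(hist * centers)
    with np.errstate(divide="ignore", invalid="ignore"):
        mu0 = cum_sum / w0
        mu1 = (cum_sum[-1] - cum_sum) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
    between = np.nan_to_num(between)  # empty classes contribute nothing
    return centers[np.argmax(between)]

def segment_lesion(gray):
    """Binary mask of pixels darker than the Otsu threshold.

    Assumption: the lesion is darker than the surrounding skin.
    """
    return gray < otsu_threshold(gray)
```

In practice the mask would usually be cleaned up afterwards, for example with morphological operations, which corresponds to the post-processing step mentioned above.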
Testing of several CNN architectures for melanoma classification into three classes (no melanoma, melanoma, keratosis). Full lab here.
Faster-RCNN implementation for object detection and classification using a subset of the PASCAL VOC 2012 dataset. Full lab here.
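Detectors of this family typically prune overlapping candidate boxes with non-maximum suppression (NMS) based on intersection over union (IoU). The following is a small NumPy sketch of that step, given as an illustration rather than as the lab's code; the `(x1, y1, x2, y2)` box layout, the function names, and the 0.5 threshold are assumptions.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an (N, 4) array of boxes as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best box, drop boxes that overlap it too much."""
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)  # -> [0, 2]
```

The second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the third box does not overlap and is kept.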
Feature extraction and selection for classifying dog and cat audio recordings with an SVM. Full guideline here.
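A simple way to rank features for a two-class problem like this one is the Fisher score, which compares the separation of the class means with the within-class variances. The sketch below is only one possible selection criterion and may differ from the one used in the lab; `fisher_scores` and `select_top_k` are hypothetical names, and labels are assumed to be 0 and 1.

```python
import numpy as np

def fisher_scores(X, y):
    """Per-feature Fisher score for binary labels y in {0, 1}."""
    X0, X1 = X[y == 0], X[y == 1]
    num = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2
    den = X0.var(axis=0) + X1.var(axis=0)
    return num / (den + 1e-12)  # small constant avoids division by zero

def select_top_k(X, y, k):
    """Return the sorted indices of the k features with the highest score."""
    ranked = np.argsort(fisher_scores(X, y))[::-1]
    return np.sort(ranked[:k])
```

The selected columns, `X[:, select_top_k(X, y, k)]`, could then be passed to an SVM classifier.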
Comparison of three speech recognition architectures based on DeepSpeech2 that differ in the implementation of the GRU layer. Full lab here.
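DeepSpeech2 models are trained with a CTC loss, so their frame-level outputs have to be collapsed into text. As a hedged illustration (not the lab's decoder), greedy CTC decoding can be sketched as follows; the function name, the alphabet, and the use of index 0 as the blank symbol are assumptions.

```python
import numpy as np

def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Collapse per-frame predictions: merge repeats, then drop blanks.

    frame_scores: (T, C) array of per-frame class scores or probabilities.
    alphabet: string whose character at position i is the symbol of class i
              (the character at the blank index is a placeholder).
    """
    best = np.argmax(frame_scores, axis=1)
    chars = []
    prev = blank
    for idx in best:
        idx = int(idx)
        if idx != prev and idx != blank:
            chars.append(alphabet[idx])
        prev = idx
    return "".join(chars)

alphabet = "_helo"                           # "_" stands for the CTC blank
frames = [1, 1, 0, 2, 3, 3, 0, 3, 4]         # h h _ e l l _ l o
scores = np.eye(len(alphabet))[frames]       # one-hot scores per frame
text = ctc_greedy_decode(scores, alphabet)   # -> "hello"
```

The blank between the two runs of `l` is what allows the decoder to emit a double letter. A beam search, optionally with a language model, is a common alternative to greedy decoding.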