A simple PyTorch reimplementation of the python_speech_features library.
- Great for interpretability experiments - All audio processing operations are differentiable, so results can be backpropagated to the original signal tensor.
- Supports Hybrid Model Design - Parametric operations at different stages of audio processing.
Install from PyPI
pip install pytorch-speech-features
Install from GitHub
git clone https://github.com/Debjoy10/pytorch_speech_features
cd pytorch_speech_features
python setup.py develop
The functions are the same as in python_speech_features (refer to its documentation).
Instead of passing the input signal as a list or NumPy array, pass a torch tensor (both 'cpu' and 'cuda' devices are supported).
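A minimal sketch of the tensor-in, gradient-out pattern described here. `signal_gradient` is a hypothetical helper and `toy_energy` is a stand-in feature function (not part of this package) so that the snippet runs on its own; the only assumption about the library is that its feature functions accept `(signal, samplerate)` the way python_speech_features does.

```python
import torch

def signal_gradient(feature_fn, signal, samplerate=16000):
    """Hypothetical helper: run feature_fn on a gradient-tracking copy of
    `signal` and return (features, d(sum of features)/d(signal))."""
    sig = signal.clone().detach().requires_grad_(True)
    feats = feature_fn(sig, samplerate)
    feats.sum().backward()  # backpropagate to the raw signal
    return feats.detach(), sig.grad

def toy_energy(signal, samplerate=16000):
    """Stand-in feature (an assumption for illustration only):
    energy of non-overlapping 10 ms frames."""
    frame_len = samplerate // 100
    n_frames = signal.shape[0] // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    return (frames ** 2).sum(dim=1)

sig = torch.randn(16000)  # one second of noise at 16 kHz
feats, grad = signal_gradient(toy_energy, sig)
```

With the package installed, the stand-in would presumably be swapped for one of its feature functions, for example `signal_gradient(mfcc, sig, rate)`; the import path for `mfcc` is assumed to match the package name and is not shown here.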
See example use given above.
Supported features:
- Mel Frequency Cepstral Coefficients
- Filterbank Energies
- Log Filterbank Energies
- Spectral Subband Centroids
Two things to test for pytorch_speech_features operations -
- Similarity to python_speech_features outputs.
- Gradient correctness via Autograd Gradcheck.
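The two checks listed above could be sketched as follows. This is an illustration under stated assumptions, not the project's actual test suite: `preemphasis_np` and `preemphasis_torch` are hypothetical stand-ins written for this example (a pre-emphasis filter of the form y[n] = x[n] - coeff * x[n-1]), used so that the snippet runs without either library installed.

```python
import numpy as np
import torch

def preemphasis_np(signal, coeff=0.95):
    # NumPy reference: y[0] = x[0], y[n] = x[n] - coeff * x[n-1]
    return np.append(signal[0], signal[1:] - coeff * signal[:-1])

def preemphasis_torch(signal, coeff=0.95):
    # Same filter on a 1-D torch tensor, using only differentiable ops
    return torch.cat((signal[:1], signal[1:] - coeff * signal[:-1]))

x = np.random.default_rng(0).standard_normal(64)

# Check 1: the torch output matches the NumPy reference.
out_np = preemphasis_np(x)
out_torch = preemphasis_torch(torch.tensor(x, dtype=torch.float64)).numpy()
similar = np.allclose(out_np, out_torch)

# Check 2: analytical gradients agree with finite differences.
# gradcheck expects double-precision inputs with requires_grad=True.
x_t = torch.tensor(x, dtype=torch.float64, requires_grad=True)
grad_ok = torch.autograd.gradcheck(preemphasis_torch, (x_t,), eps=1e-6, atol=1e-4)
```

The same pattern would apply to the real feature functions: compare against the python_speech_features output with `np.allclose`, then pass a short double-precision signal to `torch.autograd.gradcheck`. Short inputs keep the finite-difference Jacobian cheap to compute.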
TODO