pytorch_speech_features

A simple PyTorch reimplementation of the python_speech_features library.

Uses

  • Great for interpretability experiments - All audio processing operations are differentiable, so results can be backpropagated to the original signal tensor.
  • Supports hybrid model design - Parametric (learnable) operations can be placed at different stages of audio processing.

Example use

Installation

Install from PyPI:
pip install pytorch-speech-features
Install from GitHub:
git clone https://github.com/Debjoy10/pytorch_speech_features
cd pytorch_speech_features
python setup.py develop

Usage

The functions are the same as in python_speech_features (refer to its documentation here).

Instead of passing the input signal as a list or NumPy array, pass a torch tensor (both 'cpu' and 'cuda' devices are supported).
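A minimal sketch of that calling pattern follows. It assumes the package is importable as `pytorch_speech_features` and that `mfcc(signal, samplerate)` takes the same positional arguments as in python_speech_features; neither is confirmed here beyond the statement above. The helper names `make_test_tone` and `mfcc_saliency` are hypothetical and exist only for illustration.

```python
import math

import torch


def make_test_tone(samplerate=16000, seconds=1.0, freq=440.0):
    """Build a synthetic sine wave as a 1-D tensor that tracks gradients."""
    t = torch.arange(int(samplerate * seconds), dtype=torch.float32) / samplerate
    signal = torch.sin(2 * math.pi * freq * t)
    return signal.requires_grad_(True)


def mfcc_saliency(signal, samplerate=16000):
    """Return MFCC features and the gradient of their sum w.r.t. the signal."""
    # Assumed import path, mirroring `from python_speech_features import mfcc`.
    from pytorch_speech_features import mfcc

    feats = mfcc(signal, samplerate)  # assumed: same arguments as python_speech_features
    feats.sum().backward()            # backpropagate to the input samples
    return feats, signal.grad
```

With the library installed, `mfcc_saliency(make_test_tone())` would be expected to return the feature tensor together with a gradient tensor of the same shape as the input signal. To run on a GPU, move the signal with `.to('cuda')` before enabling gradients.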

See the "Example use" section above.

Supported features:
  • Mel Frequency Cepstral Coefficients
  • Filterbank Energies
  • Log Filterbank Energies
  • Spectral Subband Centroids

Testing

Two properties are tested for pytorch_speech_features operations:

  • Similarity to python_speech_features outputs.
  • Gradient correctness via Autograd Gradcheck.
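The two checks above can be sketched as follows. This is not the project's test notebook: `log_frame_energy` is a hypothetical stand-in feature written in plain PyTorch, and the reference values come from a pure-Python loop, not from python_speech_features. The same pattern should carry over to the library's functions, assuming they accept double-precision tensors.

```python
import math

import torch


def log_frame_energy(signal, frame_len=16, frame_step=8, eps=1e-8):
    """Stand-in feature: log energy of overlapping frames of a 1-D signal."""
    frames = signal.unfold(0, frame_len, frame_step)  # (num_frames, frame_len)
    energy = frames.pow(2).sum(dim=1)
    return torch.log(energy + eps)


torch.manual_seed(0)
# gradcheck compares analytical gradients with finite differences,
# so it needs double-precision inputs that require gradients.
signal = torch.randn(64, dtype=torch.double, requires_grad=True)
grad_ok = torch.autograd.gradcheck(log_frame_energy, (signal,), eps=1e-6, atol=1e-4)

# Similarity check against an independently computed reference.
samples = signal.detach().tolist()
reference = torch.tensor(
    [
        math.log(sum(x * x for x in samples[start:start + 16]) + 1e-8)
        for start in range(0, len(samples) - 16 + 1, 8)
    ],
    dtype=torch.double,
)
matches = torch.allclose(log_frame_energy(signal).detach(), reference)

print("gradcheck passed:", grad_ok)
print("matches reference:", matches)
```

For the library itself, the reference would be the output of the corresponding python_speech_features function on the same signal converted to a NumPy array.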

The testing Python notebook is available here: Open In Colab

Citation

TODO

References

  • Python_speech_features library - Link
  • Sample english.wav - Link