
TensorFlow implementation of a VAE for encoding spectrograms


naotokui/SpectrogramVAE

 
 


Spectrogram VAE

TensorFlow implementation of a Variational Autoencoder with Inverse Autoregressive Flows for encoding spectrograms.

This is the main model I used for my NeuralFunk project.

This code was not really intended to be shared and is quite messy. I might improve it at some point in the future, but for now be aware that everything is quite hacky and badly documented.

Acknowledgments

Overview

Some random experiments, as well as the creation of the dataset for the VAE, can be found in Preprocessing and Experiments.ipynb.

The dataset pickle file has to be a dictionary of the form

{
  'filenames' : list_of_filenames,
  'melspecs' : list_of_spectrogram_arrays,
  'actual_lengths' : list_of_audio_len_in_sec
}

and be stored as dataset.pkl in the root directory.
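As a rough illustration of the layout described above, the following sketch bundles three parallel lists into that dictionary and pickles it. The helper names (`build_dataset`, `save_dataset`, `load_dataset`), the example filenames, and the spectrogram shape are assumptions made for this example only; the actual dataset is created in the notebook, and its array shapes may differ.

```python
import pickle

import numpy as np


def build_dataset(filenames, melspecs, actual_lengths):
    """Bundle three parallel lists into the dictionary layout shown above."""
    if not (len(filenames) == len(melspecs) == len(actual_lengths)):
        raise ValueError("filenames, melspecs and actual_lengths must have equal length")
    return {
        'filenames': list(filenames),
        'melspecs': list(melspecs),
        'actual_lengths': list(actual_lengths),
    }


def save_dataset(dataset, path='dataset.pkl'):
    """Write the dictionary to disk as a pickle file."""
    with open(path, 'wb') as f:
        pickle.dump(dataset, f)


def load_dataset(path='dataset.pkl'):
    """Read the dictionary back from disk."""
    with open(path, 'rb') as f:
        return pickle.load(f)


if __name__ == '__main__':
    # Placeholder data: random arrays stand in for real mel spectrograms.
    # The (128, 64) shape is hypothetical, not taken from the repository.
    rng = np.random.default_rng(0)
    names = ['kick.wav', 'snare.wav']
    specs = [rng.random((128, 64)).astype(np.float32) for _ in names]
    lengths_sec = [0.5, 0.8]
    save_dataset(build_dataset(names, specs, lengths_sec))
```

Checking that the three lists have equal length before pickling catches the most likely mistake early, since each index is presumably meant to describe one audio file.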

Training the VAE

python train.py

Generating samples

Samples can be generated based on:

  • Sampling from latent space: python generate.py
  • Single input file: python generate.py --file_in filename
  • Multiple input files: python generate.py --file_in list_of_filenames

Encode audio

  • Single file: python encode_and_reconstruct.py --audio_file filename
  • Full dataset: python encode_and_reconstruct.py --encode_full true

Finding similar files

python find_similar.py --target target_audio_file --sample_dirs list_of_dirs_to_search
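How find_similar.py compares files is not described here, so the following is only a generic sketch of similarity search over embedding vectors, assuming each audio file has already been encoded to a fixed-length latent vector and that cosine similarity is the metric. The function names and the choice of metric are assumptions for illustration, not the script's actual behavior.

```python
import numpy as np


def cosine_similarity(target, candidates):
    """Cosine similarity between one vector and each row of a matrix."""
    target = np.asarray(target, dtype=np.float64)
    candidates = np.asarray(candidates, dtype=np.float64)
    eps = 1e-12  # guards against division by zero for all-zero vectors
    t = target / (np.linalg.norm(target) + eps)
    c = candidates / (np.linalg.norm(candidates, axis=1, keepdims=True) + eps)
    return c @ t


def most_similar(target, candidates, names, top_k=5):
    """Return the top_k (name, score) pairs, highest similarity first."""
    scores = cosine_similarity(target, candidates)
    order = np.argsort(-scores)[:top_k]
    return [(names[i], float(scores[i])) for i in order]


if __name__ == '__main__':
    # Hypothetical 2-D embeddings; real latent vectors would be longer.
    target_vec = [1.0, 0.0]
    candidate_vecs = [[0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
    candidate_names = ['a.wav', 'b.wav', 'c.wav']
    for name, score in most_similar(target_vec, candidate_vecs, candidate_names, top_k=2):
        print(name, round(score, 3))
```

Cosine similarity ignores vector magnitude; Euclidean distance would be an equally reasonable choice if the scale of the embeddings matters.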

All of the above scripts have other options and uses as well; look into the code for more details.
