Deep Music Classifier

Note

I have not yet run a complete training procedure. Once I have run and verified one, the plan is to use this classifier for music generation conditioned on genre, which, to the best of my knowledge, has not been done yet.

If you need assistance running the project or have a question, please email me at [email protected]

About

The ideal goal of this project is to be able to say "This part of the song has elements of jazz, progressive rock and a bit of grunge." This could be achieved by framing the problem as multi-output classification.

The deep model is based on Convolutional Recurrent Neural Networks for Music Classification (Keunwoo Choi, George Fazekas, Mark Sandler, Kyunghyun Cho; Dec 2016) [1], i.e. it uses a convolutional recurrent neural network (CRNN) for a multi-output classification task: tagging each music piece with a subset of labels.
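
To make the multi-output framing concrete, here is a minimal sketch of how per-genre scores could be turned into a subset of tags. It assumes one independent sigmoid output per genre and a 0.5 threshold; the genre list, the scores and the `decode_tags` helper are hypothetical and not taken from this project's code.

```python
import numpy as np

def sigmoid(logits):
    """Map raw scores to independent per-label probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-logits))

def decode_tags(logits, labels, threshold=0.5):
    """Return every label whose probability reaches the threshold."""
    probs = sigmoid(np.asarray(logits, dtype=float))
    return [label for label, p in zip(labels, probs) if p >= threshold]

# Hypothetical label set and raw model scores for one song segment.
GENRES = ["jazz", "progressive rock", "grunge", "hip-hop"]
segment_logits = [2.0, 1.1, 0.3, -3.0]

tags = decode_tags(segment_logits, GENRES)
print(tags)
```

Unlike a softmax output, the per-label probabilities here do not have to sum to one, so several genres can be active for the same segment.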

Prerequisite

To be able to run all parts of this project, you will need the following additional Python packages (Python 3.6 is recommended):

keras - build and train the high-level model
librosa - extract mel-spectrograms
pandas - analyze FMA metadata
numpy - efficiently work with linear algebra operations
tensorflow (GPU recommended) - modify keras backend
matplotlib - plot various graphs and use it to extract librosa spectrograms

Input features

Mel-spectrograms are extracted from .mp3s and used as model inputs. An example of such a spectrogram is:

However, the images fed to the model are generated a bit differently: the matrix of spectrogram values is dumped into a grayscale image. The information is preserved this way, and the convolution gets one input channel instead of three. An example of such an image is:

Other spectrogram types could also be used; they are described and compared in detail in [5]. Besides mel-spectrograms, this work will also test raw audio input [6].
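
The grayscale dump described above can be sketched in plain numpy. This is an illustration under assumptions, not the project's actual code: the `spectrogram_to_grayscale` helper is hypothetical, the input is random data standing in for a dB-scaled mel-spectrogram, and the 96 x 1366 shape is only an example size. Note that rounding to 8 bits keeps the structure of the spectrogram but quantizes its values.

```python
import numpy as np

def spectrogram_to_grayscale(spec_db):
    """Linearly rescale a 2-D spectrogram to uint8 pixel values in [0, 255]."""
    spec_db = np.asarray(spec_db, dtype=np.float64)
    lo, hi = spec_db.min(), spec_db.max()
    if hi == lo:
        # Constant input (e.g. silence): avoid division by zero.
        return np.zeros(spec_db.shape, dtype=np.uint8)
    scaled = (spec_db - lo) / (hi - lo)
    return np.round(scaled * 255).astype(np.uint8)

# Stand-in for a dB-scaled mel-spectrogram (mel bands x time frames).
rng = np.random.default_rng(0)
fake_spec_db = rng.uniform(-80.0, 0.0, size=(96, 1366))

image = spectrogram_to_grayscale(fake_spec_db)

# Add a trailing channel axis so the array has the (height, width, 1)
# shape of a single-channel convolutional input.
model_input = image[..., np.newaxis]
print(image.dtype, model_input.shape)
```

Keeping a single channel means the first convolutional layer sees one input plane instead of the three that an RGB rendering of the same spectrogram would produce.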

Data

The project uses the FMA dataset (A Dataset For Music Analysis) [2]. It is a collection of freely available MP3s (under Creative Commons licenses), most convenient for research projects and (currently) the only publicly available music dataset of its kind. The distribution of the top 16 genres is shown in the following histogram:
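
The counting behind such a genre histogram can be sketched with the standard library alone. The `top_genres` helper and the sample labels below are hypothetical; in practice the labels would come from the FMA metadata, for example via pandas.

```python
from collections import Counter

def top_genres(genre_per_track, n=16):
    """Count tracks per genre and return the n most frequent (genre, count) pairs."""
    counts = Counter(g for g in genre_per_track if g)  # skip missing labels
    return counts.most_common(n)

# Hypothetical per-track genre labels; None marks a track without one.
sample = ["Rock", "Electronic", "Rock", None, "Jazz", "Rock", "Electronic"]

ranking = top_genres(sample, n=2)
print(ranking)
```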

Usage

Take a look at and download the FMA dataset metadata (342 MiB). For more details, check this repo.
Then download the small or medium version; try the smaller versions first to set things up and then switch to large. I won't use the full version because its input images would have various sizes and it is too large for my computing resources anyway; besides, I believe there is more than enough information in the 30 s trimmed tracks.
Extract mel-spectrograms from the MP3s by running the corresponding script as the main module.
Generate the relevant metadata by running the corresponding script as the main module.
Run the main script to build, compile and train a keras model (the CRNN architecture mentioned above).
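
As an illustration of the metadata step above, here is a minimal sketch of a reproducible train/valid/test split over track IDs. It is an assumption about how such a split could be done, not a description of this project's script: the `split_track_ids` helper, the 80/10/10 ratio and the seed are all hypothetical.

```python
import random

def split_track_ids(track_ids, valid_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle IDs reproducibly and cut them into train/valid/test lists."""
    ids = sorted(track_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split stable
    n_test = int(len(ids) * test_frac)
    n_valid = int(len(ids) * valid_frac)
    return {
        "test": ids[:n_test],
        "valid": ids[n_test:n_test + n_valid],
        "train": ids[n_test + n_valid:],
    }

# Hypothetical track IDs 0..99 standing in for real FMA track IDs.
splits = split_track_ids(range(100))
print({name: len(ids) for name, ids in splits.items()})
```

Each of the three lists could then be written to its own CSV file, so that every training run sees the same partition.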

Project structure:

data/
- fma_{size}/
  - 000/
  - 001/
- fma_metadata/
  - genres.csv
  - tracks.csv
in/
- mel-specs/
  - 000/
  - 001/
- metadata/
  - test.csv
  - train.csv
  - valid.csv
out/
- graphs/
- logs/
src/
- main.py
- mel-spec.py
- metadata.py
- model.py
- utility.py
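
The numbered sub-folders in the layout above can be derived from a track ID. The sketch below assumes the FMA convention of six-digit, zero-padded file names grouped by their first three digits; the `track_paths` helper and the `.png` extension for spectrogram images are hypothetical.

```python
from pathlib import Path

def track_paths(track_id, size="small", data_dir="data", spec_dir="in/mel-specs"):
    """Return (mp3_path, spectrogram_path) for one track ID."""
    tid = f"{track_id:06d}"   # e.g. 2 -> "000002"
    bucket = tid[:3]          # sub-folder name, e.g. "000"
    mp3 = Path(data_dir) / f"fma_{size}" / bucket / f"{tid}.mp3"
    spec = Path(spec_dir) / bucket / f"{tid}.png"  # .png is an assumption
    return mp3, spec

mp3_path, spec_path = track_paths(1042)
print(mp3_path.as_posix())
print(spec_path.as_posix())
```

Mirroring the sub-folder names between data/ and in/mel-specs/ keeps each spectrogram easy to trace back to its source track.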

Results

I have not run the whole training process yet.

CrowdAI competition (music genre classification - 16 classes)

The source code for this project also contains a separate folder for the CrowdAI competition. The main focus of this project over the next 60 days will be gaining a better position on the leaderboard.

Relevant literature

[1] CRNN for Music Classification

[2] FMA: A Dataset For Music Analysis

[3] Music Information Retrieval (origin of "MIR", Downie)

[4] A Tutorial on Deep Learning for Music Information Retrieval

[5] Comparison on Audio Signal Preprocessing Methods for Deep Neural Networks on Music Tagging

[6] End-to-end learning for music audio tagging at scale (1D convolution)

For broader references on music information retrieval, check https://github.com/ybayle/awesome-deep-learning-music.
