In this repository, we provide a pipeline to extract a comprehensive set of music-specific features from MIDI files. These features succinctly characterize the musical content, encompassing tempo, chord progression, time signature, instrument presence, genre, and mood. We also provide a script to generate captions from your own collection of MIDI files.
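For a flavour of the kind of features involved, here is a minimal sketch of extracting a few of them with pretty_midi (an illustrative choice for this README; the actual extraction lives in pipeline.py and may use different tools):

```python
import pretty_midi

pm = pretty_midi.PrettyMIDI("song.mid")

# Global tempo estimate in BPM.
tempo = pm.estimate_tempo()

# All time signature changes, e.g. [(4, 4)].
time_signatures = [(ts.numerator, ts.denominator) for ts in pm.time_signature_changes]

# Names of the (non-drum) instruments present.
instruments = [
    pretty_midi.program_to_instrument_name(inst.program)
    for inst in pm.instruments
    if not inst.is_drum
]

print(tempo, time_signatures, instruments)
```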
To directly download the MidiCaps dataset, please visit our Hugging Face dataset page: https://huggingface.co/datasets/amaai-lab/MidiCaps.
The code below will help you extract captions from your own collection of MIDI files, following the framework described in our paper.
git clone https://github.com/AMAAI-Lab/MidiCaps.git
cd MidiCaps
conda create -n midicaps python=3.9
conda activate midicaps
pip install -r requirements.txt
python pipeline.py --config config.cfg
You will also need to download the models we use for genre and mood extraction (their paths are indicated in config.cfg), available at the following links (a minimal loading sketch follows the list):
- genre model and metadata : https://essentia.upf.edu/models/classification-heads/mtg_jamendo_genre/
- mood model and metadata : https://essentia.upf.edu/models/classification-heads/mtg_jamendo_moodtheme/
- emb model : https://essentia.upf.edu/models/music-style-classification/discogs-effnet/
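For reference, here is a minimal sketch of how these models are typically loaded and chained with Essentia; it mirrors the usage documented on the Essentia model pages, but the graph file names and output node are assumptions, and the pipeline's actual code may differ:

```python
from essentia.standard import MonoLoader, TensorflowPredictEffnetDiscogs, TensorflowPredict2D

# Load audio at 16 kHz, as expected by the Discogs-EffNet embedding model.
audio = MonoLoader(filename="rendered.wav", sampleRate=16000, resampleQuality=4)()

# Compute embeddings with the Discogs-EffNet backbone (the emb model above).
embedding_model = TensorflowPredictEffnetDiscogs(
    graphFilename="discogs-effnet-bs64-1.pb", output="PartitionedCall:1"
)
embeddings = embedding_model(audio)

# Feed the embeddings to a classification head, e.g. the genre model.
genre_model = TensorflowPredict2D(graphFilename="mtg_jamendo_genre-discogs-effnet-1.pb")
activations = genre_model(embeddings)  # per-frame class activations
```

The mood model is applied the same way, swapping in the mtg_jamendo_moodtheme graph; the class labels come from the accompanying metadata files.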
Also, you will need to download FluidR3_GM.sf2 from https://keymusician01.s3.amazonaws.com/FluidR3_GM.zip and update the .sf2 file path referenced on line 35 accordingly.
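The soundfont is presumably used to render MIDI to audio before the Essentia models are applied. A minimal sketch of that rendering step, assuming the midi2audio wrapper around FluidSynth (the pipeline may invoke fluidsynth differently):

```python
from midi2audio import FluidSynth

# Render a MIDI file to a wav file using the FluidR3_GM soundfont.
fs = FluidSynth(sound_font="FluidR3_GM.sf2", sample_rate=16000)
fs.midi_to_audio("song.mid", "rendered.wav")
```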
The output of this step is all_files_output.json. From this we generate test.json to do in-context learning for Claude 3. We provide a sample test.json and a basic script to run Claude 3. You will need to add your Claude 3 API key as the environment variable ANTHROPIC_API_KEY:
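To inspect the extracted features, you can load the JSON directly. A minimal sketch, assuming the top level is a list with one record per MIDI file (the field names below are illustrative placeholders, not the exact schema produced by pipeline.py):

```python
import json

# Load the features extracted by pipeline.py.
with open("all_files_output.json") as f:
    records = json.load(f)

# Peek at a few records; keys such as "genre", "mood", and "tempo" are
# placeholders for the features described in the paper.
for rec in records[:3]:
    print({k: rec.get(k) for k in ("genre", "mood", "tempo")})
```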
export ANTHROPIC_API_KEY=<your claude 3 key>
python caption_claude.py
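For reference, the core of such a script is a single call to the anthropic SDK. A minimal sketch, assuming the in-context examples from test.json have already been assembled into a prompt string (the model id and prompt construction are illustrative; see caption_claude.py for the actual implementation):

```python
import anthropic

# The client picks up ANTHROPIC_API_KEY from the environment.
client = anthropic.Anthropic()

# `prompt` would hold the in-context examples plus the features of the
# MIDI file to caption; shown here as a placeholder.
prompt = "Given these musical features, write a one-sentence caption: ..."

message = client.messages.create(
    model="claude-3-opus-20240229",  # assumption: any Claude 3 model id works
    max_tokens=300,
    messages=[{"role": "user", "content": prompt}],
)
print(message.content[0].text)
```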
Please change line 59 in caption_claude.py to your preferred location.
If you use MidiCaps or code from this repo, please cite our paper:
@article{Melechovsky2024,
  author  = {Jan Melechovsky and Abhinaba Roy and Dorien Herremans},
  title   = {MidiCaps: A Large-scale MIDI Dataset with Text Captions},
  year    = {2024},
  journal = {arXiv:2406.02255}
}