This project is a semester project for the NI-VMM course at FIT CTU. It is a Shazam-like app that returns a set of similar audio tracks for an audio query.
- server - web server
  - core - core module with different engine types - MFCC, Chromaprint with bit error similarity, and Chromaprint with cross-correlation similarity (a rough sketch of the bit error similarity idea follows this list)
  - app.py - Flask server wrapper exposing the `/search` and `/audiotracks/{filename}` endpoints.
  - db.py - database construction script.
  - get_engine.py - function to initialize the correct engine type. It has an `-e` parameter to set the engine type (`mfcc`, `chromaprint`, or `chromaprint_cc`) and a `-c` parameter which erases the database before constructing it.
- client - client web React app
- Jupyter notebook with prototypes
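The "bit error" similarity mentioned above can be thought of as the fraction of bits that two Chromaprint fingerprints have in common. The following is only a minimal illustration of that idea, not the project's actual implementation; it assumes both fingerprints are equal-length lists of 32-bit integers and ignores alignment of tracks with different lengths:

```python
def bit_error_similarity(fp_a, fp_b):
    """Fraction of identical bits between two raw Chromaprint fingerprints.

    Illustrative sketch only: assumes fp_a and fp_b are lists of 32-bit
    integer frames and compares them position by position, without
    handling offsets or different track lengths.
    """
    n = min(len(fp_a), len(fp_b))
    if n == 0:
        return 0.0
    # XOR exposes differing bits; popcount them and normalize.
    diff_bits = sum(bin(a ^ b).count("1") for a, b in zip(fp_a[:n], fp_b[:n]))
    return 1.0 - diff_bits / (32 * n)
```

The cross-correlation variant follows the same spirit but additionally slides one fingerprint over the other to find the best-matching offset before scoring.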
The project consists of a Python Flask server, a MongoDB database, and a React web client. Everything runs in Docker containers, so Docker needs to be installed on the target system.
- Add a `.env` file with all required environment variables to the root of the project. A sample can be found in `.env.example`.
- Make sure you have a `data` folder with audio files (mp3, wav, ...) created locally.
- Run `make run`, which will build and start all Docker containers.
- Run `make db` to construct the reference database. It will take all audio files from the `data` folder (a rough sketch of what this step does conceptually is shown after this list).
- (optional) If you want to also use `MFCCEngine`, run `make db_mfcc`. Note that it takes much longer than `make db`.
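Conceptually, the database construction step fingerprints every audio file in `data` and stores the result in MongoDB. The snippet below is only a rough sketch of that idea, not the actual `db.py`; the use of the `pyacoustid` package, the database and collection names, and the document fields are all assumptions made for illustration:

```python
# Sketch of the reference-database construction idea (not the actual db.py).
# Assumes pyacoustid for Chromaprint fingerprints and pymongo for storage;
# database, collection, and field names are illustrative only.
from pathlib import Path

import acoustid                      # pip install pyacoustid (needs fpcalc/chromaprint)
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
tracks = client["audio_search"]["tracks"]

for path in Path("data").iterdir():
    if path.suffix.lower() not in {".mp3", ".wav"}:
        continue
    # Returns (duration in seconds, compressed Chromaprint fingerprint).
    duration, fingerprint = acoustid.fingerprint_file(str(path))
    tracks.insert_one({
        "filename": path.name,
        "duration": duration,
        "fingerprint": fingerprint,
    })
```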
This will start the client web app at `localhost:3000` and the server at `localhost:5000`. You can record audio or upload an audio file in the client web app to search for similar tracks in the reference dataset.
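If you prefer to query the server directly instead of going through the web client, a request against the `/search` endpoint could look like the sketch below. The multipart field name and the response format are assumptions, so check `app.py` for the exact request contract:

```python
# Sketch of querying the search endpoint directly; the field name "file"
# and the JSON response shape are assumptions, see app.py for details.
import requests

with open("query.mp3", "rb") as f:
    response = requests.post(
        "http://localhost:5000/search",
        files={"file": ("query.mp3", f, "audio/mpeg")},
    )
response.raise_for_status()
print(response.json())  # expected: similar tracks from the reference dataset
```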
Other useful commands can be found in the Makefile.