DepSemo

Audio sentiment analysis deep learning tool

A research tool that lets anyone build, train, test, and analyze deep learning models on audio data for emotion classification.

**This work was published in the journal Veri Bilimi (Data Science). Link**

Features

  • Download and unpack audio emotion datasets.
  • Automatically parse dataset labels to create metadata.
  • Extract and save audio features such as MFCC.
  • Create and save deep learning models for audio sentiment analysis and audio emotion classification.
  • Monitor training live with TensorBoard.
  • Test models before use.

Installation

git clone https://github.com/COMUProjectTeam/audio-sentiment-analysis-deep-learning-tool

Then change to the project directory:

cd "DepSemo root directory path"

Then install the required packages with pip:

pip install -r requirements.txt
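
After installing, a quick check that the main dependencies are importable can save a confusing startup error. The sketch below assumes the import names `flask`, `tensorflow`, `librosa`, `pandas` and `numpy`, inferred from the frameworks this README lists; `requirements.txt` remains the authoritative list, and `missing_packages` is a hypothetical helper, not part of DepSemo.

```python
from importlib.util import find_spec

# Import names are assumptions based on the frameworks named in this README;
# requirements.txt is the authoritative source.
ASSUMED_PACKAGES = ["flask", "tensorflow", "librosa", "pandas", "numpy"]


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(ASSUMED_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All assumed dependencies are importable.")
```

If the script reports missing packages, rerunning the pip command inside the same Python environment is usually the first thing to try.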

To start the program, run app.py:

python3 app.py

Use python app.py instead if python3 is not available on your system.

How to Use

After starting app.py from your terminal, open http://127.0.0.1:5000/ in your browser. If everything went well, you should see the DepSemo main page. You can navigate between modules using the toolbar on the left.

Remember that you must download a dataset before creating metadata, and train a model before testing one, so work through the modules in order to build a classifier.







Known Datasets

The public datasets below are mirrored in an AWS S3 bucket so that downloads are fast and the links are known to work.

| Dataset | Official Page |
|---|---|
| RAVDESS [1] | Link |
| SAVEE [2] | Link |
| Emo-DB [3] | Link |
| CREMA-D [4] | Link |
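
To illustrate what decomposing dataset labels into metadata can look like, here is a minimal sketch for RAVDESS, whose file names encode seven hyphen-separated fields (modality, vocal channel, emotion, intensity, statement, repetition, actor). The function name `parse_ravdess_name` and the returned dictionary layout are hypothetical; DepSemo's own metadata code may be organized differently, and the other datasets use different naming schemes.

```python
from pathlib import Path

# Emotion codes used in the third field of RAVDESS file names.
RAVDESS_EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}


def parse_ravdess_name(path):
    """Split a RAVDESS file name into its seven fields and return a metadata dict."""
    fields = Path(path).stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"not a RAVDESS-style file name: {path}")
    modality, channel, emotion, intensity, statement, repetition, actor = fields
    return {
        "file": Path(path).name,
        "emotion": RAVDESS_EMOTIONS[emotion],
        "intensity": "strong" if intensity == "02" else "normal",
        "actor": int(actor),
        # In RAVDESS, odd-numbered actors are male and even-numbered are female.
        "gender": "male" if int(actor) % 2 else "female",
    }


print(parse_ravdess_name("Actor_12/03-01-06-01-02-01-12.wav"))
```

Collecting one such dictionary per file gives a table of labels that can be saved alongside the extracted features.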

Backends

Several tools and frameworks were used to build this project.

| Framework & Tool | Used For |
|---|---|
| Flask | Web backend |
| Keras | Deep learning API with TensorFlow backend |
| Librosa | Audio feature extraction and data augmentation |
| TensorBoard | Live training feedback GUI |
| SQLite | Local database |
| AWS S3 | Cloud database |
| Pandas & NumPy | General-purpose data handling |

System Architecture

Contributors

License and Citation

  • Under MIT license.

Screenshots

On the feature extraction page, you can extract the desired audio features from each audio file.


On the training page, you can set the validation split, batch size, number of epochs, and other training options.
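
To make these settings concrete, the sketch below works out how a validation split, batch size, and epoch count translate into sample counts and update steps. It assumes a simple holdout split in which the given fraction of samples is set aside for validation; `training_plan` and the sample count of 1000 are hypothetical and chosen only for illustration.

```python
import math


def training_plan(n_samples, validation_split, batch_size, epochs):
    """Work out how a holdout split and batch size translate into update steps."""
    n_train = math.floor(n_samples * (1.0 - validation_split))
    n_val = n_samples - n_train
    # The last batch of an epoch may be smaller, hence the ceiling.
    steps_per_epoch = math.ceil(n_train / batch_size)
    return {
        "train_samples": n_train,
        "validation_samples": n_val,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": steps_per_epoch * epochs,
    }


plan = training_plan(n_samples=1000, validation_split=0.2, batch_size=32, epochs=50)
print(plan)
```

With these example values, 800 samples are used for training and 200 for validation, giving 25 steps per epoch and 1250 steps in total.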

Dataset citations

[1]     Livingstone SR, Russo FA. “The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English”. PLoS ONE, 13(5), e0196391, 2018.

[2]     Haq S, Jackson PJB. “Speaker-Dependent Audio-Visual Emotion Recognition (SAVEE)”. AVSP, 53-58, 2009.

[3]     Burkhardt F, Paeschke A, Rolfes M, Sendlmeier WF, Weiss B. “A database of German emotional speech”. 9th European Conference on Speech Communication and Technology, 2005.

[4]     Cao H, Cooper DG, Keutmann MK, Gur RC, Nenkova A, Verma R. “CREMA-D: Crowd-sourced Emotional Multimodal Actors Dataset”. IEEE Transactions on Affective Computing, 5(4), 377-390, 2014.