This is a PyTorch implementation of the paper: Enhancing Resilience to Missing Data in Audio-Text Emotion Recognition with Multi-Scale Chunk Regularization. The experiments and trained models are based on the MSP-Podcast v1.10 corpus, using pretrained wav2vec2-large-robust (audio) and RoBERTa-base (text) features as described in the paper.
- Python 3.6+
- Ubuntu 18.04+
- PyTorch version 1.9.0+cu102
- Hugging Face transformers version 4.5.1
- textgrid
- Standard packages such as scipy, numpy, and pandas
- The MSP-Podcast v1.10 corpus (request access for download from the UTD-MSP lab website)
- Place the downloaded MSP-Podcast v1.10 corpus under the same directory as this repository.
- Extract the pretrained deep features using feat_wav2vec2.py (for audio) and feat_roberta.py (for text); the outputs are saved in an automatically created 'Features' folder. A sketch of this step appears after this list.
- [Optional] Run norm_para.py to obtain the z-normalization parameters; the outputs are saved in the created 'NormTerm_Speech' and 'NormTerm_Text' folders (also sketched below).
- Run the main script:
```
python main.py
```
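For reference, the following is a minimal sketch of what the feature-extraction and normalization steps roughly do. It is an illustration under stated assumptions, not the exact behavior of feat_wav2vec2.py, feat_roberta.py, or norm_para.py: the file paths, output file names, and stacking strategy here are hypothetical.

```python
# Sketch of the extraction + z-norm pipeline (paths and file names are
# assumptions, not the exact layout used by the repo scripts).
import os
import numpy as np
import torch
import soundfile as sf
from transformers import (Wav2Vec2FeatureExtractor, Wav2Vec2Model,
                          RobertaTokenizer, RobertaModel)

# Audio features from wav2vec2-large-robust
extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-large-robust")
w2v2 = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-large-robust").eval()
speech, sr = sf.read("Audios/example.wav")  # hypothetical path; 16 kHz mono
inputs = extractor(speech, sampling_rate=sr, return_tensors="pt")
with torch.no_grad():
    audio_feat = w2v2(**inputs).last_hidden_state[0].numpy()  # (frames, 1024)

# Text features from RoBERTa-base
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
roberta = RobertaModel.from_pretrained("roberta-base").eval()
enc = tokenizer("transcript of the utterance", return_tensors="pt")
with torch.no_grad():
    text_feat = roberta(**enc).last_hidden_state[0].numpy()  # (tokens, 768)

os.makedirs("Features", exist_ok=True)
np.save("Features/example_audio.npy", audio_feat)
np.save("Features/example_text.npy", text_feat)

# z-norm parameters over the training set (what norm_para.py would compute);
# in practice, stack the frame-level features from all training files.
train_frames = np.concatenate([audio_feat], axis=0)
mean, std = train_frames.mean(axis=0), train_frames.std(axis=0)
os.makedirs("NormTerm_Speech", exist_ok=True)
np.save("NormTerm_Speech/feat_mean.npy", mean)
np.save("NormTerm_Speech/feat_std.npy", std)
# At load time, features are normalized as: (feat - mean) / std
```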
The testing results (in terms of CCC) of the trained model will be printed directly after training is done.
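For reference, the concordance correlation coefficient (CCC) between predicted and ground-truth emotional attribute scores follows the standard definition below; this standalone function is a reference sketch and not necessarily the exact implementation inside main.py.

```python
import numpy as np

def ccc(y_true, y_pred):
    """Lin's concordance correlation coefficient between two 1-D arrays."""
    mu_t, mu_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    cov = np.mean((y_true - mu_t) * (y_pred - mu_p))
    return (2.0 * cov) / (var_t + var_p + (mu_t - mu_p) ** 2)
```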
If you use this code, please cite the following paper:
Wei-Cheng Lin, Lucas Goncalves and Carlos Busso, "Enhancing Resilience to Missing Data in Audio-Text Emotion Recognition with Multi-Scale Chunk Regularization"
```
@InProceedings{Lin_2023_3,
  author = {W.-C. Lin and L. Goncalves and C. Busso},
  title = {Enhancing Resilience to Missing Data in Audio-Text Emotion Recognition with Multi-Scale Chunk Regularization},
  booktitle = {ACM International Conference on Multimodal Interaction (ICMI 2023)},
  volume = {To appear},
  year = {2023},
  month = {October},
  address = {Paris, France},
  pages = {},
  doi = {},
}
```