# Multimodal Research in Automatic Persuasion Recognition
This repository presents a multimodal approach to analyzing and interpreting persuasive communication using machine learning techniques.
In this work, we explore the Persuasive Opinion Multimedia (POM) dataset and test two hypotheses:
- Speaker persuasion can be predicted from text and audio features using supervised machine learning methods.
- Bi-modal classification yields higher prediction accuracy than uni-modal classification for the text and audio modalities.
Our approach fuses the text and audio modalities into a single bi-modal feature set and applies a Random Forest classifier to predict speaker persuasion. We evaluate the model's accuracy and consider how the research design affects the generalizability of the results.
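As a minimal illustration of this pipeline, the sketch below trains a scikit-learn Random Forest on concatenated features. The matrices `X_text` and `X_audio` and the labels `y` are random placeholders standing in for the real POM features produced by the preprocessing scripts in `code/`, and the per-sample concatenation shown here is one common reading of the fusion step:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Random placeholders for pre-extracted POM features; substitute the real
# text and audio feature matrices produced by the preprocessing scripts.
rng = np.random.default_rng(0)
X_text = rng.normal(size=(300, 50))   # e.g. transcript/lexical features
X_audio = rng.normal(size=(300, 40))  # e.g. acoustic/prosodic features
y = rng.integers(0, 2, size=300)      # persuasive vs. non-persuasive labels

def evaluate(X, y, label):
    """Train a Random Forest on one feature set and report held-out accuracy."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )
    clf = RandomForestClassifier(n_estimators=100, random_state=42)
    clf.fit(X_tr, y_tr)
    print(f"{label}: {accuracy_score(y_te, clf.predict(X_te)):.4f}")

# Early fusion: concatenate each sample's text and audio feature vectors.
evaluate(np.hstack([X_text, X_audio]), y, "bi-modal (text + audio)")
```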
Combining the text and audio modalities, the bi-modal Random Forest classifier achieves an accuracy of 71.68%. The chosen research design, however, carries limitations that constrain the generalizability of these results.
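To probe the second hypothesis, the same classifier can be trained on each modality in isolation and compared against the fused run. This continuation reuses the placeholder data and the `evaluate` helper from the sketch above:

```python
# Uni-modal baselines for comparison with the bi-modal result above.
evaluate(X_text, y, "text only")
evaluate(X_audio, y, "audio only")
```

With the actual POM features, the bi-modal configuration is the one reported to reach 71.68% accuracy.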
The repository is organized as follows:
- `data/`: the POM dataset used for analysis.
- `code/`: Python scripts for data preprocessing, model training, and evaluation.
- `models/`: trained models saved in the appropriate format.
- `results/`: evaluation results and performance metrics of the models.
- Clone this repository:

  ```bash
  git clone https://github.com/your-username/persuasive-communication-analysis.git
  ```