Date | Announcements |
---|---|
12/2024 | 🎆 🎆 The first platform for multimodal intent analysis has been released. Refer to the directory MMIA for the dataset and codes. |
5/2024 | 🎆 🎆 An unsupervised multimodal clustering method (UMC) has been released. Refer to the paper UMC. |
3/2024 | 🎆 🎆 A token-level contrastive learning method with modality-aware prompting (TCL-MAP) has been released. Refer to the paper TCL-MAP. |
1/2024 | 🎆 🎆 The first large-scale multimodal intent dataset has been released. Refer to the directory MIntRec2.0 for the dataset and codes. Read the paper -- MIntRec2.0: A Large-scale Benchmark Dataset for Multimodal Intent Recognition and Out-of-scope Detection in Conversations (Published in ICLR 2024). |
10/2022 | 🎆 🎆 The first multimodal intent dataset was published. Refer to the directory MIntRec for the dataset and codes. Read the paper -- MIntRec: A New Dataset for Multimodal Intent Recognition (Published in ACM MM 2022). |
MMIA has the following features:

- Large in Scale: It contains 4 datasets in total: MIntRec, MIntRec2.0, MELD-DA, and IEMOCAP-DA.

- Multi-turn & Multi-party Dialogues: For example, MIntRec2.0 contains 1,245 dialogues with an average of 12 utterances per dialogue in continuous conversations. Every utterance in each dialogue has an intent label, and each dialogue has at least two different speakers with annotated speaker identities for every utterance.

- Out-of-distribution Detection: As real-world dialogues occur in open-world scenarios, as suggested in TEXTOIR, we further include an OOD tag for utterances that do not belong to any of the existing intent classes. These can be used for out-of-distribution detection and to improve system robustness.
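As a minimal sketch of how the OOD tag can be used: the snippet below assumes each utterance record carries a label field and uses 'UNK' as the OOD tag, matching the ood_label shown in the dataloader configuration later in this README. The example utterances and the 'Agree' label are purely illustrative.

```python
# illustrative records; 'UNK' marks out-of-distribution utterances
utterances = [
    {'text': 'Sounds great, count me in.', 'label': 'Agree'},
    {'text': 'Did you hear that noise outside?', 'label': 'UNK'},
]

in_scope = [u for u in utterances if u['label'] != 'UNK']
ood = [u for u in utterances if u['label'] == 'UNK']
print(len(in_scope), len(ood))
```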
Here we provide the details of the datasets in MMIA. You can download the datasets from the following links.
Datasets | Source |
---|---|
MIntRec | Paper |
MIntRec2.0 | Paper |
MELD-DA | Paper |
IEMOCAP-DA | Paper |
Here we provide the details of the models in MMIA.
Model Name | Source | Published |
---|---|---|
MULT | Paper / Code | ACL 2019 |
MAG_BERT | Paper / Code | ACL 2020 |
MCN | Paper / Code | CVPR 2020 |
CC | Paper / Code | AAAI 2021 |
MMIM | Paper / Code | EMNLP 2021 |
SCCL | Paper / Code | NAACL 2021 |
USNID | Paper / Code | IEEE TKDE 2023 |
SDIF | Paper / Code | ICASSP 2024 |
TCL_MAP | Paper / Code | AAAI 2024 |
UMC | Paper / Code | ACL 2024 |
Please refer to results for the detailed performance of each model in MMIA.
- Use anaconda to create the Python environment:

      conda create --name MMIA python=3.9
      conda activate MMIA

- Install PyTorch (with CUDA 11.3); a quick sanity check is sketched after this list:

      conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch

- Clone the MMIA repository:

      git clone [email protected]:thuiar/MMIA.git
      cd MMIA

- Install the related dependencies:

      pip install -r requirements.txt

- Run an example (taking mag-bert as an example; more can be seen here):

      sh examples/multi_turn/run_mag_bert_multiturn.sh

  Note: make sure the file paths in the .sh script are set correctly.
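After installing PyTorch, you can verify that the CUDA build was picked up correctly. This is a minimal check using only the standard torch API:

```python
import torch

print(torch.__version__)            # installed PyTorch version
print(torch.version.cuda)           # CUDA toolkit the build was compiled against, e.g. '11.3'
print(torch.cuda.is_available())    # True if a compatible GPU and driver are visible
```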
- Prepare Data

  Create a new directory to store your dataset. You should provide train.tsv, dev.tsv, and test.tsv, and specify the dataset path in the .sh file.
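As a quick sanity check on the new splits, you can load them with pandas. This is a minimal sketch; it assumes the .tsv files are tab-separated with a header row, and the directory name is illustrative.

```python
import pandas as pd

# hypothetical dataset directory; replace with the path you configured in the .sh file
data_dir = 'data/your_dataset'

for split in ['train', 'dev', 'test']:
    df = pd.read_csv(f'{data_dir}/{split}.tsv', sep='\t')
    print(split, df.shape, list(df.columns))
```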
- Dataloader Setting

  Add the new dataset name to the benchmarks list in data, and define the intent_labels, max_seq_lengths, ood_data, and other information for the new dataset. For example:
'MIntRec': {
    'intent_labels': [
        # list the intent labels of the new dataset here
    ],
    'max_seq_lengths': {
        'text': 30,
        'video': 230,
        'audio': 480,
    },
    'ood_data': {
        'MIntRec-OOD': {'ood_label': 'UNK'}
    }
},
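These fields are what the dataloader looks up for the dataset. A minimal illustration of the lookups, assuming the entry above lives in the benchmarks dictionary mentioned in this step:

```python
# assuming `benchmarks` is the dictionary in data that holds the entry above
max_text_len = benchmarks['MIntRec']['max_seq_lengths']['text']               # 30
ood_label = benchmarks['MIntRec']['ood_data']['MIntRec-OOD']['ood_label']     # 'UNK'
num_intents = len(benchmarks['MIntRec']['intent_labels'])
```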
- Feature Data

  To prepare features for video and audio, define the feature files in features_config and place the corresponding files under data_path/video_data/ and data_path/audio_data/. For example:

video_feats_path = {
    'swin-roi': 'swin_roi.pkl',        # alternative: 'swin_roi_binary.pkl'
    'resnet-50': 'video_feats.pkl',
    'swin-full': 'swin_feats.pkl',     # used for TCL-MAP, IEMOCAP, and MELD-DA
}
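The exact contents of these .pkl files depend on your feature extractor. Below is a minimal sketch of writing and reading one such file, assuming it stores a mapping from utterance id to a feature array; the key format and feature dimension shown here are hypothetical.

```python
import pickle
import numpy as np

# hypothetical example: one feature matrix (frames x dim) per utterance id
video_feats = {
    'dia1_utt1': np.zeros((230, 256), dtype=np.float32),
}

with open('data_path/video_data/swin_roi.pkl', 'wb') as f:
    pickle.dump(video_feats, f)

with open('data_path/video_data/swin_roi.pkl', 'rb') as f:
    loaded = pickle.load(f)
print(len(loaded), loaded['dia1_utt1'].shape)
```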
- Provide a new backbone in backbones: create a new model file and register it in the backbone map. For example:

from .FeatureNets import BERTEncoder, RoBERTaEncoder

text_backbones_map = {
    'bert-base-uncased': BERTEncoder,
}
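If you add a brand-new text backbone, the wrapper class could look roughly like the sketch below. This is illustrative only: the class name, constructor signature, and the 'my-text-encoder' key are hypothetical and not part of the repo; the existing encoders in FeatureNets define the actual interface.

```python
import torch.nn as nn
from transformers import AutoModel

class MyTextEncoder(nn.Module):
    """Hypothetical text backbone in the style of the FeatureNets encoders."""

    def __init__(self, pretrained_path):
        super().__init__()
        # load a HuggingFace-format checkpoint from the configured path
        self.encoder = AutoModel.from_pretrained(pretrained_path)

    def forward(self, input_ids, attention_mask):
        outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        return outputs.last_hidden_state   # (batch, seq_len, hidden_dim)

# register it alongside the existing backbones (hypothetical key)
# text_backbones_map['my-text-encoder'] = MyTextEncoder
```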
- Configure the new backbone in configs. For example:

pretrained_models_path = {
    'bert-base-uncased': '/home/sharing/disk1/pretrained_embedding/bert/uncased_L-12_H-768_A-12/',
    'bert-large-uncased': '/home/sharing/disk1/pretrained_embedding/bert/bert-large-uncased',
}
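These paths are expected to point at locally stored checkpoints. As a minimal check that a configured path loads, assuming the directory contains a standard HuggingFace-format BERT checkpoint:

```python
from transformers import BertModel, BertTokenizer

path = pretrained_models_path['bert-base-uncased']
tokenizer = BertTokenizer.from_pretrained(path)
model = BertModel.from_pretrained(path)
print(model.config.hidden_size)   # 768 for bert-base-uncased
```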
- Add the new model in backbones: create a new model file and import the model class. For example:

from .mag_bert import MAG_BERT
- Configure the parameters for the new method in configs, for example mag_bert_config.
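For illustration only, such a config typically groups the method's training hyperparameters. The parameter names and values below are hypothetical; the actual mag_bert_config in the repo defines its own set.

```python
# hypothetical hyperparameter settings for a new method config
hyperparams = {
    'num_train_epochs': 100,
    'train_batch_size': 16,
    'eval_batch_size': 8,
    'lr': 2e-5,           # learning rate
    'weight_decay': 0.1,
}
```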
- Add the new method in method and create a new manager file, for example mag_bert. You need to define the optimizer, loss function, and the training and testing procedures, and register the manager in method_map. For example (an illustrative manager skeleton follows the snippet below):
from .MAG_BERT.manager import MAG_BERT
from .TEXT.manager import TEXT
from .MULT.manager import MULT

method_map = {
    'mag_bert': MAG_BERT,
    'text': TEXT,
    'mult': MULT,
}
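An illustrative skeleton of such a manager is sketched below. It is not the repo's actual interface: the class name, constructor arguments, and the assumed attributes of args, data, and model are hypothetical; it only shows the pieces this step asks for (optimizer, loss function, and train/test loops).

```python
import torch
from torch import nn

class NEW_METHOD_MANAGER:
    """Hypothetical manager wiring up the optimizer, loss, and train/test loops."""

    def __init__(self, args, data, model):
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        self.model = model.to(self.device)
        self.optimizer = torch.optim.AdamW(self.model.parameters(), lr=args.lr)
        self.criterion = nn.CrossEntropyLoss()
        # assumed attributes on the data object
        self.train_dataloader = data.train_dataloader
        self.test_dataloader = data.test_dataloader

    def _train(self, args):
        self.model.train()
        for epoch in range(args.num_train_epochs):
            for features, labels in self.train_dataloader:   # assumed batch structure
                features = features.to(self.device)
                labels = labels.to(self.device)
                self.optimizer.zero_grad()
                logits = self.model(features)                # assumed forward signature
                loss = self.criterion(logits, labels)
                loss.backward()
                self.optimizer.step()

    def _test(self, args):
        self.model.eval()
        correct, total = 0, 0
        with torch.no_grad():
            for features, labels in self.test_dataloader:
                features = features.to(self.device)
                labels = labels.to(self.device)
                logits = self.model(features)
                correct += (logits.argmax(dim=-1) == labels).sum().item()
                total += labels.size(0)
        return correct / max(total, 1)

# register the new manager (hypothetical key)
# method_map['new_method'] = NEW_METHOD_MANAGER
```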
If this work is helpful, or if you use the code or results in this repo, please cite the following papers:
- MIntRec2.0: A Large-scale Benchmark Dataset for Multimodal Intent Recognition and Out-of-scope Detection in Conversations
- MIntRec: A New Dataset for Multimodal Intent Recognition
- Unsupervised Multimodal Clustering for Semantics Discovery in Multimodal Utterances
- Token-Level Contrastive Learning with Modality-Aware Prompting for Multimodal Intent Recognition
@inproceedings{MIntRec2.0,
title={{MI}ntRec2.0: A Large-scale Benchmark Dataset for Multimodal Intent Recognition and Out-of-scope Detection in Conversations},
author={Zhang, Hanlei and Wang, Xin and Xu, Hua and Zhou, Qianrui and Su, Jianhua and Zhao, Jinyue and Li, Wenrui and Chen, Yanting and Gao, Kai},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=nY9nITZQjc}
}
@inproceedings{MIntRec,
author = {Zhang, Hanlei and Xu, Hua and Wang, Xin and Zhou, Qianrui and Zhao, Shaojie and Teng, Jiayan},
title = {MIntRec: A New Dataset for Multimodal Intent Recognition},
year = {2022},
booktitle = {Proceedings of the 30th ACM International Conference on Multimedia},
pages = {1688--1697},
}
@inproceedings{UMC,
title = {Unsupervised Multimodal Clustering for Semantics Discovery in Multimodal Utterances},
author = {Zhang, Hanlei and Xu, Hua and Long, Fei and Wang, Xin and Gao, Kai},
booktitle = {Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
year = {2024},
url = {https://aclanthology.org/2024.acl-long.2},
doi = {10.18653/v1/2024.acl-long.2},
pages = {18--35},
}
@inproceedings{TCL-MAP,
title={Token-level contrastive learning with modality-aware prompting for multimodal intent recognition},
author={Zhou, Qianrui and Xu, Hua and Li, Hao and Zhang, Hanlei and Zhang, Xiaohan and Wang, Yifan and Gao, Kai},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={38},
number={15},
pages={17114--17122},
year={2024}
}