TEXTOIR is the first high-quality Text Open Intent Recognition platform. This repo contains a convenient toolkit with extensible interfaces, integrating a series of state-of-the-art algorithms for two tasks (open intent detection and open intent discovery). We also release the pipeline framework and the visualized platform in the repo TEXTOIR-DEMO.
TEXTOIR aims to provide a convenient toolkit for researchers to reproduce related methods for open text classification and clustering. It covers two tasks: open intent detection and open intent discovery. Open intent detection aims to identify the n known intent classes while detecting a single open (unknown) intent class. Open intent discovery aims to leverage limited prior knowledge of known intents to find fine-grained clusters of both known and open intents. Related papers and code are collected in our previously released reading list.
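To make the detection task concrete, here is a minimal sketch of confidence thresholding, the general idea behind the MSP baseline listed in this repo. It is not the toolkit's implementation: the function names, the `<OPEN>` label, the intent names, and the threshold value are all hypothetical and chosen only for illustration.

```python
import math

OPEN_LABEL = "<OPEN>"  # hypothetical name for the single open-intent class


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def detect_intent(logits, known_intents, threshold=0.5):
    """Return a known intent, or OPEN_LABEL when the model is not confident.

    `logits` holds one score per known intent (n classes). If the highest
    softmax probability falls below `threshold`, the utterance is assigned
    to the open class, giving an (n + 1)-way decision overall.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return OPEN_LABEL
    return known_intents[best]


# Hypothetical known intents and classifier scores.
known = ["check_balance", "transfer_money", "card_lost"]
print(detect_intent([4.0, 0.5, 0.2], known))  # confident: a known intent
print(detect_intent([1.0, 0.9, 1.1], known))  # flat scores: open intent
```

The methods in this toolkit differ mainly in how they draw this boundary between known and open intents; a fixed probability threshold is only the simplest choice.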
Date | Announcements |
---|---|
12/2023 | 🎆 🎆 New paper and SOTA in Open Intent Discovery. Refer to the directory USNID for the code. Read the paper -- A Clustering Framework for Unsupervised and Semi-supervised New Intent Discovery (Published in IEEE TKDE 2023). |
04/2023 | 🎆 🎆 New paper and SOTA in Open Intent Detection. Refer to the directory DA-ADB for the code. Read the paper -- Learning Discriminative Representations and Decision Boundaries for Open Intent Detection (Published in IEEE/ACM TASLP 2023). |
09/2021 | 🎆 🎆 The first integrated and visualized platform for text Open Intent Recognition TEXTOIR has been released. Refer to the directory TEXTOIR-DEMO for the demo code. Read our paper TEXTOIR: An Integrated and Visualized Platform for Text Open Intent Recognition (Published in ACL 2021). |
05/2021 | New paper and baselines DeepAligned in Open Intent Discovery have been released. Read our paper Discovering New Intents with Deep Aligned Clustering (Published in AAAI 2021). |
05/2021 | New paper and baselines ADB in Open Intent Detection have been released. Read our paper Deep Open Intent Classification with Adaptive Decision Boundary (Published in AAAI 2021). |
05/2020 | New paper and baselines CDAC+ in Open Intent Discovery have been released. Read our paper Discovering New Intents via Constrained Deep Adaptive Clustering with Cluster Refinement (Published in AAAI 2020). |
07/2019 | New paper and baselines DeepUNK in Open Intent Detection have been released. Read our paper Deep Unknown Intent Detection with Margin Loss (Published in ACL 2019). |
We strongly recommend using our TEXTOIR toolkit, which has standard and unified interfaces (especially for data settings), to obtain fair and persuasive results on benchmark intent datasets!
Datasets | Source |
---|---|
BANKING | Paper |
OOS / CLINC150 | Paper |
StackOverflow | Paper |
Model Name | Source | Published |
---|---|---|
OpenMax* | Paper Code | CVPR 2016 |
MSP | Paper Code | ICLR 2017 |
DOC | Paper Code | EMNLP 2017 |
DeepUnk | Paper Code | ACL 2019 |
SEG | Paper Code | ACL 2020 |
ADB | Paper Code | AAAI 2021 |
(K+1)-way | Paper Code | ACL 2021 |
MDF | Paper Code | ACL 2021 |
ARPL* | Paper Code | IEEE TPAMI 2022 |
KNNCL | Paper Code | ACL 2022 |
DA-ADB | Paper Code | IEEE/ACM TASLP 2023 |
Setting | Model Name | Source | Published |
---|---|---|---|
Unsupervised | KM | Paper | BSMSP 1967 |
Unsupervised | AG | Paper | PR 1978 |
Unsupervised | SAE-KM | Paper | JMLR 2010 |
Unsupervised | DEC | Paper Code | ICML 2016 |
Unsupervised | DCN | Paper Code | ICML 2017 |
Unsupervised | CC | Paper Code | AAAI 2021 |
Unsupervised | SCCL | Paper Code | NAACL 2021 |
Unsupervised | USNID | Paper Code | IEEE TKDE 2023 |
Semi-supervised | KCL* | Paper Code | ICLR 2018 |
Semi-supervised | MCL* | Paper Code | ICLR 2019 |
Semi-supervised | DTC* | Paper Code | ICCV 2019 |
Semi-supervised | CDAC+ | Paper Code | AAAI 2020 |
Semi-supervised | DeepAligned | Paper Code | AAAI 2021 |
Semi-supervised | GCD | Paper Code | CVPR 2022 |
Semi-supervised | MTP-CLNN | Paper Code | ACL 2022 |
Semi-supervised | USNID | Paper Code | IEEE TKDE 2023 |
(* denotes a computer vision model whose backbone is replaced with BERT)
- Use Anaconda to create a Python (version >= 3.6) environment
conda create --name textoir python=3.6
conda activate textoir
- Install PyTorch (CUDA version 11.2). Note that the command below pins cudatoolkit=11.0; adjust it to match your CUDA installation.
conda install pytorch torchvision torchaudio cudatoolkit=11.0 -c pytorch -c conda-forge
- Clone the TEXTOIR repository and choose a task (open intent detection is used as the example here).
git clone git@github.com:thuiar/TEXTOIR.git
cd TEXTOIR
cd open_intent_detection
- Install the required dependencies
pip install -r requirements.txt
- Run an example (ADB is used here)
sh examples/run_ADB.sh
- Note that if you cannot download the pre-trained model directly from HuggingFace transformers, you need to download it manually. We provide the pre-trained BERT model at the following link:
Baidu Cloud Drive with code: v8tk
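If you download the pre-trained model manually, a quick check of the local directory can save a failed run. The sketch below assumes the download uses the standard Hugging Face file names for a PyTorch BERT checkpoint (`config.json`, `vocab.txt`, and either `pytorch_model.bin` or `model.safetensors`); the directory path and the helper name are hypothetical.

```python
from pathlib import Path

# Assumption: standard Hugging Face file names for a PyTorch BERT checkpoint.
REQUIRED_FILES = ("config.json", "vocab.txt")
WEIGHT_FILES = ("pytorch_model.bin", "model.safetensors")


def missing_bert_files(model_dir):
    """Return the expected files that are absent from `model_dir`."""
    root = Path(model_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    # Any one of the known weight formats is sufficient.
    if not any((root / name).is_file() for name in WEIGHT_FILES):
        missing.append(" or ".join(WEIGHT_FILES))
    return missing


model_dir = "./bert-base-uncased"  # hypothetical local path to the download
missing = missing_bert_files(model_dir)
print("model directory looks complete" if not missing else f"missing: {missing}")
```

Once the directory is complete, its path can be passed to `from_pretrained` in the transformers library in place of a model name. Where the toolkit reads that path from is not covered here, so check the configuration files in the task directory you are using.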
This toolkit is extensible: new methods, datasets, configurations, backbones, dataloaders, and losses can be added with little effort. More details are available in the tutorials in the open_intent_detection and open_intent_discovery directories.
If this work is helpful, or you want to use the code and results in this repo, please cite the following papers:
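One common way to support this kind of extension is a name-to-class registry. The sketch below is a generic illustration of that pattern and does not reproduce TEXTOIR's actual interfaces; `METHOD_REGISTRY`, `register_method`, `build_method`, and `MyDetector` are all hypothetical names. The tutorials in each task directory describe the real steps.

```python
# Generic registry pattern (hypothetical, not TEXTOIR's actual API).
METHOD_REGISTRY = {}


def register_method(name):
    """Class decorator that records a method class under `name`."""
    def decorator(cls):
        if name in METHOD_REGISTRY:
            raise ValueError(f"method {name!r} is already registered")
        METHOD_REGISTRY[name] = cls
        return cls
    return decorator


@register_method("MyDetector")
class MyDetector:
    """Placeholder for a new method; real training logic would go here."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold


def build_method(name, **kwargs):
    """Look up a registered method by name and instantiate it."""
    try:
        cls = METHOD_REGISTRY[name]
    except KeyError:
        available = sorted(METHOD_REGISTRY)
        raise KeyError(f"unknown method {name!r}; available: {available}") from None
    return cls(**kwargs)


method = build_method("MyDetector", threshold=0.7)
print(type(method).__name__, method.threshold)
```

With a registry like this, a run script only needs the method name from its configuration, so adding a method does not require touching the training entry point.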
- TEXTOIR: An Integrated and Visualized Platform for Text Open Intent Recognition
- Learning Discriminative Representations and Decision Boundaries for Open Intent Detection
- A Clustering Framework for Unsupervised and Semi-supervised New Intent Discovery
@inproceedings{zhang-etal-2021-textoir,
title = "{TEXTOIR}: An Integrated and Visualized Platform for Text Open Intent Recognition",
author = "Zhang, Hanlei and Li, Xiaoteng and Xu, Hua and Zhang, Panpan and Zhao, Kang and Gao, Kai",
booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations",
pages = "167--174",
year = "2021",
url = "https://aclanthology.org/2021.acl-demo.20",
doi = "10.18653/v1/2021.acl-demo.20",
}
@article{DA-ADB,
title = {Learning Discriminative Representations and Decision Boundaries for Open Intent Detection},
author = {Zhang, Hanlei and Xu, Hua and Zhao, Shaojie and Zhou, Qianrui},
journal = {IEEE/ACM Transactions on Audio, Speech, and Language Processing},
volume = {31},
pages = {1611-1623},
year = {2023},
doi = {10.1109/TASLP.2023.3265203}
}
@ARTICLE{USNID,
author={Zhang, Hanlei and Xu, Hua and Wang, Xin and Long, Fei and Gao, Kai},
journal={IEEE Transactions on Knowledge and Data Engineering},
title={A Clustering Framework for Unsupervised and Semi-supervised New Intent Discovery},
year={2023},
doi={10.1109/TKDE.2023.3340732}
}
Hanlei Zhang, Shaojie Zhao, Xin Wang, Ting-En Lin, Qianrui Zhou, Huisheng Mao.
If you have any questions, please open an issue and describe your problem in as much detail as possible. If you want to integrate your method into our repo, feel free to open a pull request!