Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention

This PyTorch package implements the KEAR model that surpasses human on the CommonsenseQA benchmark, as described in:

Yichong Xu, Chenguang Zhu, Shuohang Wang, Siqi Sun, Hao Cheng, Xiaodong Liu, Jianfeng Gao, Pengcheng He, Michael Zeng and Xuedong Huang
Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention
arXiv:2112.03254, 2021

The package also includes codes for our earilier DEKCOR model as in:

Yichong Xu∗, Chenguang Zhu∗, Ruochen Xu, Yang Liu, Michael Zeng and Xuedong Huang
Fusing Context Into Knowledge Graph for Commonsense Question Answering
Findings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL), 2021

Please cite the above papers if you use this code.

Results

This package achieves the state-of-art performance of 86.1% (single model), 89.4% (ensemble) on the CommonsenseQA leaderboard, surpassing the human performance of 88.9%.

Quickstart

pull docker:
> docker pull yichongx/csqa:human_parity
run docker
> nvidia-docker run -it --mount src='/',target=/workspace/,type=bind yichongx/csqa:human_parity /bin/bash
> cd /workspace/path/to/repo
Please refer to the following link if you first use docker: https://docs.docker.com/

Features

Our code supports flexible training of various models on multiple choice QA.

Distributed training with Pytorch native DDP or Deepspeed: see bash/task_train.sh
Pause and resume training at any step; use option --continue_train
Use any transformer encoders including ELECTRA, DeBERTa, ALBERT

Preprocessing data

Pre-processed data is located at data/.

We release codes for knowledge graph and dictionary external attention in preprocess/

Download data
> cd preprocess
> bash download_data.sh
Add ConceptNet triples and Wiktionary definitions to data
> python add_knowledge.py
We also add the most frequent relations in each question as a side information.
> python add_freq_rel.py

Training and Prediction

train a model
> bash bash/task_train.sh
make prediction
> bash bash/task_predict.sh See task.py for available options.

Running codes for DEKCOR

The current code is mostly compatible to run DEKCOR. To run the original DEKCOR code, please checkout tag DEKCOR to use the previous version.

by Yichong Xu
[email protected]

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
bash		bash
data		data
ds_configs		ds_configs
model		model
preprocess		preprocess
specific		specific
utils		utils
.gitignore		.gitignore
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
NOTICE.md		NOTICE.md
README.md		README.md
SECURITY.md		SECURITY.md
SUPPORT.md		SUPPORT.md
task.py		task.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention

Results

Quickstart

Features

Preprocessing data

Training and Prediction

Running codes for DEKCOR

Contributing

Trademarks

About

Releases

Packages

Languages

shuohangwang/KEAR

Folders and files

Latest commit

History

Repository files navigation

Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention

Results

Quickstart

Features

Preprocessing data

Training and Prediction

Running codes for DEKCOR

Contributing

Trademarks

About

Resources

Code of conduct

Security policy

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages