forked from microsoft/KEAR

Official code for achieving human parity on CommonsenseQA with External Attention


shuohangwang/KEAR

 
 


Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention

This PyTorch package implements the KEAR model, which surpasses human performance on the CommonsenseQA benchmark, as described in:

Yichong Xu, Chenguang Zhu, Shuohang Wang, Siqi Sun, Hao Cheng, Xiaodong Liu, Jianfeng Gao, Pengcheng He, Michael Zeng and Xuedong Huang
Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention
arXiv:2112.03254, 2021

The package also includes code for our earlier DEKCOR model, as described in:

Yichong Xu∗, Chenguang Zhu∗, Ruochen Xu, Yang Liu, Michael Zeng and Xuedong Huang
Fusing Context Into Knowledge Graph for Commonsense Question Answering
Findings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL), 2021

Please cite the above papers if you use this code.

Results

This package achieves state-of-the-art performance of 86.1% (single model) and 89.4% (ensemble) on the CommonsenseQA leaderboard. The ensemble result surpasses the human performance of 88.9%.

Quickstart

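  1. Pull the Docker image:
    > docker pull yichongx/csqa:human_parity

  2. Run the container and change to the repository directory:
    > nvidia-docker run -it --mount src='/',target=/workspace/,type=bind yichongx/csqa:human_parity /bin/bash
    > cd /workspace/path/to/repo
    The --mount option binds the host root directory to /workspace/, so path/to/repo is the repository's path on the host.
    If you are new to Docker, see https://docs.docker.com/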

Features

Our code supports flexible training of various models on multiple choice QA.

  • Distributed training with PyTorch native DDP or DeepSpeed: see bash/task_train.sh
  • Pause and resume training at any step with the option --continue_train
  • Use any transformer encoder, including ELECTRA, DeBERTa, and ALBERT
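
The pause-and-resume behaviour can be pictured with a small sketch like the one below. It is not the repository's implementation: the JSON checkpoint layout, the `train` helper, and the toy training state are assumptions made for illustration. Only the `--continue_train` flag name comes from the feature list above.

```python
import argparse
import json
import os
import tempfile


def train(ckpt_path, total_steps, continue_train=False, stop_at=None):
    """Toy training loop that saves its state every step and can resume from it."""
    state = {"step": 0, "loss_sum": 0.0}
    if continue_train and os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            state = json.load(f)  # restore the step counter and accumulated state
    while state["step"] < total_steps:
        if stop_at is not None and state["step"] >= stop_at:
            break  # simulate an interruption
        state["step"] += 1
        state["loss_sum"] += 1.0 / state["step"]  # stand-in for a real parameter update
        with open(ckpt_path, "w") as f:
            json.dump(state, f)  # checkpoint after every step
    return state


parser = argparse.ArgumentParser()
parser.add_argument("--continue_train", action="store_true")
# An explicit argument list keeps the sketch independent of sys.argv.
args = parser.parse_args(["--continue_train"])

ckpt = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
paused = train(ckpt, total_steps=10, stop_at=4)  # interrupted after step 4
resumed = train(ckpt, total_steps=10, continue_train=args.continue_train)

# Reference run without any interruption, using a separate checkpoint file.
reference = train(os.path.join(tempfile.mkdtemp(), "reference.json"), total_steps=10)
```

In this sketch the resumed run ends in the same state as the uninterrupted one because everything the loop depends on is stored in the checkpoint. A real training job would typically also need to save the optimizer, learning-rate scheduler, and random-number-generator state to get the same property.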

Preprocessing data

Pre-processed data is located at data/.

We release the code for knowledge graph and dictionary external attention in preprocess/.

  1. Download the data:
    > cd preprocess
    > bash download_data.sh
  2. Add ConceptNet triples and Wiktionary definitions to the data:
    > python add_knowledge.py
  3. Add the most frequent relations in each question as side information:
    > python add_freq_rel.py
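
As a rough illustration of what steps 2 and 3 produce, the sketch below attaches a ConceptNet-style triple and a dictionary definition to each answer choice, then finds the most frequent relation among the choices. The field names, the toy lookup tables, the toy question, and the `[SEP]` joining format are all assumptions for illustration; they are not the actual data format written by add_knowledge.py or add_freq_rel.py.

```python
from collections import Counter

# Hypothetical toy knowledge sources; the real scripts read ConceptNet and Wiktionary data.
TRIPLES = {
    ("revolving door", "bank"): ("revolving door", "AtLocation", "bank"),
    ("revolving door", "library"): ("revolving door", "AtLocation", "library"),
    ("revolving door", "mall"): ("revolving door", "RelatedTo", "mall"),
}
DEFINITIONS = {
    "bank": "an institution where one can deposit and borrow money",
    "library": "an institution which holds books for use or borrowing",
    "mall": "a large enclosed shopping centre",
}


def add_knowledge(example):
    """Attach a triple and a definition to every choice (missing entries become None)."""
    concept = example["question_concept"]
    enriched = []
    for choice in example["choices"]:
        enriched.append({
            "text": choice,
            "triple": TRIPLES.get((concept, choice)),
            "definition": DEFINITIONS.get(choice),
        })
    return {**example, "choices": enriched}


def most_frequent_relation(example):
    """Return the relation that occurs most often among the choices' triples, or None."""
    relations = [c["triple"][1] for c in example["choices"] if c["triple"]]
    if not relations:
        return None
    return Counter(relations).most_common(1)[0][0]


def build_input(question, choice, sep=" [SEP] "):
    """Concatenate question, choice, and retrieved knowledge into one text sequence."""
    parts = [question, choice["text"]]
    if choice["triple"]:
        parts.append(" ".join(choice["triple"]))
    if choice["definition"]:
        parts.append(f'{choice["text"]}: {choice["definition"]}')
    return sep.join(parts)


# Toy example, not taken from the released data files.
example = {
    "question": "Where would a revolving door be used as a security measure?",
    "question_concept": "revolving door",
    "choices": ["bank", "library", "mall"],
}
enriched = add_knowledge(example)
freq_rel = most_frequent_relation(enriched)
inputs = [build_input(enriched["question"], c) for c in enriched["choices"]]
```

Each string in `inputs` could then be tokenized and scored by the encoder, one string per answer choice, so the retrieved knowledge is visible to self-attention alongside the question.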

Training and Prediction

  1. Train a model:
    > bash bash/task_train.sh
  2. Make predictions:
    > bash bash/task_predict.sh
    See task.py for available options.
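
For orientation, multiple-choice prediction generally reduces to scoring each (question, choice) input and picking the highest-scoring choice. The sketch below shows only that final step in plain Python. The scores are made-up numbers, and `softmax` and `predict` are hypothetical helpers, not functions from task.py.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of per-choice scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores, labels):
    """Return the label of the highest-scoring choice and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Made-up scores for the five answer choices of one CommonsenseQA question.
labels = ["A", "B", "C", "D", "E"]
scores = [2.1, -0.3, 0.4, 1.2, -1.0]
answer, confidence = predict(scores, labels)
print(answer, round(confidence, 3))
```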

Running codes for DEKCOR

The current code is mostly compatible with DEKCOR. To run the original DEKCOR code, check out the tag DEKCOR, which holds the previous version.

by Yichong Xu
[email protected]

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.
