This repository contains code for the paper "It's Morphin' Time! Combating Linguistic Discrimination with Inflectional Perturbations" (to be presented at ACL 2020).
Authors: Samson Tan, Shafiq Joty, Min-Yen Kan, and Richard Socher
UPDATE: A pip-installable library is now available!
To generate adversarial examples for one of the implemented models, run the corresponding run_morpheus_* script.
Morpheus can be implemented for a custom task, dataset, or model by following the structure of the existing classes:
- MorpheusBase implements the methods common to all Morpheus implementations.
- Morpheus<Task> implements the methods common to a specific task/dataset.
- Morpheus<Model><Task> implements the methods specific to a particular model (usually the __init__ and morph methods).
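The three-level layout described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: every class name ending in `Sketch`, the `candidates` and `is_adversarial` helpers, the `predict_fn` interface, and the toy inflection table are assumptions made for this example, and the search in `morph` is a simplified greedy loop rather than the exact procedure from the paper.

```python
from abc import ABC, abstractmethod


class MorpheusBaseSketch(ABC):
    """Stand-in for MorpheusBase: logic shared by every implementation."""

    # Toy inflection table (hypothetical). A real implementation would
    # obtain candidate inflections from a morphological inflection tool.
    TOY_INFLECTIONS = {
        "run": ["run", "runs", "ran", "running"],
        "dog": ["dog", "dogs"],
    }

    def candidates(self, token):
        """Return the inflected forms that may replace `token`."""
        return self.TOY_INFLECTIONS.get(token, [token])

    @abstractmethod
    def morph(self, *args, **kwargs):
        """Search for an adversarial variant; defined per model."""


class MorpheusClassificationSketch(MorpheusBaseSketch):
    """Stand-in for Morpheus<Task>: logic shared by one task/dataset."""

    def is_adversarial(self, predicted, gold):
        # For classification, the attack succeeds when the label changes.
        return predicted != gold


class MorpheusToyModelClassificationSketch(MorpheusClassificationSketch):
    """Stand-in for Morpheus<Model><Task>: model-specific __init__ and morph."""

    def __init__(self, predict_fn):
        # `predict_fn` maps a list of tokens to a label (hypothetical interface).
        self.predict_fn = predict_fn

    def morph(self, tokens, gold):
        # Try each candidate inflection at each position and return the
        # first perturbed sequence that changes the model's prediction.
        tokens = list(tokens)
        for i, original in enumerate(tokens):
            for candidate in self.candidates(original):
                if candidate == original:
                    continue
                tokens[i] = candidate
                if self.is_adversarial(self.predict_fn(tokens), gold):
                    return tokens
            tokens[i] = original  # no luck here; restore and move on
        return tokens  # unchanged if no perturbation fooled the model


def toy_predict(tokens):
    # Hypothetical "model" that keys on a single surface form.
    return "past" if "ran" in tokens else "present"


attack = MorpheusToyModelClassificationSketch(toy_predict)
print(attack.morph(["dogs", "run"], gold="present"))
```

Under this layout, supporting a new model for an existing task would only require a new subclass with its own `__init__` and `morph`, while the task-level and shared logic are inherited.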
Use random_inflect/random_inflect.py to generate adversarial training data. To use the weighted sampling mode, pass in a dictionary of inflection counts; otherwise, a uniform distribution is used. The dictionary should be in the form:
{
"inflection tag": int,
}
E.g.,
{
"VB": 150,
"VBD": 100,
...
}
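To illustrate how such a dictionary could drive the choice between weighted and uniform sampling, here is a minimal sketch. It is not the code in random_inflect/random_inflect.py: the function name `sample_inflection_tag`, its parameters, and the decision to give unlisted tags a weight of zero are all assumptions made for this example.

```python
import random


def sample_inflection_tag(candidate_tags, inflection_counts=None, rng=random):
    """Pick one tag from `candidate_tags`.

    With `inflection_counts`, tags are drawn in proportion to their counts;
    without it, every candidate is equally likely.
    """
    if not inflection_counts:
        return rng.choice(candidate_tags)
    # Tags missing from the dictionary get weight 0 (an assumption of this sketch).
    weights = [inflection_counts.get(tag, 0) for tag in candidate_tags]
    if sum(weights) == 0:
        # No usable counts for these candidates: fall back to uniform sampling.
        return rng.choice(candidate_tags)
    return rng.choices(candidate_tags, weights=weights, k=1)[0]


counts = {"VB": 150, "VBD": 100}
print(sample_inflection_tag(["VB", "VBD"], counts))  # weighted by counts
print(sample_inflection_tag(["VB", "VBD"]))          # uniform
```

With the example counts above, "VB" would be drawn with probability 150/250 = 0.6 and "VBD" with probability 0.4 in the weighted mode.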
- Transformer-big for WMT'14 English-French: Compatible with fairseq
- BERT-base for MNLI: Compatible with transformers
- BERT-base for SQuAD 2: Compatible with transformers
@inproceedings{tan-etal-2020-morphin,
title = "It{'}s Morphin{'} Time! {C}ombating Linguistic Discrimination with Inflectional Perturbations",
author = "Tan, Samson and
Joty, Shafiq and
Kan, Min-Yen and
Socher, Richard",
booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
month = jul,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2020.acl-main.263",
pages = "2920--2935",
}