For installation and documentation, please refer to release 0.6 of pytorch_pretrained_bert.

This fork adds a Jupyter notebook with the attention analysis of the 12-layer BERT model. For details, please refer to the paper (citation below).

Note that the source code of pytorch_pretrained_bert was modified to extract the attention weights (this functionality was only added in later releases of the forked repo).
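The fork patches the 0.6 sources internally, but the same tensors are exposed directly by the modern `transformers` successor of `pytorch_pretrained_bert`. A minimal sketch of what the extraction yields (the model name and example sentence are placeholders, and this is not the fork's actual code):

```python
import torch
from transformers import BertModel, BertTokenizer

# The modern `transformers` API can return attention weights directly;
# the fork instead patches the pytorch_pretrained_bert 0.6 sources to
# expose the same tensors.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

inputs = tokenizer("Attention is all you need.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer, each of shape
# (batch_size, num_heads, seq_len, seq_len); for BERT-base: 12 layers, 12 heads.
attentions = torch.stack(outputs.attentions)  # (12, batch, 12, seq_len, seq_len)
print(attentions.shape)
```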
- Install the requirements:

  ```
  pip install -r requirements.txt
  ```
- The current implementation assumes that you have the GLUE datasets downloaded and the weights of a fine-tuned BERT model saved to a directory of your choice. You can download the GLUE data as described here. To fine-tune BERT, run the fine-tuning script (an example invocation is sketched after this list).
- The code for the analysis is contained in the Jupyter notebook (an illustrative snippet of this kind of analysis follows the list).
- To reproduce the experimental results, make sure to change `path_to_model` and `path_to_data` in the notebook (see the sketch after this list).
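For reference, a typical fine-tuning invocation, assuming the elided command above refers to the standard GLUE example script `examples/run_classifier.py` shipped with release 0.6 (the task name and all paths below are placeholders):

```
# Assumes the standard GLUE example script from pytorch_pretrained_bert 0.6.
export GLUE_DIR=/path/to/glue

python examples/run_classifier.py \
  --task_name MRPC \
  --do_train \
  --do_eval \
  --do_lower_case \
  --data_dir $GLUE_DIR/MRPC/ \
  --bert_model bert-base-uncased \
  --max_seq_length 128 \
  --train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3.0 \
  --output_dir /path/to/finetuned_bert/
```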
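To give a flavor of the analysis (an illustrative sketch only, not the notebook's code; it continues from the extraction sketch above, reusing `tokenizer`, `inputs`, and `attentions`):

```python
import matplotlib.pyplot as plt

# Plot one head's self-attention map as a token-by-token heatmap
# (illustrative only; the notebook contains the actual analysis code).
layer, head = 0, 0
attn = attentions[layer, 0, head].numpy()  # (seq_len, seq_len)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())

plt.imshow(attn, cmap="viridis")
plt.xticks(range(len(tokens)), tokens, rotation=90)
plt.yticks(range(len(tokens)), tokens)
plt.xlabel("attended-to token")
plt.ylabel("attending token")
plt.title(f"Layer {layer}, head {head}")
plt.colorbar()
plt.tight_layout()
plt.show()
```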
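Concretely, the two variables to edit look like this (the variable names are the ones used in the notebook; both paths are placeholders):

```python
# Placeholders -- point these at your own directories.
path_to_model = "/path/to/finetuned_bert/"  # directory with the fine-tuned BERT weights
path_to_data = "/path/to/glue_data/"        # directory with the downloaded GLUE data
```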
To cite the paper:

```
@article{kovaleva2019revealing,
  title={Revealing the Dark Secrets of BERT},
  author={Kovaleva, Olga and Romanov, Alexey and Rogers, Anna and Rumshisky, Anna},
  journal={arXiv preprint arXiv:1908.08593},
  year={2019}
}
```