An implementation of a Transformer-based language model for sentence rewriting tasks such as summarization, text simplification, paraphrase generation, style transfer, and grammatical error correction. The figure below shows an overview of the architecture. The model receives the original sentence and the simplified sentence joined by the special delimiter token `<SEP>`, and then generates the target sentence. The architecture is very simple, but it has shown strong results on text summarization and text simplification tasks.
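As a rough sketch of the input format the model consumes (the function name and preprocessing here are illustrative assumptions, not the repository's actual code):

```python
# Minimal sketch: join the original sentence and the target sentence into a
# single token sequence, separated by the special <SEP> delimiter token.
# The exact special tokens and preprocessing in this repository may differ.

def build_lm_input(source_tokens, target_tokens):
    """Concatenate source and target into one language-model input sequence."""
    return source_tokens + ["<SEP>"] + target_tokens

src = "he go to school every day .".split()
tgt = "he goes to school every day .".split()
print(build_lm_input(src, tgt))
# ['he', 'go', 'to', 'school', 'every', 'day', '.', '<SEP>',
#  'he', 'goes', 'to', 'school', 'every', 'day', '.']
```

The model is trained with a standard language-modeling objective over this joint sequence, so generating a rewrite amounts to continuing the sequence after `<SEP>`.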
This code depends on the following:
- python==3.6.5
- pytorch==1.1.0
- torchtext==0.3.1
Clone the repository and install the requirements:

```sh
git clone https://github.com/t080/pytorch-translm.git
cd ./pytorch-translm
pip install -r requirements.txt
```
The dataset for pre-training must be a text file, with each sentence segmented into words by whitespace. If you want to use a GPU, set the `--gpu` option.
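For illustration, a pre-training file could look like this (made-up sentences; one whitespace-tokenized sentence per line):

```
the cat sat on the mat .
a quick brown fox jumps over the lazy dog .
```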
```sh
python train.py pretrain \
    --train ./path/to/train.txt \
    --savedir ./checkpoints/pre-trained \
    --gpu
```
The dataset for fine-tuning must be in TSV format, with the source and target sentences segmented into words by whitespace. If you want to use a GPU, set the `--gpu` option.
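For illustration, a fine-tuning file could look like this (made-up source/target pairs; the two columns are separated by a single tab character):

```
he go to school every day .	he goes to school every day .
she have two cat .	she has two cats .
```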
```sh
python train.py finetune \
    --model ./checkpoints/pre-trained/checkpoint_best.pt \
    --train ./path/to/train.tsv \
    --valid ./path/to/valid.tsv \
    --savedir ./checkpoints/fine-tuned \
    --gpu
```
In the translation step, you must set the `--model` and `--input` options. You can limit the length of the model's output with the `--maxlen` option (default: 100 tokens); a schematic sketch of such a length-capped decoding loop follows the command below.
```sh
python generate.py \
    --model ./checkpoints/fine-tuned/checkpoint_best.pt \
    --input ./path/to/test.txt \
    --gpu
```
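Conceptually, `--maxlen` bounds the generation loop roughly as sketched here (a schematic, not the repository's actual decoding code; `step_fn`, `bos_id`, and `eos_id` are hypothetical names):

```python
# Schematic greedy decoding with a hard length cap, which is what a
# --maxlen option typically controls.  The step-function API is hypothetical.

def greedy_decode(step_fn, bos_id, eos_id, maxlen=100):
    """Generate token ids one at a time until EOS or the length cap is hit."""
    output = [bos_id]
    for _ in range(maxlen):
        next_id = step_fn(output)  # hypothetical: returns the most likely next token id
        output.append(next_id)
        if next_id == eos_id:
            break
    return output[1:]  # drop the BOS token

# Toy usage with a dummy step function that immediately emits EOS (id 2).
print(greedy_decode(lambda prefix: 2, bos_id=1, eos_id=2))  # [2]
```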
- Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser. "Sample Efficient Text Summarization Using a Single Pre-Trained Transformer." arXiv preprint arXiv:1905.08836 (2019).
- Andrew Hoang, Antoine Bosselut, Asli Celikyilmaz, Yejin Choi. "Efficient Adaptation of Pretrained Transformers for Abstractive Summarization." arXiv preprint arXiv:1906.00138 (2019).