Scripts for fine-tuning pretrained language models on custom datasets, e.g.:
- for fine-tuning SciBERT on the COVID-19 Open Research Dataset (CORD-19) using the Hugging Face Transformers library (a fine-tuning sketch follows this list)
- for fine-tuning BERT on the ACL Anthology Reference Corpus
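
As an illustration of what the fine-tuning step looks like, here is a minimal masked-language-modeling sketch using the Transformers `Trainer` API. The checkpoint name `allenai/scibert_scivocab_uncased` is the public SciBERT release; the file path and hyperparameters are placeholders, not necessarily the settings used by the scripts in this repo.

```python
# Minimal sketch: masked-LM fine-tuning of SciBERT with Hugging Face Transformers.
# cord19_abstracts.txt and the hyperparameters below are placeholders.
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    LineByLineTextDataset,
    Trainer,
    TrainingArguments,
)

model_name = "allenai/scibert_scivocab_uncased"  # public SciBERT checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# One document (e.g. one CORD-19 abstract) per line in a plain-text file.
dataset = LineByLineTextDataset(
    tokenizer=tokenizer,
    file_path="cord19_abstracts.txt",  # hypothetical path
    block_size=128,
)

# Randomly mask 15% of tokens: the standard BERT MLM objective.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

training_args = TrainingArguments(
    output_dir="scibert-cord19",
    num_train_epochs=1,
    per_device_train_batch_size=8,
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=collator,
    train_dataset=dataset,
)
trainer.train()
trainer.save_model("scibert-cord19")
```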
Examples are provided for using the models (SciBERT, fine-tuned SciBERT, and the original BERT) for extractive summarization, and GPT-2 for text generation (see the sketch below).
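
A minimal sketch of the two downstream uses, assuming the `bert-extractive-summarizer` package (Derek Miller's library) for summarization and the Transformers `pipeline` API for GPT-2 generation; the input file and prompt are placeholders:

```python
from summarizer import Summarizer
from transformers import pipeline

text = open("paper.txt").read()  # hypothetical input document

# Extractive summarization: BERT embeds each sentence, and clustering selects
# the most representative ones. Summarizer() uses its default BERT model;
# a fine-tuned checkpoint can be supplied via custom_model/custom_tokenizer.
bert_model = Summarizer()
print(bert_model(text, ratio=0.2))  # keep ~20% of the sentences

# Text generation with GPT-2 via the Transformers pipeline API.
generator = pipeline("text-generation", model="gpt2")
print(generator("Recent COVID-19 research shows", max_length=50)[0]["generated_text"])
```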
Copyright for the papers belongs to the ACL. The notebooks adapt originals by Chris Callison-Burch and Derek Miller.