SPOClab-ca/cmcl-shared-task

TorontoCL at CMCL 2021 Shared Task

This repository contains the code for the TorontoCL submission to the CMCL 2021 Shared Task on eye-tracking prediction. We fine-tune RoBERTa-base with a custom token-level regression head, and use data from the Provo eye-tracking corpus for task-adaptive pretraining before fine-tuning. Our model ranked 3rd out of 13 teams in the competition.

Team: Bai Li, Frank Rudzicz.

Instructions to run

Run single model

PYTHONPATH=. python scripts/run_roberta.py --mode=submission --num-ensembles=1 --use-provo=True

Run ensemble of 10 models

PYTHONPATH=. python scripts/run_roberta.py --mode=submission --num-ensembles=10 --use-provo=True
PYTHONPATH=. python scripts/ensemble.py
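
For readers unfamiliar with ensembling, the sketch below shows one common way to combine several model runs: averaging the per-token predictions. This is an assumption made for illustration; it is not taken from `scripts/ensemble.py`, and the function name `average_predictions` and the toy data are hypothetical.

```python
from statistics import fmean


def average_predictions(runs):
    """Average per-token feature predictions across ensemble members.

    `runs` is a list of model runs. Each run is a list of tokens, and each
    token is a list of predicted feature values (floats).
    NOTE: hypothetical helper; the repository's ensemble script may differ.
    """
    if not runs:
        raise ValueError("need at least one run")
    n_tokens = len(runs[0])
    if any(len(run) != n_tokens for run in runs):
        raise ValueError("all runs must cover the same tokens")

    averaged = []
    # zip(*runs) groups the predictions that different runs made for one token.
    for token_preds in zip(*runs):
        # zip(*token_preds) groups the values predicted for one feature.
        averaged.append([fmean(values) for values in zip(*token_preds)])
    return averaged


# Toy data: two runs, two tokens, two features per token.
runs = [
    [[1.0, 10.0], [2.0, 20.0]],
    [[3.0, 30.0], [4.0, 40.0]],
]
ensemble = average_predictions(runs)
```

Averaging tends to reduce the variance introduced by different random seeds, which is the usual motivation for training several copies of the same model.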

Notebooks

  • ProvoProcess.ipynb: preprocesses the Provo data into a form similar to the ZuCo training data.
  • MedianBaseline.ipynb: implements median, linear regression, and SVR baselines.
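
As a rough illustration of what a median baseline does, the sketch below predicts the per-feature training median for every token and scores it with mean absolute error. It is not the notebook's code: the function names, the toy data, and the choice of mean absolute error as the metric are all assumptions made for this example.

```python
from statistics import median


def fit_median_baseline(train_targets):
    """Return the per-feature median of the training targets.

    `train_targets` is a list of rows, one per token, each a list of floats.
    NOTE: hypothetical helper, not taken from MedianBaseline.ipynb.
    """
    return [median(column) for column in zip(*train_targets)]


def mean_absolute_error(targets, predictions):
    """Mean absolute error over all tokens and features."""
    errors = [
        abs(t - p)
        for row_t, row_p in zip(targets, predictions)
        for t, p in zip(row_t, row_p)
    ]
    return sum(errors) / len(errors)


# Toy data: three training tokens and two held-out tokens, two features each.
train = [[1.0, 10.0], [2.0, 30.0], [9.0, 20.0]]
held_out = [[2.0, 20.0], [4.0, 24.0]]

medians = fit_median_baseline(train)
# The baseline ignores the input text and predicts the same values everywhere.
predictions = [medians for _ in held_out]
mae = mean_absolute_error(held_out, predictions)
```

A baseline like this gives a floor that any learned model should beat, since it uses no information about the individual tokens.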
