TorontoCL at CMCL 2021 Shared Task

This repository contains the code for the TorontoCL submission to the CMCL 2021 Shared Task on eye-tracking prediction. We fine-tune RoBERTa-base with a custom token-level regression head and use data from the Provo eye-tracking corpus for task-adaptive pretraining before fine-tuning. Our model ranked 3rd out of 13 teams in the competition.
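
As a rough illustration of what a token-level regression head computes, the sketch below applies one linear layer to each token's hidden vector and returns one row of predicted features per token. It is framework-free Python, not the code in this repository: the function name, the toy 3-dimensional hidden states, and the two output features are assumptions made for the example (RoBERTa-base hidden states are 768-dimensional).

```python
from typing import List

Vector = List[float]


def token_regression_head(hidden_states: List[Vector],
                          weights: List[Vector],
                          bias: Vector) -> List[Vector]:
    """Apply the same linear map to every token vector.

    hidden_states: one vector per token (length = hidden size).
    weights: one row per output feature (each row has length = hidden size).
    bias: one value per output feature.
    Returns one list of predicted features per token.
    """
    predictions = []
    for h in hidden_states:
        # Dot product of the token vector with each feature's weight row.
        row = [sum(w_i * h_i for w_i, h_i in zip(w, h)) + b
               for w, b in zip(weights, bias)]
        predictions.append(row)
    return predictions


# Toy example (hypothetical values): 2 tokens, hidden size 3, 2 output features.
hidden = [[1.0, 0.0, 2.0], [0.5, 0.5, 0.5]]
W = [[1.0, 1.0, 1.0], [0.0, 2.0, 0.0]]
b = [0.0, 1.0]
print(token_regression_head(hidden, W, b))  # [[3.0, 1.0], [1.5, 2.0]]
```

In the actual model the hidden vectors would come from the fine-tuned encoder, and the weights would be learned jointly with it.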

Team: Bai Li, Frank Rudzicz.

Instructions to run

Run a single model

PYTHONPATH=. python scripts/run_roberta.py --mode=submission --num-ensembles=1 --use-provo=True

Run an ensemble of 10 models

PYTHONPATH=. python scripts/run_roberta.py --mode=submission --num-ensembles=10 --use-provo=True
PYTHONPATH=. python scripts/ensemble.py
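
This README does not describe how scripts/ensemble.py combines the 10 runs. A common choice is to average the per-token predictions across models, and the sketch below illustrates that approach under that assumption. The function `average_predictions` and the toy numbers are hypothetical and not taken from the repository.

```python
from statistics import mean
from typing import List


def average_predictions(model_outputs: List[List[List[float]]]) -> List[List[float]]:
    """Average predictions element-wise across models.

    model_outputs[m][t][f] is model m's prediction for feature f of token t.
    All models must predict the same tokens and features.
    """
    n_tokens = len(model_outputs[0])
    n_features = len(model_outputs[0][0])
    return [
        [mean(out[t][f] for out in model_outputs) for f in range(n_features)]
        for t in range(n_tokens)
    ]


# Toy example: 2 models, 2 tokens, 2 features (made-up values).
run_a = [[1.0, 2.0], [3.0, 4.0]]
run_b = [[3.0, 2.0], [5.0, 8.0]]
print(average_predictions([run_a, run_b]))  # [[2.0, 2.0], [4.0, 6.0]]
```

Averaging tends to reduce the variance that comes from different random seeds, which is the usual motivation for training several copies of the same model.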

Notebooks

ProvoProcess.ipynb: preprocesses the Provo data into a format similar to the ZuCo training data.
MedianBaseline.ipynb: implements median, linear regression, and SVR baselines.
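
For orientation, a median baseline can be sketched as follows: predict, for every token, the median of each target feature in the training data, then score the predictions with mean absolute error. This is a minimal illustration with made-up numbers, not the notebook's code; the function names and the choice of MAE as the metric here are assumptions.

```python
from statistics import mean, median
from typing import List


def fit_median_baseline(train_targets: List[List[float]]) -> List[float]:
    """Return the per-feature median over all training tokens."""
    n_features = len(train_targets[0])
    return [median(row[f] for row in train_targets) for f in range(n_features)]


def mean_absolute_error(targets: List[List[float]], constant: List[float]) -> float:
    """MAE when the same constant vector is predicted for every token."""
    return mean(
        abs(value - constant[f])
        for row in targets
        for f, value in enumerate(row)
    )


# Toy data (hypothetical): each row holds two target features for one token.
train = [[1.0, 10.0], [2.0, 20.0], [6.0, 30.0]]
held_out = [[2.0, 18.0], [4.0, 20.0]]

medians = fit_median_baseline(train)
print(medians)                                 # [2.0, 20.0]
print(mean_absolute_error(held_out, medians))  # 1.0
```

A baseline like this ignores the input text entirely, so it gives a floor that any text-aware model should beat.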

SPOClab-ca/cmcl-shared-task

Folders and files

Latest commit

History

Repository files navigation

TorontoCL at CMCL 2021 Shared Task

Instructions to run

Notebooks

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages