This repository is for the Sentence-level QE task of the 2020 Codalab competition. To recreate our results, run the EnglishToChinese notebook; a baseline notebook is provided as well.
The shared task on Quality Estimation aims to examine automatic methods for estimating the quality of machine translation output at run-time, without relying on reference translations. This variant looks at sentence-level prediction, where participating systems are required to score each translated sentence according to a direct assessment (DA) of its quality. The DA score is a number from 0 to 100 assigned by human annotators, where 0 is the lowest possible quality and 100 is a perfect translation. For the task, three annotations were collected per instance, and the label is the average of a z-standardised (per-annotator) version of the raw scores.
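The following is a minimal sketch of the per-annotator z-standardisation and averaging described above. The column names (`annotator`, `segment_id`, `raw_score`) are assumptions for illustration; the shared-task data may be organised differently.

```python
import pandas as pd

def z_standardised_da(df: pd.DataFrame) -> pd.Series:
    """Z-standardise raw DA scores per annotator, then average per segment."""
    df = df.copy()
    # z-score each annotator's raw scores against that annotator's own mean/std
    df["z"] = df.groupby("annotator")["raw_score"].transform(
        lambda s: (s - s.mean()) / s.std()
    )
    # average the (three) z-standardised annotations collected for each segment
    return df.groupby("segment_id")["z"].mean()

# hypothetical example: two segments, each rated by three annotators
labels = z_standardised_da(pd.DataFrame({
    "annotator":  ["a1", "a2", "a3", "a1", "a2", "a3"],
    "segment_id": [0, 0, 0, 1, 1, 1],
    "raw_score":  [80, 75, 90, 20, 35, 30],
}))
```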
Our best model is an RNN with GRU cells (saved as modelGRU), which achieved the following scores on the held-out dataset (a sketch of how these metrics can be computed follows the list):
- Pearson: 0.3001
- MAE: 0.7024
- RMSE: 0.8887
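As a reference for how these three metrics can be computed, here is a minimal NumPy/SciPy sketch; `y_true` and `y_pred` are hypothetical arrays of gold and predicted DA scores, and this is not necessarily the exact evaluation code used by the competition.

```python
import numpy as np
from scipy.stats import pearsonr

def evaluate(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    pearson, _ = pearsonr(y_true, y_pred)            # linear correlation (primary metric)
    mae = np.mean(np.abs(y_true - y_pred))           # mean absolute error
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))  # root-mean-square error
    return {"pearson": pearson, "mae": mae, "rmse": rmse}
```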