Doc2Doc_NMT

This is the repository for the paper: Rethinking Document-level Neural Machine Translation (Findings of ACL 2022)

Other previously used titles:

- Capturing Longer Context for Document-level Neural Machine Translation: A Multi-resolutional Approach

- An Empirical Study of Document-to-document Neural Machine Translation

Training sets

The training sets can be downloaded from here.

Test sets

The test sets are provided in two forms with the same content: split by sentence (sent/del) and grouped by document (doc). The labeled tokens and their positions are in testsets/doc/en.candidates.

Calculate TCP

As mentioned in the paper, we provide a Python script for calculating TCP:

python3 tcp.py --hypotheses_dir your_hypothesis_or_rootpath

This is equivalent to:

python3 tcp.py --reference ./testsets/doc/en.tok --candidates ./testsets/doc/en.candidates --hypotheses_dir your_hypothesis_or_rootpath
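
For scripted evaluation over several hypothesis directories, the call above can be wrapped in a small helper. The following is only a sketch: `build_tcp_command` and `run_tcp` are hypothetical names that are not part of this repository, the default paths are copied from the command above, and nothing is assumed about what tcp.py prints.

```python
import subprocess
import sys

# Default paths copied from the explicit command shown above.
DEFAULT_REFERENCE = "./testsets/doc/en.tok"
DEFAULT_CANDIDATES = "./testsets/doc/en.candidates"


def build_tcp_command(hypotheses_dir, reference=DEFAULT_REFERENCE,
                      candidates=DEFAULT_CANDIDATES, script="tcp.py"):
    """Return the argument list for invoking tcp.py with explicit paths.

    Hypothetical helper; only the three flags shown above are used.
    """
    return [
        sys.executable, script,
        "--reference", reference,
        "--candidates", candidates,
        "--hypotheses_dir", hypotheses_dir,
    ]


def run_tcp(hypotheses_dir, **kwargs):
    """Run tcp.py and return its standard output as text.

    The output format is not assumed here; the caller decides how to read it.
    """
    cmd = build_tcp_command(hypotheses_dir, **kwargs)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `run_tcp("your_hypothesis_or_rootpath")` from the repository root would then correspond to the first, shorter command.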

Citation

If you use our data or evaluation scripts, please cite:

@inproceedings{sun2020rethinking,
  title={Rethinking Document-level Neural Machine Translation},
  author={Zewei Sun and Mingxuan Wang and Hao Zhou and Chengqi Zhao and Shujian Huang and Jiajun Chen and Lei Li},
  booktitle={Findings of the Association for Computational Linguistics: ACL 2022},
  year={2022},
}
