MASS Configuration

Arguments

MASS adds four arguments:

-ms_steps 'en,fr' MASS prediction steps; applies to monolingual corpora only
-lambda_ms 1.0 coefficient for the MASS (MS) objective; defaults to 1.0
-word_mass 0.25 ratio of the sentence covered by the masked segment
-min_len 0 remove sentences whose length is <= min_len; used only in the pre-training stage
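
To make -word_mass and -min_len concrete, here is a minimal sketch in Python. It is a simplified illustration and not the repository's implementation: the function names, the `<mask>` symbol, and the rounding rule are assumptions made for this example only.

```python
import random

MASK = "<mask>"  # hypothetical mask symbol, chosen for this sketch


def filter_by_min_len(sentences, min_len=0):
    """Drop sentences whose length is <= min_len (the -min_len rule)."""
    return [s for s in sentences if len(s) > min_len]


def mass_mask(tokens, word_mass=0.25, rng=random):
    """Mask one contiguous segment covering about word_mass of the tokens.

    Returns (encoder_input, target_segment, start_index).
    Assumption: the segment length is round(len * word_mass), at least 1.
    """
    n = len(tokens)
    if n == 0:
        return [], [], 0
    seg_len = max(1, round(n * word_mass))
    start = rng.randint(0, n - seg_len)  # inclusive on both ends
    end = start + seg_len
    encoder_input = tokens[:start] + [MASK] * seg_len + tokens[end:]
    return encoder_input, tokens[start:end], start


if __name__ == "__main__":
    corpus = [[], "a b c d e f g h".split()]
    for sent in filter_by_min_len(corpus, min_len=0):
        enc, tgt, start = mass_mask(sent, word_mass=0.25)
        print(enc, tgt, start)
```

With -word_mass 0.25, an 8-token sentence gets a 2-token masked segment: the encoder sees the sentence with that segment masked, and the decoder is trained to predict the segment.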

For each training stage, you must specify the path to the data you generated yourself.

Use MASS for the pre-training stage

./nmt_pretrained_with_mass.sh (single-gpu)
./nmt_pretrained_with_mass_multigpu.sh (multi-gpu)

Use MASS + MLM (MLM for the encoder) for the pre-training stage

./nmt_pretrained_with_mass+mlm.sh (single-gpu)
./nmt_pretrained_with_mass+mlm_multigpu.sh (multi-gpu)

Use MASS + CLM (CLM for the decoder) for the pre-training stage

./nmt_pretrained_with_mass+clm.sh (single-gpu)
./nmt_pretrained_with_mass+clm_multigpu.sh (multi-gpu)

For the back-translation stage, you must additionally provide the path to the checkpoint from the pre-training stage.

Use back-translation

./nmt_unsupervised_with_bt.sh (single-gpu)
./nmt_unsupervised_with_bt_multigpu.sh (multi-gpu)

Use DAE + back-translation

./nmt_unsupervised_with_bt+dae.sh (single-gpu)
./nmt_unsupervised_with_bt+dae_multigpu.sh (multi-gpu)