Marian 1.4.0

emjotde released this 13 Mar 23:30

[1.4.0] - 2018-03-13

Added

Data weighting with --data-weighting at sentence or word level
Persistent SQLite3 corpus storage with --sqlite file.db
Experimental multi-node asynchronous training
Restoring optimizer and training parameters such as learning rate, validation results, etc.
Experimental multi-CPU training/translation/scoring with --cpu-threads=N
Restoring corpus iteration after training is restarted
N-best-list scoring in marian-scorer
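
To illustrate the difference between sentence-level and word-level data weighting, here is a minimal Python sketch of a weighted negative log-likelihood for one sentence. It is not Marian's implementation; the function name `weighted_nll` and its arguments are hypothetical and chosen only for illustration.

```python
import math

def weighted_nll(token_probs, weights):
    """Weighted negative log-likelihood of one sentence (illustrative only).

    token_probs: probability the model assigns to each reference token.
    weights: one float for the whole sentence (sentence level)
             or one float per token (word level).
    """
    # A single number is broadcast to every token of the sentence.
    if isinstance(weights, (int, float)):
        weights = [float(weights)] * len(token_probs)
    if len(weights) != len(token_probs):
        raise ValueError("need exactly one weight per token")
    # Each token's loss is scaled by its weight before summing.
    return sum(-w * math.log(p) for w, p in zip(weights, token_probs))
```

With a sentence-level weight, every token's loss is scaled by the same factor; with word-level weights, individual tokens can be emphasized or switched off (weight 0).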

Fixed

Deterministic data shuffling with specific seed for SQLite3 corpus storage
Mini-batch fitting with binary search for faster fitting
Better batch packing due to sorting
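
The idea behind fitting a mini-batch with binary search can be sketched as follows. This is a generic Python illustration, not Marian's code: `fits` is a hypothetical predicate that reports whether a batch of a given size fits into the available memory, and the sketch assumes that if a size fits, every smaller size fits as well.

```python
def largest_fitting_batch(fits, lo=1, hi=4096):
    """Return the largest n in [lo, hi] for which fits(n) is True, or 0 if none.

    Assumes fits is monotonic: once a size does not fit, no larger size fits.
    """
    best = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        if fits(mid):
            best = mid      # mid fits; try something larger
            lo = mid + 1
        else:
            hi = mid - 1    # mid does not fit; try something smaller
    return best
```

Compared with growing the batch one step at a time, this needs only about log2(hi - lo + 1) probes of `fits`.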

[1.3.1] - 2018-02-04

Fixed

Missing final validation when done with training
Differing summaries for marian-scorer when used with multiple GPUs

[1.3.0] - 2018-01-24

Added

SQLite3 based corpus storage for on-disk shuffling etc. with --sqlite
Asynchronous maxi-batch preloading
Using transpose in SGEMM to tie embeddings in output layer
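
Tying embeddings in the output layer means reusing the embedding matrix to compute output scores instead of learning a separate output matrix. A minimal pure-Python sketch of that computation, under the assumption that the embedding matrix is stored with one row per vocabulary entry (the name `tied_output_logits` is hypothetical, and no GPU or BLAS call is involved here):

```python
def tied_output_logits(embeddings, hidden):
    """Compute output logits from a shared embedding matrix (illustrative only).

    embeddings: V rows of d floats, the embedding matrix E.
    hidden: d floats, the decoder state h.
    Returns V logits, equivalent to multiplying h by the transpose of E.
    """
    # One dot product per vocabulary row: logit[v] = sum_k E[v][k] * h[k].
    return [sum(e * h for e, h in zip(row, hidden)) for row in embeddings]
```

In a matrix library the same product is expressed by passing a transpose flag to the multiplication routine, so no transposed copy of the embeddings has to be stored.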

[1.2.1] - 2018-01-19

Fixed

Use valid-mini-batch size instead of mini-batch size during validation with "translation"
Normalize gradients with multi-gpu synchronous SGD
Fix divergence between saved models and validated models in asynchronous SGD
