Models
The models are independently implemented as single `.py` files and can be found under `nmtpytorch/models`. A model implements a set of methods that can be seen in the basic `NMT` model. In order to implement a new model, you have two options:
- Derive your class from the `NMT` class (see `MNMTDecinit`)
- Copy `nmt.py` under a different filename and rewrite all the methods. This is suitable if your model is substantially different from the basic `NMT` model and there's no interest in deriving from it.
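The derivation option can be sketched as follows. The real `NMT` base class lives in `nmtpytorch/models/nmt.py` and has many more methods; the stand-in base below is hypothetical and only illustrates the override pattern of reusing the parent's behavior while changing what differs:

```python
# Hypothetical stand-in for nmtpytorch.models.NMT; the real base class
# in nmtpytorch/models/nmt.py is far richer than this sketch.
class NMT:
    def __init__(self, opts):
        self.opts = opts

    def encode(self, batch):
        # The real implementation builds encoder states from the source batch.
        return {"src": batch}


class MNMTDecinit(NMT):
    """A derived model that only overrides what differs from NMT."""

    def encode(self, batch):
        # Reuse the parent's encodings and attach an extra (assumed) modality.
        enc = super().encode(batch)
        enc["image"] = "visual-features"  # illustrative extra input
        return enc


model = MNMTDecinit(opts={})
enc = model.encode("sentence")
```

The advantage of this route is that training, decoding, and checkpointing logic inherited from `NMT` keeps working unchanged.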
After creating your model, add the necessary import into `nmtpytorch/models/__init__.py`. The class name of your model is what allows nmtpy to import and use it during training and inference.
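This name-based lookup is why the class name matters. nmtpy's actual resolution logic may differ, but the general pattern of turning a configured name into a class can be sketched like this (demonstrated on a stdlib module, since nmtpytorch may not be installed):

```python
import importlib


def resolve_model(module_name, class_name):
    """Resolve a class by name from a module, the way a trainer might
    resolve a configured model type string to a model class."""
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Stand-in demonstration on the stdlib instead of nmtpytorch.models:
cls = resolve_model("collections", "OrderedDict")
print(cls.__name__)  # OrderedDict
```

If the class is not imported into `nmtpytorch/models/__init__.py`, such a lookup fails with an `AttributeError`, which is why the import step is mandatory.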
To sum up:
- Implement your model as a class called `MyModel` under `nmtpytorch/models/mymodel.py`
- Import it inside `nmtpytorch/models/__init__.py` as `from .mymodel import MyModel`
- Create an experiment configuration file and set `model_type: MyModel` inside it.
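The last step might look like the following configuration fragment. The `[train]` section name is an assumption based on nmtpytorch's INI-style configs; consult the example configuration files shipped with the repository for the full layout:

```ini
# Illustrative fragment; only model_type is taken from this page,
# the section layout is assumed.
[train]
model_type: MyModel
```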
A Conditional-GRU-based NMT similar to the dl4mt-tutorial architecture.
Xu, Kelvin, et al. "Show, attend and tell: Neural image caption generation with visual attention." International Conference on Machine Learning. 2015.
Caglayan, Ozan, Loïc Barrault, and Fethi Bougares. "Multimodal attention for neural machine translation." arXiv preprint arXiv:1609.03976 (2016).
This model uses raw image files as inputs and implements an end-to-end pipeline with a CNN from torchvision.
A modification of the above model that is less memory-hungry, as it uses pre-extracted convolutional features instead of embedding the CNN inside the model.
Visually initialized conditional-GRU variant from:
Caglayan, Ozan, et al. "LIUM-CVC Submissions for WMT17 Multimodal Translation Task." Proceedings of the Second Conference on Machine Translation. 2017.
nmtpytorch is developed at the Informatics Lab of Le Mans University, France.