v1.1.0: GPT2, T5 and SynapseAI 1.5.0
GPT2
You can now train or fine-tune GPT2 for causal language modeling on up to 8 HPUs. An example of fine-tuning on WikiText-2 is provided here.
- Add support for language modeling (GPT2) #52
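As an illustration, a minimal sketch of such a fine-tuning run with the GaudiTrainer API could look like the snippet below; the flag and configuration names (use_habana, use_lazy_mode, gaudi_config_name, Habana/gpt2) are assumptions based on the library's conventions and may differ from the shipped example script.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, DataCollatorForLanguageModeling
from optimum.habana import GaudiTrainer, GaudiTrainingArguments

# Load GPT2 and a small slice of WikiText-2 (illustrative only).
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128),
    batched=True,
    remove_columns=["text"],
)

# GaudiTrainingArguments mirrors TrainingArguments with Habana-specific flags (assumed names).
training_args = GaudiTrainingArguments(
    output_dir="./gpt2-wikitext2",
    use_habana=True,
    use_lazy_mode=True,
    gaudi_config_name="Habana/gpt2",
    per_device_train_batch_size=4,
    num_train_epochs=1,
)

trainer = GaudiTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```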
You can also use GPT2 for text generation in lazy mode.
- Accelerate generation #61
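As a rough sketch, lazy-mode generation on a single HPU could look like the following; the habana_frameworks import and the "hpu" device string come from Habana's PyTorch integration and are assumptions here rather than part of this release's API.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# SynapseAI's PyTorch bridge registers the "hpu" device; lazy mode is its default execution mode (assumption).
import habana_frameworks.torch.core as htcore  # noqa: F401

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").to("hpu")

inputs = tokenizer("Habana Gaudi processors are", return_tensors="pt").to("hpu")
# In lazy mode, the graph is compiled on the first call and reused afterwards.
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```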
T5
Encoder-decoder architectures are now supported. In particular, examples relying on T5 for the following tasks are available:
- summarization, with an example of fine-tuning T5 on the CNN/DailyMail dataset,
- translation, with an example of fine-tuning T5 on the WMT16 dataset for translating English to Romanian.
You can also use T5 for text generation in lazy mode.
- Accelerate generation #61
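A similar sketch for T5, which frames every task as text-to-text via a prefix on the input; the translation prefix is the standard T5 convention, while the HPU device handling rests on the same assumptions as the GPT2 snippet above.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

import habana_frameworks.torch.core as htcore  # noqa: F401  (assumption: provided by SynapseAI)

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small").to("hpu")

# T5 encodes the task as a prefix on the input text.
text = "translate English to Romanian: The weather is nice today."
inputs = tokenizer(text, return_tensors="pt").to("hpu")
outputs = model.generate(**inputs, max_length=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```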
Support for SynapseAI 1.5.0
The newly released SynapseAI 1.5.0 is now supported. You can find more information about it here.
- Add support for SynapseAI 1.5.0 #65
This is a breaking change: you should update your version of SynapseAI as described here in order to use this new release.
GaudiConfig instantiation is not mandatory anymore
If the name of your Gaudi configuration is given in the training arguments, you no longer need to instantiate it and provide it to the trainer: this is now handled automatically. You can still instantiate a Gaudi configuration yourself and pass it to the trainer if you prefer.
- Enable GaudiConfig instantiation from inside the trainer #55
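Concretely, the two options below are now both valid; the exact argument names (gaudi_config_name, gaudi_config) are assumptions for illustration and should be checked against the GaudiTrainer and GaudiTrainingArguments signatures.

```python
from transformers import AutoModelForCausalLM
from optimum.habana import GaudiConfig, GaudiTrainer, GaudiTrainingArguments

model = AutoModelForCausalLM.from_pretrained("gpt2")

# Option 1: only give the name of the Gaudi configuration in the training arguments;
# the trainer instantiates it for you.
args = GaudiTrainingArguments(
    output_dir="./out",
    use_habana=True,
    use_lazy_mode=True,
    gaudi_config_name="Habana/gpt2",
)
trainer = GaudiTrainer(model=model, args=args)

# Option 2 (still supported): instantiate the configuration yourself and pass it explicitly.
gaudi_config = GaudiConfig.from_pretrained("Habana/gpt2")
trainer = GaudiTrainer(model=model, gaudi_config=gaudi_config, args=args)
```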
Refined throughput computation in lazy mode
In lazy mode, the first two steps are warmup steps used for graph compilation. To discard them from the throughput computation, add the following training argument: --throughput_warmup_steps 2.
- Add a new argument for taking warmup steps into account in throughput computation #48
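The same setting can also be passed programmatically when building the training arguments; everything except throughput_warmup_steps in the sketch below is a placeholder.

```python
from optimum.habana import GaudiTrainingArguments

training_args = GaudiTrainingArguments(
    output_dir="./out",
    use_habana=True,
    use_lazy_mode=True,
    gaudi_config_name="Habana/gpt2",  # placeholder Gaudi configuration name
    # Discard the first two (graph-compilation) steps from the throughput computation.
    throughput_warmup_steps=2,
)
```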