Releases: aws/sagemaker-training-toolkit
v4.2.6
Bug Fixes and Other Changes
- Enable PyTorch XLA distributed training on homogeneous clusters
v4.2.5
Bug Fixes and Other Changes
- Relax exception type
v4.2.4
prepare release v4.2.4
v4.2.3
Bug Fixes and Other Changes
- Update num_processes_per_host for smdataparallel runner
v4.2.2
Bug Fixes and Other Changes
- Remove version hardcoding for sagemaker test dependency
- Update distribution_instance_group for PyTorch DDP
- Specify flake8 config explicitly
v4.2.1
Bug Fixes and Other Changes
- Handle UTF-8 decoding exceptions while processing stdout and stderr streams
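
A minimal sketch of the kind of handling this entry describes, assuming the streams arrive as raw bytes. The helper name `decode_stream_chunk` is hypothetical and is not taken from the toolkit's source; the actual fix may differ.

```python
def decode_stream_chunk(raw: bytes) -> str:
    """Decode a chunk read from a child process's stdout or stderr.

    Hypothetical helper for illustration only; not the toolkit's API.
    """
    try:
        # Common case: the chunk is valid UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Invalid bytes are replaced with U+FFFD instead of raising,
        # so log processing can continue.
        return raw.decode("utf-8", errors="replace")
```

Falling back to `errors="replace"` keeps the rest of the output readable while still marking where undecodable bytes occurred.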
v4.2.0
Features
- Heterogeneous cluster changes
v4.1.6
Bug Fixes and Other Changes
- Update protobuf version to overlap with TF requirements
v4.1.5
Bug Fixes and Other Changes
- Fix None exception class issue for MPI
v4.1.4
Bug Fixes and Other Changes
- Use framework provided error class and stack trace as error message