v0.15.3
What's Changed
- Update version.txt after 0.15.2 release by @loadams in #6615
- Clean up prefetched parameters by @tohtana in #6557
- AIO CPU Locked Tensor by @jomayeri in #6592
- Reduce setting global variables to reduce torch compile graph breaks by @NirSonnenschein in #6541
- Add API to get devices of offload states by @tohtana in #6586
- Ignore reuse_dist_env by @tohtana in #6623
- Add API for updating ZeRO gradients by @tjruwase in #6590
- [compile] Show breakdown of graph break by @delock in #6601
- Accept btl_tcp_if_include option through launcher_args by @diskkid in #6613
- Add first Step in LR Schedulers by @jomayeri in #6597
- Support safetensors export by @xu-song in #6579
- Add option to disable logger while compiling to avoid graph breaks by @ShellyNR in #6496
- Lock cache file of HF model list by @tohtana in #6628
- Add README Pipeline Status for Huawei Ascend NPU by @xuedinge233 in #6588
- Update torch version in workflows by @tohtana in #6631
- Use file store for tests by @tohtana in #6632
- Fix Memory Leak In AIO by @jomayeri in #6630
- [XPU] Upgrade XPU Max1100 CI workflow to PyTorch 2.3 by @Liangliang-Ma in #6646
- [XPU] host timer check version from Torch 2.5 to Torch 2.6 by @YizhouZ in #6633
- [XPU] [DeepNVMe] Use the same cpu_op_desc_t as CUDA by @Liangliang-Ma in #6645
New Contributors
Full Changelog: v0.15.2...v0.15.3