v1.13.2: Patch release
Llava(-next) improvements
This patch release adds multi-card support for Llava(-next) and lets users enable or disable recompute for flash attention (usage sketches follow the change list below).
- Llava: Added flash_attention_recompute arg to provide an option to enable/disable recompute #1278 @tthakkal
- Add the DeepSpeed injection_policy for Mistral #1309 @yuanwu2017
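The recompute toggle is exposed as a keyword argument that can be flipped per call rather than baked into the model config. Below is a minimal single-card sketch of how it might be used; the checkpoint name, the `use_flash_attention` flag, and passing both flags through `generate()` are assumptions based on typical optimum-habana usage, not details stated in this release note.

```python
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaNextForConditionalGeneration
from optimum.habana.transformers.modeling_utils import adapt_transformers_to_gaudi

# Swap in the Gaudi-optimized model implementations provided by optimum-habana.
adapt_transformers_to_gaudi()

model_id = "llava-hf/llava-v1.6-mistral-7b-hf"  # assumed checkpoint, for illustration only
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16
).to("hpu")

image = Image.open(
    requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw
)
prompt = "[INST] <image>\nWhat is shown in this image? [/INST]"
inputs = processor(images=image, text=prompt, return_tensors="pt").to("hpu")

# Assumed kwargs: use_flash_attention turns fused SDPA on, and
# flash_attention_recompute chooses whether attention activations are
# recomputed in the backward-friendly (memory-saving) mode.
output = model.generate(
    **inputs,
    max_new_tokens=64,
    use_flash_attention=True,
    flash_attention_recompute=True,
)
print(processor.decode(output[0], skip_special_tokens=True))
```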
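The injection_policy change lives inside the library's DeepSpeed integration, so users do not normally call it directly. For context, the underlying DeepSpeed mechanism looks roughly like the sketch below: a mapping from a decoder-layer class to the output projections whose results must be all-reduced when the layer is sharded across cards. The specific module names in the tuple and the world size are assumptions for illustration, not copied from PR #1309.

```python
import deepspeed
import torch
from transformers import AutoModelForCausalLM
from transformers.models.mistral.modeling_mistral import MistralDecoderLayer

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype=torch.bfloat16
)

# injection_policy tells DeepSpeed which submodules of each decoder layer
# produce partial results that need an all-reduce under tensor parallelism.
# The ("self_attn.o_proj", "mlp.down_proj") pair is an assumed example.
model = deepspeed.init_inference(
    model,
    tensor_parallel={"tp_size": 2},
    dtype=torch.bfloat16,
    injection_policy={MistralDecoderLayer: ("self_attn.o_proj", "mlp.down_proj")},
)

# This must run under a distributed launcher (e.g. the deepspeed CLI) with one
# process per card; it will not shard anything when launched as a single process.
```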
Full Changelog: v1.13.1...v1.13.2