v1.13.2: Patch release
Llava(-next) improvements
This patch release adds multi-card support for Llava(-next) and lets users enable or disable recompute for flash attention (usage sketches follow the change list below).
- Llava: Added flash_attention_recompute arg to provide an option to enable/disable recompute #1278 @tthakkal
- Add the DeepSpeed injection_policy for Mistral #1309 @yuanwu2017
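The recompute toggle is exposed as a keyword argument that can be flipped per call rather than baked into the model config. Below is a minimal single-card sketch of how it might be used; the checkpoint name, the `use_flash_attention` flag, and passing both flags through `generate()` are assumptions based on typical optimum-habana usage, not details stated in this release note.

```python
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaNextForConditionalGeneration
from optimum.habana.transformers.modeling_utils import adapt_transformers_to_gaudi

# Swap in the Gaudi-optimized model implementations provided by optimum-habana.
adapt_transformers_to_gaudi()

model_id = "llava-hf/llava-v1.6-mistral-7b-hf"  # assumed checkpoint, for illustration only
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16
).to("hpu")

image = Image.open(
    requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw
)
prompt = "[INST] <image>\nWhat is shown in this image? [/INST]"
inputs = processor(images=image, text=prompt, return_tensors="pt").to("hpu")

# Assumed kwargs: use_flash_attention turns fused SDPA on, and
# flash_attention_recompute chooses whether attention activations are
# recomputed in the backward-friendly (memory-saving) mode.
output = model.generate(
    **inputs,
    max_new_tokens=64,
    use_flash_attention=True,
    flash_attention_recompute=True,
)
print(processor.decode(output[0], skip_special_tokens=True))
```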
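The injection_policy change lives inside the library's DeepSpeed integration, so users do not normally call it directly. For context, the underlying DeepSpeed mechanism looks roughly like the sketch below: a mapping from a decoder-layer class to the output projections whose results must be all-reduced when the layer is sharded across cards. The specific module names in the tuple and the world size are assumptions for illustration, not copied from PR #1309.

```python
import deepspeed
import torch
from transformers import AutoModelForCausalLM
from transformers.models.mistral.modeling_mistral import MistralDecoderLayer

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype=torch.bfloat16
)

# injection_policy tells DeepSpeed which submodules of each decoder layer
# produce partial results that need an all-reduce under tensor parallelism.
# The ("self_attn.o_proj", "mlp.down_proj") pair is an assumed example.
model = deepspeed.init_inference(
    model,
    tensor_parallel={"tp_size": 2},
    dtype=torch.bfloat16,
    injection_policy={MistralDecoderLayer: ("self_attn.o_proj", "mlp.down_proj")},
)

# This must run under a distributed launcher (e.g. the deepspeed CLI) with one
# process per card; it will not shard anything when launched as a single process.
```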
Full Changelog: v1.13.1...v1.13.2