0.1.0
- Paged attention support (requires flash-attn>=2.5.7)
- New generator with dynamic batching support (requires paged attention)
- Examples updated for dynamic generator
- Faster draft-model speculative decoding (SD)
- Various optimizations, bugfixes and QoL improvements
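
Since paged attention (and therefore the new generator) depends on flash-attn>=2.5.7, it may help to check the installed version before relying on those features. The following is a minimal sketch using only the Python standard library; the helper names `parse_version` and `paged_attention_available` are hypothetical and not part of this project's API, and the check only compares the leading numeric components of the version string.

```python
from importlib import metadata

# Minimum flash-attn version stated in these release notes.
MIN_FLASH_ATTN = (2, 5, 7)


def parse_version(text):
    """Turn '2.5.7' or '2.5.9.post1' into a tuple of leading integers."""
    parts = []
    # Drop any local version suffix such as '+cu122' before splitting.
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'post1'
        parts.append(int(digits))
    return tuple(parts)


def paged_attention_available(installed=None):
    """Hypothetical helper: True if flash-attn meets the stated minimum.

    `installed` can be passed explicitly (useful for testing); otherwise
    the version is looked up from the installed package metadata.
    """
    if installed is None:
        try:
            installed = metadata.version("flash-attn")
        except metadata.PackageNotFoundError:
            return False
    return parse_version(installed) >= MIN_FLASH_ATTN


if __name__ == "__main__":
    print("flash-attn new enough for paged attention:", paged_attention_available())
```

If the check returns `False`, upgrading flash-attn to 2.5.7 or later should satisfy the requirement listed above.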
Full Changelog: v0.0.21...v0.1.0