v0.3.2

guinmoon released this 26 Jul 11:24

5785d1d

Changes

llama.cpp updated to 84e09a7d8bc4ab6d658b5cd81295ac0add60be78
Noticeable increase in speed for 3B models on iOS with Metal
QKK_64 build: can be used to quantize 3B models with k_quants
See more details here
Add reverse prompt option to stop prediction
Add prediction time to message
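
To illustrate the last two items, here is a minimal sketch of how a reverse prompt can stop generation and how prediction time can be measured. It is not the app's actual implementation: the function name `generate_with_reverse_prompt`, its parameters, and the sample stream are all hypothetical, and the model is replaced by a plain list of text pieces.

```python
import time
from typing import Iterable, List, Tuple


def generate_with_reverse_prompt(
    pieces: Iterable[str],
    reverse_prompts: List[str],
) -> Tuple[str, float]:
    """Accumulate streamed text until a reverse prompt appears.

    Returns the text preceding the reverse prompt (or all of the text if
    no reverse prompt is seen) and the elapsed time in seconds.
    Hypothetical helper for illustration only.
    """
    started = time.perf_counter()
    output = ""
    for piece in pieces:
        output += piece
        # A reverse prompt may be split across several pieces, so search
        # the accumulated text rather than the latest piece alone.
        positions = [output.find(rp) for rp in reverse_prompts if rp]
        positions = [p for p in positions if p != -1]
        if positions:
            # Cut at the earliest match and stop predicting.
            output = output[: min(positions)]
            break
    elapsed = time.perf_counter() - started
    return output, elapsed


# Hypothetical stream standing in for model output.
stream = ["Hello", ", how", " can I help?", "\nUs", "er:", " ignored"]
text, seconds = generate_with_reverse_prompt(stream, ["\nUser:"])
print(repr(text), f"({seconds:.3f} s)")
```

In this sketch the reverse prompt itself is trimmed from the returned text, and the elapsed time is what would be attached to the message.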
