v0.3.2
Changes
- llama.cpp updated to 84e09a7d8bc4ab6d658b5cd81295ac0add60be78
  Noticeable increase in speed for 3B models on iOS with Metal
- QKK_64 build
  Can be used to quantize 3B models with k_quants
  See more details here
- Add reverse prompt option to stop prediction
- Add predict time to message
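
As an illustration of the reverse prompt option, the sketch below shows one way a prediction loop could stop once a reverse prompt string appears in the output, and report how long prediction took. It is not the app's actual implementation: the function name `predict_with_reverse_prompt` and its parameters are hypothetical, and the generated pieces are simulated with a plain list of strings rather than a real model.

```python
import time
from typing import Iterable, Sequence, Tuple


def predict_with_reverse_prompt(
    pieces: Iterable[str],
    reverse_prompts: Sequence[str],
) -> Tuple[str, float]:
    """Accumulate generated pieces until a reverse prompt appears.

    Hypothetical helper: returns the text generated before the first
    reverse prompt, plus the elapsed prediction time in seconds.
    """
    start = time.perf_counter()
    text = ""
    for piece in pieces:
        text += piece
        # A reverse prompt may span several pieces, so search the whole
        # accumulated text and keep the earliest match.
        hits = [
            i
            for i in (text.find(rp) for rp in reverse_prompts if rp)
            if i != -1
        ]
        if hits:
            # Cut the output just before the reverse prompt and stop.
            text = text[: min(hits)]
            break
    return text, time.perf_counter() - start


if __name__ == "__main__":
    simulated = ["Hello", " there", ".\nUs", "er:", " ignored"]
    reply, seconds = predict_with_reverse_prompt(simulated, ["\nUser:"])
    print(repr(reply), f"({seconds:.4f} s)")
```

Searching the full accumulated text on every step is quadratic in the output length; that keeps the sketch short, while a real loop would likely check only the tail of the output.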