v0.3.2

@guinmoon guinmoon released this 26 Jul 11:24
· 267 commits to main since this release

Changes

  • llama.cpp updated to 84e09a7d8bc4ab6d658b5cd81295ac0add60be78
    Noticeable increase in speed for 3B models on iOS with Metal
  • QKK_64 build: can be used to quantize 3B models with k_quants
    See more details here
  • Add reverse prompt option to stop prediction
  • Add predict time to message
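
To illustrate what a reverse prompt does, here is a minimal Python sketch. It is not the app's actual code: `generate_until` and the sample token pieces are hypothetical names chosen for this example. The idea is to stream the generated text and stop as soon as the reverse prompt string (for example `User:`) appears, without emitting the reverse prompt itself, even when it arrives split across several tokens.

```python
from typing import Iterable, Iterator


def generate_until(pieces: Iterable[str], reverse_prompt: str) -> Iterator[str]:
    """Yield text from `pieces` and stop once `reverse_prompt` shows up.

    `pieces` stands in for the decoded tokens a model would produce
    (hypothetical input for this sketch).
    """
    if not reverse_prompt:
        # No reverse prompt configured: pass everything through.
        yield from pieces
        return

    held = ""  # text held back because it may be the start of the reverse prompt
    for piece in pieces:
        held += piece

        idx = held.find(reverse_prompt)
        if idx != -1:
            # Reverse prompt found: emit what precedes it, then stop.
            if idx > 0:
                yield held[:idx]
            return

        # Hold back the longest tail of `held` that is a prefix of the
        # reverse prompt, since the next piece might complete it.
        keep = 0
        for k in range(min(len(held), len(reverse_prompt) - 1), 0, -1):
            if held.endswith(reverse_prompt[:k]):
                keep = k
                break

        emit = held[: len(held) - keep]
        if emit:
            yield emit
        held = held[len(held) - keep:]

    # Generation ended without hitting the reverse prompt: flush the rest.
    if held:
        yield held


if __name__ == "__main__":
    sample = ["Hello", " there.\n", "Us", "er:", " this part is never shown"]
    print("".join(generate_until(sample, "User:")), end="")
```

Holding back a possible partial match is a design choice in this sketch: it prevents fragments such as `Us` from reaching the screen before the stop condition is confirmed.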