Releases · Vali-98/ChatterUI
v0.7.9b
v0.7.9a
v0.7.10-beta1
Added Gemma 2 2B support
v0.7.9
The Local Upgrade
Warning: This update attempts to load files from app assets, which may fail, as I have not yet tested this on multiple devices. Please report if you get stuck in a boot crash! This update is generally very experimental, with a few changes to the core C++ code of llama.rn, so it may be unstable.
Features:
- Local generation has migrated to cui-llama.rn, a fork of the fantastic llama.rn project with custom features tailored for ChatterUI:
  - Added stopping of prompt processing between batches - more effective when used with a low batch size.
  - Added a vocab_only mode for tokenizer-only usage - this also removes the need for onnx-runtime and the old transformer.js tokenizer adaptation, cutting down app size significantly!
  - Added synchronous tokenization for ease of development.
  - Added Context Shifting, adapted from kobold.cpp (thanks @LostRuins) - this allows you to use high-context chats without needing to reprocess the entire context upon hitting the context limit.
- Added support for i8mm-compatible devices (Snapdragon 8 Gen 1 or newer / Exynos 2200 or newer) - this enables the Q4_0_4_8 quantization level, optimized for ARM devices.
  - It is recommended to requantize your models to this quantization level using the llama.cpp quantize tool:

    ```
    .\llama-quantize.exe --allow-requantize model.gguf Q4_0_4_8
    ```
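Conceptually, Context Shifting avoids full reprocessing by evicting the oldest tokens after a preserved prefix (e.g. the system prompt) rather than clearing the whole context. A simplified sketch of the idea - names and the token-array representation here are illustrative, not cui-llama.rn's actual API:

```typescript
// Simplified sketch of the context-shift idea (conceptually adapted from
// kobold.cpp): when the context overflows, drop the oldest tokens after a
// preserved prefix so the rest of the KV cache can be reused.
function contextShift(
    tokens: number[],
    maxCtx: number,
    keepPrefix: number // tokens at the start (e.g. system prompt) to preserve
): number[] {
    if (tokens.length <= maxCtx) return tokens
    // Evict just enough of the oldest post-prefix tokens to fit.
    const overflow = tokens.length - maxCtx
    return [...tokens.slice(0, keepPrefix), ...tokens.slice(keepPrefix + overflow)]
}

// 8 tokens, limit of 6, preserve the first 2: tokens 3 and 4 are evicted.
console.log(contextShift([1, 2, 3, 4, 5, 6, 7, 8], 6, 2)) // [1, 2, 5, 6, 7, 8]
```

The real implementation operates on the llama.cpp KV cache rather than a plain array, but the eviction logic follows the same shape.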
Changes:
- Local inferencing now runs as a background task! This means tabbing out of the app should no longer stop inferencing.
- Buttons in Local API menu now properly disable based on model state
- The internal tokenizer now relies entirely on the much faster implementation in cui-llama.rn. As such the previous JS tokenizer has been removed alongside onnx-runtime, leading to much smaller APK size.
Fixes:
- Continuing with local API now properly respects the regenCache
- Removed the BOS token from the default Llama 3 instruct preset
Dev:
- Moved `constants` and `components` under `app/`, as this seems to significantly affect react-native's Fast Refresh functionality.
- Moved local API state to zustand - this helps a lot with Fast Refresh bugginess in development and prevents the model state from being unloaded upon a refresh.
v0.7.9-beta5
Updated cui-llama.rn with Context Shift
v0.7.9-beta4
Test build for the i8mm instruction, providing 2-3x faster prompt processing on modern Android devices.
v0.7.9-beta3
Test build for syncing with llama.cpp
v0.7.9-beta2-unstable
WARNING: This build may break your install.
Testing new tokenizer system.
ChatterUI_0.7.9-beta1
Experimental build with cui-llama.rn
v0.7.8
v0.7.8
Features:
- Added a 'submit on enter' settings option for text input.
- Added option to save KV cache with the Local API. This will save the current local context to storage after every generation, allowing you to pick up where you last left off and negate reprocessing context. The cache will reset on new chats or swapping models.
- WARNING: This feature uses a LOT of memory and will write several MB-GB per message depending on model size. This may incur high battery drain and performance degradation on older devices. Because of this, the option is hidden away in the Settings menu rather than being placed in the Local API tab.
- Added Sonnet 3.5 to Claude models.
- Added Autoformat New Chats feature to format inputs to a preferred format, useful for formatting cards obtained from third parties.
  - This option will automatically detect the existing format of new chat greetings, then convert to the selected format:
    - Mode 0 - None
    - Mode 1 - PlainActionQuoteSpeech - This is an action. "This is some speech."
    - Mode 2 - AsteriskActionPlainSpeech - *This is an action.* This is some speech.
    - Mode 3 - AsteriskActionQuoteSpeech - *This is an action.* "This is some speech."
  - DISCLAIMER: This feature uses some pretty wacky regex to work, so kindly report bad conversions alongside the problematic text.
- Added option to toggle usage of Example Messages.
- Added option to add a timestamp to messages.
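To illustrate the kind of regex-driven conversion the Autoformat feature performs, here is a minimal sketch of one direction (Mode 1 plain-action input to Mode 3 asterisk-action output). The function name and approach are hypothetical, not ChatterUI's actual implementation, which handles more formats and edge cases:

```typescript
// Sketch: convert plain-action/quote-speech text ('Action. "Speech."')
// into asterisk-action/quote-speech text ('*Action.* "Speech."').
// Assumes speech is always wrapped in double quotes.
function plainToAsteriskAction(text: string): string {
    return text
        .split(/("[^"]*")/) // split on quoted spans, keeping them via the capture group
        .map((part) => {
            if (part.startsWith('"')) return part // speech: leave untouched
            const trimmed = part.trim()
            if (trimmed.length === 0) return part // whitespace between spans
            // action text: wrap in asterisks, preserving surrounding spacing
            return part.replace(trimmed, `*${trimmed}*`)
        })
        .join('')
}

console.log(plainToAsteriskAction('This is an action. "This is some speech."'))
// *This is an action.* "This is some speech."
```

Nested or unbalanced quotes are exactly the kind of input that breaks naive splitting like this, which is presumably why the release notes ask users to report bad conversions with the offending text.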
Changes:
- Improved the UX of numerous long buttons with side buttons to be less prone to accidental clicks. These buttons are now properly segmented instead of being nested.
- Removed animation from editor due to buggy behavior with several keyboards. As such, the relevant option in Settings has also been hidden.
- UX improvements in the Local API menu - buttons now disable based on app/model state.
- Changed base-64 library used to a faster alternative.
Fixes:
- Fixed Claude compatibility.
- Fixed Claude API not taking custom first message.
- Fixed issues with local auto-loading not respecting params such as context size and thread count.
- Fixed Local generations not printing last token.
- Fixed Horde API calls missing headers.
- Merged fixes by @Henri-J-Norden for the OpenAI API. (Thanks!)
Dev stuff:
- [Dev] Refactored naming schema for functions which operate on the database
- [Dev] Added an automatic workflow to trigger builds upon push
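For reference, a push-triggered Android build workflow for a React Native/Expo project could look roughly like the sketch below. The file contents, branch name, and build commands are assumptions for illustration, not the repository's actual workflow:

```yaml
# .github/workflows/build.yml - hypothetical sketch, not the actual workflow
name: build
on:
  push:
    branches: [master]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      # Generate native Android project files, then build the release APK
      - run: npx expo prebuild --platform android
      - run: cd android && ./gradlew assembleRelease
```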