Skip to content

Releases: Mozer/talk-llama-fast

0.2.0

21 Jul 18:44
Compare
Choose a tag to compare
  1. Added support for gemma-2 and mistral-nemo.

  2. Added multiple gpu support. Don't set those 3 params if you have just 1 gpu.

--main-gpu 0 - set main gpu id with kv-cache: 0, 1, ...
--split-mode none - none or layer. split-mode tensor is not supported
--tensor-split 0.5,0.5 - how to split layers or tensors per gpus, array of floats.

  1. Added instruct mode with presets. It is optional and experimental. There are still some bugs.

--instruct-preset gemma where gemma is the name of the file \instruct_presets\gemma.json

Instruct mode helps to make responses longer and smarter. You can find correct instruct-preset for each model at the model card on huggingface or in sillytavern - formatting - instruct mode sequences.

Example dialogue in assisttant.txt should also be formatted using instruct mode tags. I added gemma and mistral instruct presets. And added some bats to run gemma and nemo in instruct mode.

  1. Added -debug to print whole context dialogue after each LLM response. Useful to see if there's something wrong with formatting.

0.1.8

26 Jun 17:58
e3e0d52
Compare
Choose a tag to compare

Added --min_p sampler param. I recommend using --min_p 0.10 for Russian. Default is 0.

0.1.7

09 May 16:57
a60ea8f
Compare
Choose a tag to compare
  • Added --push-to-talk option: hold "Alt" key to speak (useful with loudspeakers without headphones). Turned off by default.
  • And now you can use Cyrillic letters in bat files. Save them using Cyrillic "OEM 866" encoding, notepad++ supports it. (В bat файлах теперь можно использовать кириллицу. Для этого сохраните bat файл в кодировке "OEM 866" в приложении notepad++: Encoding -> Character sets -> Cyrillic -> OEM 866).
  • Added talk-llama-fast-v0.1.7_no_avx2.zip for old CPU's without AVX2 instructions (e.g. Intel i5-2500K). May be a little slower. Use it if main version crashes without an error.

0.1.6

30 Apr 19:21
1553710
Compare
Choose a tag to compare

-bug fix with start prompt:

start prompt was not written correctly into context when running with default --batch-size 64 parameter or without it. Llama couldn't remember anything from the start prompt (just first 64 tokens). This bug first came in v0.1.4 and no one noticed.

0.1.5

25 Apr 18:47
39ab5ce
Compare
Choose a tag to compare

New features:

  • Keyboard input (finally you can type messages using keyboard now).
  • You can copy and paste text into talk-llama-fast window.
  • Hotkeys: Stop(Ctrl+Space), Regenerate(Ctrl+Right), Delete(Ctrl+Delete), Reset(Ctrl+R).
  • Bug fix: Reset command works fine now, no bugs with long context.

New bugs:

  • Sometimes when you type fast, the first letter of your message is not typed. (Type a little slower. I have to investigate more, what is causing this bug).
  • When you paste more than 1 passage of text into llama, the last passage has to have \n (new line symbol) at the end, otherwise you have to hit Enter to paste this last passage into llama console window.

0.1.4

17 Apr 18:29
4a55f84
Compare
Choose a tag to compare

New params:

  • --batch-size (default 64) - process start prompt and user input in batches. With 64 llama takes 0.6 GB less VRAM than it was before with 1024. 64 is fine for small and fast models, for big LLMs try larger values, e.g. 512.

  • --verbose to show speed in tps for testing. Use with --sleep-before-xtts 0 or it will be showing incorrect time including sleep time.

  • Start prompt is now not limited in length. But keep start prompt length < --ctx_size. Otherwise llama will be several times slower (bug). Increase ctx_size if needed (takes more vram).

  • Stop words now support \n and \r symbols. E.g.: --stop-words "Alexej:;---;assistant;===;**;Note:;\n\n;\r\n\r\n;\begin;\end;###; (;["

0.1.3

07 Apr 18:30
b5ad85c
Compare
Choose a tag to compare
  • removed --xtts-control-path param. Now it is not needed anymore. xtts control file is stored in temp dir. No need to set xtts_play_allowed_path it in extras.
  • added missing default.wav voice

No other changes.

To make this version work - please update xtts_api_server, tts, and wav2lip if you have previous versions installed. (pip install with each, as in readme)

0.1.2

06 Apr 20:01
193059d
Compare
Choose a tag to compare

Now it's using 2 condas to install
updated bats

0.1.1

06 Apr 11:23
44831a4
Compare
Choose a tag to compare

Now with wav2llip!
Usage demonstration in Russian (видео на русском): https://youtu.be/ciyEsZpzbM8
English video is coming soon.

changes:
path fixes in xtts .bats

0.1.0

04 Apr 17:57
9db2c8a
Compare
Choose a tag to compare

With streaming wav2lip.
Video is coming in a next few days.