Releases: Mozer/talk-llama-fast

0.0.4

09 Mar 17:46
f6a6706

New params:
--stop-words Stop1;stopword2, stop words for llama, separated by semicolons (;).
--min-tokens N, the minimum number of tokens in a response. If the response is shorter, llama removes the stop word it found, raises the temperature for one token, and continues generating. Useful for Russian, where RP answers are usually very short.
--split-after N, split the first sentence after N tokens and send that part to xtts immediately. Relevant for large, slow models such as mixtral.
--seqrep, prevents loops. Searches for the latest 20 characters within the last 300 characters. If a match is found, llama deletes it, raises the temperature for one token, and continues generating.
--xtts-intro, say a random filler (Mmm/Well/...) through xtts right after user input. Relevant for large, slow models such as mixtral.
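
To make the --stop-words format and the --seqrep check more concrete, here is a rough Python sketch. It is not the project's code: the function names parse_stop_words and trim_repeat are hypothetical, and the sketch assumes that the 20-character probe is searched only in the part of the 300-character window that precedes it, and that "deletes it" means dropping the repeated tail. The temperature bump for the next token is not modeled.

```python
def parse_stop_words(arg: str) -> list[str]:
    """Split a --stop-words value such as "Stop1;stopword2" on semicolons."""
    # Empty pieces (for example from a trailing ';') are dropped.
    return [word for word in arg.split(";") if word]


def trim_repeat(text: str, probe_len: int = 20, window_len: int = 300) -> tuple[str, bool]:
    """Return (text, found).

    If the last probe_len characters already occur earlier inside the last
    window_len characters, remove that repeated tail and report found=True.
    """
    if len(text) <= probe_len:
        return text, False
    probe = text[-probe_len:]
    # Assumption: the window excludes the probe itself, so only an
    # earlier, non-overlapping occurrence counts as a loop.
    window = text[-window_len:-probe_len]
    if probe in window:
        return text[:-probe_len], True
    return text, False


if __name__ == "__main__":
    print(parse_stop_words("Stop1;stopword2"))
    phrase = "The cat sat on the mat today. "
    trimmed, found = trim_repeat(phrase + "and then " + phrase)
    print(found, repr(trimmed))
```

In a real generation loop, a caller would presumably run such a check after each new token and, on a hit, resample the next token at a higher temperature, as the --seqrep description states.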

0.0.3

28 Feb 19:41
7d1f63b
  • Multiple characters (--multi-chars --allow-newline)
  • Inline LLM en_ru translation inside the same llama context (--allow-newline --translate)

0.0.2

23 Feb 12:37
3c1c809

Pre-alpha 0.0.2