
# Supported Models

## Chat/Instruct Models

## Base Models

Please use `--format completion` for these models.
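
For example, a completion-mode run might look like the sketch below (the `main` binary path and the `-m`/`-p` options are assumptions about a typical build; only `--format completion` is taken from this page):

```sh
# Base models have no chat template, so run them in raw completion mode.
./build/bin/main -m /path/to/base-model.bin --format completion -p "Once upon a time"
```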

## RAG Models

## LoRA Models

These LoRA models have been tested:

## Special Models

*   Meta AI multi-token prediction model checkpoints

    Download at least one multi-token prediction checkpoint (such as `7B_1T_4`). Assume it is stored at `/path/to/llama-multi-predict/7B_1T_4`, and make sure `tokenizer.model` is downloaded to `/path/to/llama-multi-predict`.

    Convert it with `-a llama-multi-token-prediction-ckpt`:

    ```sh
    python convert.py -i /path/to/llama-multi-predict/7B_1T_4 -o llama-multi.bin -a llama-multi-token-prediction-ckpt
    ```

    This is a base model, so remember to use `--format completion`.

    Tip: use `--kv n_future_tokens N` to change the number of future tokens, where N is in [1, 4] (see the sketch below).
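
    A rough end-to-end sketch of running the converted checkpoint (the `main` binary path and the `-m`/`-p` options are assumptions about a typical build; only `--format completion` and `--kv n_future_tokens` come from the notes above):

    ```sh
    # Run the converted multi-token prediction model as a base model,
    # asking it to predict 2 future tokens per step (N may be 1 to 4).
    ./build/bin/main -m llama-multi.bin --format completion --kv n_future_tokens 2 -p "The quick brown fox"
    ```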