Skip to content

v0.1.0

Compare
Choose a tag to compare
@XkunW XkunW released this 24 Apr 20:21
· 163 commits to main since this release
0784588

Easy-to-use high-throughput LLM inference on Slurm clusters using vLLM

Supported models and variants:

  • Command R plus
  • DBRX: Instruct
  • Llama 2: 7b, 7b-chat, 13b, 13b-chat, 70b, 70b-chat
  • Llama 3: 8B, 8B-Instruct, 70B, 70B-Instruct
  • Mixtral: 8x7B-Instruct-v0.1, 8x22B-v0.1, 8x22B-Instruct-v0.1

Supported functionalities:

  • Completions and chat completions
  • Logits generation